---
library_name: pytorch
pipeline_tag: image-classification
license: mit
tags:
- automl
- pytorch
- torchvision
- optuna
- early-stopping
model_name: "Tomato vs Not-Tomato: AutoML (Compact CNN / Transfer Learning)"
language:
- en
---
|
|
|
|
|
# Tomato vs Not-Tomato: AutoML (Compact CNN / Transfer Learning)
|
|
|
|
|
## Purpose

This is a course assignment for practicing AutoML on neural networks with a small, real dataset. We train a compact image classifier to predict whether an image **is a tomato (1) or not (0)**.
|
|
|
|
|
## Dataset

- **Source:** classmate dataset on Hugging Face, `Iris314/Food_tomatoes_dataset`
- **Task:** Binary classification (`0 = not_tomato`, `1 = tomato`)
- **Splits:** Stratified **60/20/20** (train/val/test), created in the notebook (see the split sketch below)
- **Size:** ~30 images total (very small)
- **Input resolution:** 224×224
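
The splits above can be recreated roughly as follows. This is a minimal sketch, assuming the Hub dataset exposes a single `train` split and a `ClassLabel` column named `label`; both names are assumptions, not taken from the dataset card.

```python
# Rough sketch, not the notebook's exact code. Assumes a single "train" split
# and a ClassLabel column named "label" (both are assumptions).
from datasets import load_dataset

ds = load_dataset("Iris314/Food_tomatoes_dataset", split="train")

# 60/40 first, stratified on the label, then split the 40% in half -> 60/20/20.
tmp = ds.train_test_split(test_size=0.4, stratify_by_column="label", seed=42)
holdout = tmp["test"].train_test_split(test_size=0.5, stratify_by_column="label", seed=42)

train_ds, val_ds, test_ds = tmp["train"], holdout["train"], holdout["test"]
print(len(train_ds), len(val_ds), len(test_ds))
```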
|
|
|
|
|
## Preprocessing & Augmentation

- **Normalization:** mean = [0.485, 0.456, 0.406], std = [0.229, 0.224, 0.225] (ImageNet statistics)
- **Train augmentations:** RandomResizedCrop, RandomHorizontalFlip(p=0.5), ColorJitter
- **Eval transforms:** Resize → CenterCrop → Normalize (see the torchvision sketch below)
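
A torchvision sketch of these transforms is below. The resize/crop sizes and the ColorJitter strengths are assumptions; only the transform names are documented above.

```python
# Sketch of the train/eval pipelines described above (parameters are assumptions).
from torchvision import transforms

IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

train_tf = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    transforms.ToTensor(),
    transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD),
])

eval_tf = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD),
])
```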
|
|
|
|
|
## AutoML Setup

- **Search framework:** Optuna (budgeted search with pruning)
- **Architectures:** `smallcnn` (from scratch), `resnet18`, `mobilenet_v3_small`
- **Hyperparameters:** optimizer ∈ {adamw, sgd}, lr ∈ [1e-5, 5e-3] (log scale), weight_decay ∈ [1e-6, 1e-2] (log scale), dropout ∈ [0, 0.6], batch_size ∈ {8, 12, 16}, `freeze_backbone` ∈ {True, False} (pretrained backbones only); see the Optuna sketch below
- **Early stopping:** patience = 6 epochs on validation F1
- **Budget:** 10 trials, max 20 epochs per trial, ~5 min wall-clock
- **Seed:** 42
- **Compute:** Google Colab GPU runtime
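
The search space above maps onto Optuna roughly as in the sketch below. `train_and_eval` is a placeholder for the notebook's training loop, and the TPE sampler / median pruner choices are assumptions consistent with a budgeted search with pruning.

```python
import optuna

def train_and_eval(params: dict, max_epochs: int, patience: int, trial: optuna.Trial) -> float:
    """Placeholder for the notebook's training loop: train up to max_epochs with
    patience-based early stopping and return the best validation F1."""
    raise NotImplementedError  # the real loop lives in the course notebook

def objective(trial: optuna.Trial) -> float:
    arch = trial.suggest_categorical("arch", ["smallcnn", "resnet18", "mobilenet_v3_small"])
    params = {
        "arch": arch,
        "optimizer": trial.suggest_categorical("optimizer", ["adamw", "sgd"]),
        "lr": trial.suggest_float("lr", 1e-5, 5e-3, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 1e-6, 1e-2, log=True),
        "dropout": trial.suggest_float("dropout", 0.0, 0.6),
        "batch_size": trial.suggest_categorical("batch_size", [8, 12, 16]),
    }
    if arch != "smallcnn":
        # Freezing only applies to the pretrained backbones.
        params["freeze_backbone"] = trial.suggest_categorical("freeze_backbone", [True, False])
    return train_and_eval(params, max_epochs=20, patience=6, trial=trial)

study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=42),
                            pruner=optuna.pruners.MedianPruner())
study.optimize(objective, n_trials=10)
print(study.best_trial.params)
```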
|
|
|
|
|
## Best Model & Hyperparameters

```json
{
  "arch": "mobilenet_v3_small",
  "freeze_backbone": false,
  "dropout": 0.4761270681732692,
  "optimizer": "adamw",
  "lr": 1.1860369117967872e-05,
  "weight_decay": 0.00043282443346186894,
  "batch_size": 16
}
```
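
For reference, a sketch of how this configuration could be instantiated with torchvision follows. The classifier-head replacement and the two-class output are assumptions about the notebook's implementation; a single-logit head with a BCE loss would be an equally valid reading of the binary task.

```python
# Sketch only: build mobilenet_v3_small with the winning dropout, backbone unfrozen.
import torch.nn as nn
from torchvision import models

def build_best_model(dropout: float = 0.476, num_classes: int = 2) -> nn.Module:
    model = models.mobilenet_v3_small(weights=models.MobileNet_V3_Small_Weights.DEFAULT)
    # freeze_backbone was False for the best trial, so all parameters stay trainable.
    in_features = model.classifier[3].in_features
    model.classifier[3] = nn.Sequential(
        nn.Dropout(p=dropout),
        nn.Linear(in_features, num_classes),  # 2-way head is an assumption
    )
    return model
```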
|
|
|
|
|
## Results on the Held-Out Test Set

- **Accuracy:** 0.83
- **F1:** 0.80
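
The reported metrics correspond to standard scikit-learn calls; the sketch below uses placeholder labels purely to show the computation.

```python
# Placeholder labels/predictions, only to illustrate how the metrics are computed.
from sklearn.metrics import accuracy_score, f1_score

y_true = [1, 0, 1, 1, 0, 1]   # hypothetical ground-truth test labels
y_pred = [1, 0, 1, 0, 0, 1]   # hypothetical model predictions

print("accuracy:", accuracy_score(y_true, y_pred))
print("f1:", f1_score(y_true, y_pred))
```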
|
|
|
|
|
## Training Curves and Early Stopping

Validation F1 was tracked after each epoch with patience = 6; training stopped once validation performance plateaued, which limits overfitting on such a small dataset.
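
Below is a minimal sketch of that patience-based logic, assuming hypothetical `train_one_epoch` and `evaluate_f1` callables supplied by the caller; it is not the notebook's exact implementation.

```python
import copy
import torch.nn as nn

def fit_with_early_stopping(model: nn.Module, train_one_epoch, evaluate_f1,
                            max_epochs: int = 20, patience: int = 6) -> float:
    """Track validation F1 each epoch; stop after `patience` epochs without improvement."""
    best_f1, stale, best_state = 0.0, 0, None
    for epoch in range(max_epochs):
        train_one_epoch(model)         # caller-supplied training step (hypothetical)
        val_f1 = evaluate_f1(model)    # caller-supplied validation metric (hypothetical)
        if val_f1 > best_f1:
            best_f1, stale = val_f1, 0
            best_state = copy.deepcopy(model.state_dict())
        else:
            stale += 1
            if stale >= patience:
                break                  # validation F1 has plateaued
    if best_state is not None:
        model.load_state_dict(best_state)  # restore the best checkpoint
    return best_f1
```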
|
|
|
|
|
## Reproducibility

- Seed: 42 (see the seeding sketch below)
- Python: 3.12
- PyTorch: 2.4.1
- TorchVision: 0.19.1
- Optuna: 4.0.0
- Compute: Google Colab GPU (T4)
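
Seeding is assumed to look roughly like the sketch below; stricter cuDNN determinism flags were not confirmed and are omitted.

```python
# Sketch of global seeding (assumed to mirror what the notebook does).
import random
import numpy as np
import torch

def set_seed(seed: int = 42) -> None:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

set_seed(42)
```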
|
|
|
|
|
## Limitations & Known Failure Modes

- Extremely small dataset → risk of overfitting and unstable metrics.
- Background and lighting variations can bias predictions.
- Out-of-distribution images (e.g., tomato cartoons, extreme angles) may be misclassified.
|
|
|
|
|
## Ethics

- This model is for coursework demonstration only; it is not intended for production use or consequential decisions.
|
|
|
|
|
## License

- Code & weights: MIT (adjust per course requirements)
- Dataset: follow the original dataset's license/terms
|
|
|
|
|
## Acknowledgments

- Dataset: `Iris314/Food_tomatoes_dataset`
- AutoML: Optuna
- Backbones: torchvision models
- Trained in Google Colab
- GenAI tools assisted with boilerplate organization and documentation
|
|
|