First Steps

The FGVC library contains useful methods and CLI scripts for training and fine-tuning image-based deep neural networks in PyTorch and logging results to W&B.

The library allows training models using:

  1. A default CLI script fgvc train [...], which is useful for quick experiments with little customization.

  2. A custom script python train.py [...] that uses FGVC modules like training and experiment. This option is useful for including modifications like custom loss functions or custom steps in the training loop. We suggest creating a custom train.py script by copying and modifying the CLI training script.

The library is designed with an “easy-to-experiment” philosophy in mind. This means that the main components, such as ClassificationTrainer, which implements the training loop, can be replaced with a custom implementation. The library simplifies implementing a custom Trainer class by providing helper methods and mixins like training_utils, TrainingState, scores_monitor, SchedulerMixin, and MixupMixin.
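
For illustration, a custom trainer might look like the minimal sketch below. The import path and the compute_loss hook name are assumptions made for the example; check the training module for the actual class location and overridable methods.

from fgvc.core.training import ClassificationTrainer  # import path is an assumption

class CustomTrainer(ClassificationTrainer):
    """Trainer with a customized loss computation (illustrative sketch only)."""

    def compute_loss(self, preds, targets):  # hypothetical hook name
        # e.g., add an auxiliary loss term or label smoothing here;
        # self.criterion is assumed to hold the configured loss function
        return self.criterion(preds, targets)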

For each project, we suggest creating the following experiment file structure:

.
├── configs                                     # directory with configuration files for different experiment runs
│   ├── vit_base_patch32_224.yaml
│   └── vit_base_patch32_384.yaml
├── sweeps                                      # (optional) directory with W&B sweep configuration files 
│   └── init_sweep.yaml
├── requirements.txt                            # txt file with python dependencies such as FGVC 
├── train.ipynb                                 # jupyter notebook that calls training or optionally sweep scripts
└── train.py                                    # (optional) training script with custom modifications

Having training (and optionally hyperparameter tuning) configurations stored in YAML configs, dependency versions in requirements.txt, and execution steps in a train.ipynb notebook helps document and reproduce experiments.
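
For reference, a W&B sweep configuration like sweeps/init_sweep.yaml could look as follows. This is a sketch using the standard W&B sweep format; the metric name and parameter values are assumptions and must match what your training script logs and accepts.

# sweeps/init_sweep.yaml - illustrative W&B sweep configuration
method: bayes
metric:
  name: valid_loss  # assumption: must match a metric logged to W&B
  goal: minimize
parameters:
  learning_rate:
    values: [0.01, 0.001, 0.0001]
  batch_size:
    values: [32, 64]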

Configuration File

The configuration YAML file specifies parameters for training. Example file configs/vit_base_patch32_224.yaml:

# data
augmentations: 'light'
image_size: [224, 224]  # [height, width]
dataset: 'DF20-Mini'

# model
architecture: 'vit_base_patch32_224'

# training
loss: 'CrossEntropyLoss'
optimizer: 'SGD'
scheduler: 'cosine'
epochs: 40
learning_rate: 0.001
batch_size: 64
accumulation_steps: 1

# other
random_seed: 777
workers: 4
multigpu: False
tags: ['baseline']  # W&B Run tags
root_path: "/data/experiments/Danish-Fungi/"

These parameters are used by default by FGVC methods in the experiment module. Implementing a custom train.py script allows you to include additional configuration parameters.
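
For example, a custom script might load the YAML file with PyYAML and read an extra parameter that the default CLI ignores; label_smoothing below is a hypothetical parameter used only for illustration.

import yaml

# load the experiment configuration
with open("configs/vit_base_patch32_224.yaml") as f:
    config = yaml.safe_load(f)

# read a custom parameter with a fallback default ("label_smoothing" is hypothetical)
label_smoothing = config.get("label_smoothing", 0.0)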

Overriding Configuration Parameters

Parameters in the configuration file can be overridden by script arguments, for example:

fgvc train \
  --config-path configs/vit_base_patch32_224.yaml \
  --architecture vit_large_patch16_224 \
  --epochs 100 \
  --root-path /data/experiments/Danish-Fungi/

This functionality is useful when running W&B Sweeps or when calling the training script multiple times with slightly different configurations.

Note that the script argument root-path is converted to root_path by the script. Configuration parameter names should always use the _ character instead of - to avoid parsing issues.
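
This mirrors standard argparse behavior, which converts dashes in option names to underscores in attribute names; the snippet below demonstrates the conversion (assuming the FGVC CLI parses arguments this way).

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--root-path", type=str)

# argparse exposes --root-path as args.root_path (dash converted to underscore)
args = parser.parse_args(["--root-path", "/data/experiments/Danish-Fungi/"])
print(args.root_path)  # /data/experiments/Danish-Fungi/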

Training

As described in First Steps, the library allows training models either with the default CLI script fgvc train [...] or with a custom python train.py [...] script. Both options are covered below.

CLI Script

Run the following command to train a model based on the configs/vit_base_patch32_224.yaml configuration file:

fgvc train \
    --train-metadata ./DanishFungi2020-Mini_train_metadata_DEV.csv \
    --valid-metadata ./DanishFungi2020-Mini_test_metadata_DEV.csv \
    --config-path configs/vit_base_patch32_224.yaml \
    --wandb-entity chamidullinr \
    --wandb-project FGVC-test

The input metadata files (DanishFungi2020-Mini_train_metadata_DEV.csv and DanishFungi2020-Mini_test_metadata_DEV.csv) are passed to the ImageDataset class in the datasets module. The class expects the metadata files to have image_path and class_id columns. For custom functionality, such as different metadata formats, we suggest implementing a custom train.py script.
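
A minimal metadata file with the expected columns can be created with pandas; the file name and image paths below are purely illustrative.

import pandas as pd

# metadata with the two columns ImageDataset expects: image_path and class_id
metadata = pd.DataFrame({
    "image_path": ["images/fungi_0001.jpg", "images/fungi_0002.jpg"],
    "class_id": [0, 1],
})
metadata.to_csv("train_metadata.csv", index=False)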

The W&B-related script arguments --wandb-entity and --wandb-project are optional.

The script creates an experiment directory ./runs/{run_name}/{exp_name} and stores the following files:

  • training.log file with training scores for each epoch,

  • best_loss.pth checkpoint with weights from the epoch with the best validation loss,

  • best_[score].pth checkpoint with weights from the epoch with the best validation score, e.g. F1 or accuracy,

  • checkpoint.pth.tar checkpoint with optimizer and scheduler state for resuming the training; this checkpoint is removed when training finishes.

The files are created and managed by the TrainingState class.
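
To use a trained model later, a checkpoint can be restored roughly as follows. This sketch assumes best_loss.pth stores a plain PyTorch state_dict and that the architecture name from the config is a valid timm model; adjust num_classes and the path to your setup.

import timm
import torch

# recreate the architecture named in the configuration file
model = timm.create_model("vit_base_patch32_224", num_classes=182)  # 182 is illustrative

# load weights from the best-loss checkpoint (assumed to be a plain state_dict)
checkpoint_path = "runs/run_name/exp_name/best_loss.pth"  # substitute your actual run directory
state_dict = torch.load(checkpoint_path, map_location="cpu")
model.load_state_dict(state_dict)
model.eval()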

Custom Script

Run the custom train.py script to train a model based on the configs/vit_base_patch32_224.yaml configuration:

python train.py \
    --config-path configs/vit_base_patch32_224.yaml \
    --wandb-entity chamidullinr \
    --wandb-project FGVC-test

Note that reading the input CSV files (e.g. DanishFungi2020-Mini_train_metadata_DEV.csv and DanishFungi2020-Mini_test_metadata_DEV.csv) can be done directly in the train.py script if the same metadata files are used for all experiments.
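
A sketch of this approach: read the metadata once at the top of train.py and hand the DataFrames to the rest of the script. The train function mentioned in the final comment is a placeholder for whatever your customized training code defines.

import pandas as pd

# read fixed metadata files directly instead of passing them as CLI arguments
train_df = pd.read_csv("DanishFungi2020-Mini_train_metadata_DEV.csv")
valid_df = pd.read_csv("DanishFungi2020-Mini_test_metadata_DEV.csv")

# hand the DataFrames to the training code, e.g.:
# train(train_df, valid_df, config)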