Dense Models
EfficientNet-ES (EdgeTPU-Small) and EfficientNet-EL (EdgeTPU-Large) are trained on 8 Quadro RTX 8000 GPUs using the pytorch-image-models repo.
The training commands, with hyperparameters, are:
./distributed_train.sh 8 /imagenet --model efficientnet_es -b 128 --sched step --epochs 450 --decay-epochs 2.4 --decay-rate .97 --opt rmsproptf --opt-eps .001 -j 8 --warmup-lr 1e-6 --weight-decay 1e-5 --drop 0.2 --drop-connect 0.2 --aa rand-m9-mstd0.5 --remode pixel --reprob 0.2 --amp --lr .064
./distributed_train.sh 8 /imagenet --model efficientnet_el -b 128 --sched step --epochs 450 --decay-epochs 2.4 --decay-rate .97 --opt rmsproptf --opt-eps .001 -j 8 --warmup-lr 1e-6 --weight-decay 1e-5 --drop 0.2 --drop-connect 0.2 --aa rand-m9-mstd0.5 --remode pixel --reprob 0.2 --amp --lr .064
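To make the schedule flags above concrete, here is a minimal sketch of how a step schedule with `--lr .064 --decay-epochs 2.4 --decay-rate .97` might evolve. It assumes the learning rate is multiplied by the decay rate once every 2.4 epochs, ignores warmup, and assumes `-b 128` is the per-GPU batch size. The function and constant names are hypothetical and are not part of pytorch-image-models.

```python
import math

# Values copied from the training commands above.
BASE_LR = 0.064
DECAY_EPOCHS = 2.4
DECAY_RATE = 0.97
NUM_GPUS = 8
BATCH_PER_GPU = 128  # assumption: -b is the per-process batch size


def step_lr(epoch: float) -> float:
    """Learning rate at `epoch` under a plain step schedule (warmup ignored)."""
    num_decays = math.floor(epoch / DECAY_EPOCHS)
    return BASE_LR * DECAY_RATE ** num_decays


# Total samples per optimizer step across all GPUs, under the assumption above.
effective_batch = NUM_GPUS * BATCH_PER_GPU

if __name__ == "__main__":
    for epoch in (0, 3, 100, 449):
        print(f"epoch {epoch:3d}: lr = {step_lr(epoch):.6f}")
    print("effective batch size:", effective_batch)
```

Under these assumptions the rate decays smoothly over the 450 epochs, since a 3% cut every 2.4 epochs approximates an exponential schedule.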
Pruned Models
Pruning is performed with the DG_Prune submodule. The code is available in the DG branch of my fork of pytorch-image-models.
Pruning follows the lottery ticket hypothesis (LTH) algorithm. The pruning hyperparameters are provided in the attached JSON files. The training hyperparameters are identical to those used for dense training.
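For readers unfamiliar with LTH, the following is a toy sketch of the generic procedure: train, prune the smallest-magnitude surviving weights, rewind the survivors to their initial values, and repeat. It is not the DG_Prune API, and the actual schedule and rewinding details used here may differ. All names (`prune_by_magnitude`, `lottery_ticket`, `toy_train`) are hypothetical, and the "training" step is a stand-in.

```python
from typing import Callable, List

Weights = List[float]
Mask = List[int]


def prune_by_magnitude(weights: Weights, mask: Mask, fraction: float) -> Mask:
    """Return a new mask with the smallest-magnitude `fraction` of surviving weights removed."""
    alive = [i for i, m in enumerate(mask) if m == 1]
    n_prune = int(len(alive) * fraction)
    to_prune = sorted(alive, key=lambda i: abs(weights[i]))[:n_prune]
    new_mask = list(mask)
    for i in to_prune:
        new_mask[i] = 0
    return new_mask


def lottery_ticket(
    init_weights: Weights,
    train: Callable[[Weights, Mask], Weights],
    rounds: int,
    fraction: float,
) -> Mask:
    """Iterative magnitude pruning with rewinding to the initial weights."""
    mask = [1] * len(init_weights)
    for _ in range(rounds):
        # Rewind: restart every round from the original initialization, masked.
        weights = [w * m for w, m in zip(init_weights, mask)]
        trained = train(weights, mask)
        mask = prune_by_magnitude(trained, mask, fraction)
    return mask


def toy_train(weights: Weights, mask: Mask) -> Weights:
    """Stand-in for real training: returns the masked weights unchanged."""
    return [w * m for w, m in zip(weights, mask)]


toy_init = [0.5, -0.1, 0.3, -0.8, 0.05, 0.9, -0.2, 0.4]
toy_mask = lottery_ticket(toy_init, toy_train, rounds=2, fraction=0.25)
```

In a real run, `train` would be the full dense training recipe, and the mask would be applied per layer to the network's weight tensors.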
Results
Model | Top-1 Acc (%) | Top-5 Acc (%) |
---|---|---|
EfficientNet-ES | 77.906 | 94.038 |
EfficientNet-ES Pruned | 75.060 | 92.438 |
EfficientNet-EL | 81.296 | 95.562 |
EfficientNet-EL Pruned | 80.318 | 95.212 |
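The cost of pruning can be read off the table as the dense-minus-pruned accuracy gap. A small sketch of that arithmetic, using only the numbers reported above (the dictionary layout and function name are illustrative):

```python
# Top-1 and Top-5 accuracy (%) copied from the results table.
results = {
    "EfficientNet-ES": {"dense": (77.906, 94.038), "pruned": (75.060, 92.438)},
    "EfficientNet-EL": {"dense": (81.296, 95.562), "pruned": (80.318, 95.212)},
}


def accuracy_drop(entry):
    """Return (top-1 drop, top-5 drop) in percentage points."""
    dense_top1, dense_top5 = entry["dense"]
    pruned_top1, pruned_top5 = entry["pruned"]
    return round(dense_top1 - pruned_top1, 3), round(dense_top5 - pruned_top5, 3)


drops = {name: accuracy_drop(entry) for name, entry in results.items()}

if __name__ == "__main__":
    for name, (top1, top5) in drops.items():
        print(f"{name}: top-1 drop {top1:.3f}, top-5 drop {top5:.3f}")
```

The larger model loses less accuracy: 0.978 points Top-1 for EfficientNet-EL versus 2.846 points for EfficientNet-ES.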