As a sanity check, fine-tune from our Kinetics pre-trained checkpoints and verify the expected accuracy below:
|  | ViT-Large | ViT-Huge |
| --- | --- | --- |
| pre-trained checkpoint on Kinetics-400 | download | download |
| md5 | `edf3a5` | `3d7f64` |

|  | ViT-Large | ViT-Huge |
| --- | --- | --- |
| pre-trained checkpoint on Kinetics-600 | download | download |
| md5 | `9a9645` | `27495e` |

|  | ViT-Large | ViT-Huge |
| --- | --- | --- |
| pre-trained checkpoint on Kinetics-700 | download | download |
| md5 | `cdbada` | `4c4e3c` |
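The md5 values above appear to be six-character digest prefixes. A minimal sketch for checking a download against them (the checkpoint filename below is a placeholder, not the actual file name):

```python
import hashlib

def md5_prefix(path, length=6, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, truncated to `length` characters."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large checkpoints don't need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()[:length]

# Placeholder filename -- substitute the path of your downloaded checkpoint.
assert md5_prefix("mae_pretrain_vit_large_k400.pth") == "edf3a5"
```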
Fine-tune ViT-Large (`${KINETICS_DIR}` is a directory containing the `{train, val}` sets of Kinetics):
```
python run_finetune.py \
    --path_to_data_dir ${KINETICS_DIR} \
    --rand_aug --epochs 50 --repeat_aug 2 \
    --model vit_large_patch16 \
    --batch_size 2 \
    --distributed --dist_eval \
    --smoothing 0.1 --mixup 0.8 --cutmix 1.0 --mixup_prob 1.0 \
    --blr 0.0024 \
    --num_frames 16 --sampling_rate 4 \
    --dropout 0.3 --warmup_epochs 5 \
    --layer_decay 0.75 --drop_path_rate 0.2 \
    --aa rand-m7-mstd0.5-inc1 \
    --clip_grad 5.0 --fp32
```
This should give:
* Acc@1 84.35
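Note that `--blr` is a base learning rate, not the absolute one. In MAE-style codebases the absolute rate follows the linear scaling rule `lr = blr * eff_batch_size / 256`; whether `--repeat_aug` enters the effective batch size here is an assumption worth verifying against `run_finetune.py`. A sketch:

```python
def absolute_lr(blr, batch_size_per_gpu, num_gpus, repeat_aug=1, accum_iter=1):
    # Linear scaling rule used by MAE-style codebases: lr = blr * eff_bs / 256.
    eff_batch_size = batch_size_per_gpu * repeat_aug * num_gpus * accum_iter
    return blr * eff_batch_size / 256

# Hypothetical 128-GPU run with the flags above (--batch_size 2, --repeat_aug 2):
print(absolute_lr(0.0024, batch_size_per_gpu=2, num_gpus=128, repeat_aug=2))  # 0.0048
```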
- The pre-trained models we provide are trained with normalized pixels `--norm_pix_loss` (1600 effective epochs). The models were pre-trained with the PySlowFast codebase.
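With `--norm_pix_loss`, the reconstruction target for each patch is normalized by that patch's own pixel mean and variance before the loss is computed, following the public MAE implementation; the `(N, L, D)` tensor layout below is an assumption. A minimal sketch of the target transform:

```python
import torch

def normalized_pixel_target(patches, eps=1e-6):
    # patches: (N, L, D) patchified pixels -- N clips, L patches,
    # D pixel values per patch (assumed layout).
    mean = patches.mean(dim=-1, keepdim=True)
    var = patches.var(dim=-1, keepdim=True)
    # Normalize each patch by its own statistics, as --norm_pix_loss does.
    return (patches - mean) / (var + eps) ** 0.5
```

Since fine-tuning uses only the encoder, this choice affects the pre-training targets rather than the fine-tuning recipe above.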