Hi VideoMAE team :)
I've implemented VideoMAE in a fork of 🤗 HuggingFace Transformers, and I plan to add it to the library soon (see huggingface/transformers#17821). Here's a notebook that illustrates inference with it: https://colab.research.google.com/drive/1ZX_XnM0ol81FbcxrFS3nNLkmn-0fzvQk?usp=sharing
I'm adding VideoMAE because I really like its simplicity: it was literally a single-line code change from ViT (nn.Conv2d -> nn.Conv3d).
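To make the nn.Conv2d -> nn.Conv3d point a bit more concrete, here is a minimal, dependency-free sketch of how the patch grid changes when the patch embedding gains a time dimension. It only does the shape arithmetic of a convolution whose kernel size equals its stride; it is not the code from the fork. The helper names `patch_grid` and `num_tokens` are hypothetical, and the sizes (224x224 inputs, 16x16 patches, 16 frames, 2 frames per patch in time) are assumptions chosen for illustration.

```python
def patch_grid(input_size, patch_size):
    """Output grid of a convolution whose kernel size equals its stride,
    i.e. one that cuts the input into non-overlapping patches.

    Works for any number of dimensions: (H, W) mimics the nn.Conv2d case,
    (T, H, W) mimics the nn.Conv3d case.
    """
    if len(input_size) != len(patch_size):
        raise ValueError("input_size and patch_size must have the same rank")
    grid = []
    for size, patch in zip(input_size, patch_size):
        if size % patch != 0:
            raise ValueError(f"size {size} is not divisible by patch {patch}")
        grid.append(size // patch)
    return tuple(grid)


def num_tokens(input_size, patch_size):
    """Number of patch tokens the transformer encoder would see."""
    total = 1
    for cells in patch_grid(input_size, patch_size):
        total *= cells
    return total


# Assumed sizes, for illustration only.
image_tokens = num_tokens((224, 224), (16, 16))         # 2D patches (ViT-style)
video_tokens = num_tokens((16, 224, 224), (2, 16, 16))  # 3D patches (video)

print(image_tokens, video_tokens)
```

Under these assumed sizes the 2D case gives a 14x14 grid and the 3D case an 8x14x14 grid, so the only change on the embedding side is one extra axis in the kernel and stride.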
As you may or may not know, every model on the HuggingFace hub has its own Git repository. For example, the VideoMAE-base checkpoint fine-tuned on Kinetics-400 can be found here: https://huggingface.co/nielsr/videomae-base. If you check the "Files and versions" tab, you'll see that it includes the weights. The model hub uses Git LFS (Large File Storage) so that Git can handle large files such as model weights. This means that every model has its own Git commit history!
A model card, which is just a README, can also be added to the repo.
Are you interested in creating an organization on the hub, so that we can store all model checkpoints there (rather than under my user name)?
Let me know!
Kind regards,
Niels
ML Engineer @ HuggingFace