Adding VideoMAE to HuggingFace Transformers #23

Hi VideoMAE team :)

I've implemented VideoMAE in a fork of [🤗 HuggingFace Transformers](https://github.com/huggingface/transformers.git), and I plan to add it to the library soon (see https://github.com/huggingface/transformers/pull/17821). Here's a notebook that illustrates inference with it: https://colab.research.google.com/drive/1ZX_XnM0ol81FbcxrFS3nNLkmn-0fzvQk?usp=sharing

I'm adding VideoMAE because I really like its simplicity: it was literally a single-line code change from ViT (`nn.Conv2d` -> `nn.Conv3d`).
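To make that concrete, here is a minimal sketch of the idea in PyTorch. It is not the code from the fork or the PR; the hidden size, patch size, tubelet size, and input shapes below are assumptions picked for illustration. It shows how swapping the 2D patch-embedding convolution for a 3D one turns image patches into spatio-temporal tubes, while the rest of the token pipeline stays the same.

```python
import torch
from torch import nn

# Hypothetical sizes chosen for illustration; not read from any VideoMAE config.
hidden_size, patch_size, tubelet_size = 32, 16, 2

# ViT-style patch embedding: one token per 16x16 image patch.
image_embed = nn.Conv2d(3, hidden_size, kernel_size=patch_size, stride=patch_size)

# Video variant: one token per 2x16x16 spatio-temporal tube.
video_embed = nn.Conv3d(
    3,
    hidden_size,
    kernel_size=(tubelet_size, patch_size, patch_size),
    stride=(tubelet_size, patch_size, patch_size),
)

image = torch.randn(1, 3, 224, 224)      # (batch, channels, height, width)
video = torch.randn(1, 3, 16, 224, 224)  # (batch, channels, frames, height, width)

# Flatten the output grid into a sequence: (batch, num_tokens, hidden_size).
image_tokens = image_embed(image).flatten(2).transpose(1, 2)
video_tokens = video_embed(video).flatten(2).transpose(1, 2)

print(image_tokens.shape)  # torch.Size([1, 196, 32])
print(video_tokens.shape)  # torch.Size([1, 1568, 32])
```

With these assumed sizes, the image yields 14 x 14 = 196 tokens and the 16-frame clip yields 8 x 14 x 14 = 1568 tokens; everything downstream of the embedding sees an ordinary token sequence in both cases.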

As you may or may not know, every model on the HuggingFace [hub](https://huggingface.co/) has its own Git repository. For example, the VideoMAE-base checkpoint fine-tuned on Kinetics-400 can be found here: https://huggingface.co/nielsr/videomae-base. If you check the "Files and versions" tab, you'll see that it includes the weights. The model hub uses Git LFS (Large File Storage) so that large files such as model weights can be versioned with Git. This means that every model has its own Git [commit history](https://huggingface.co/nielsr/yolos-s/commits/main)!

A model card can also be added to the repo, which is just a README.

Are you interested in creating an organization on the hub, so that we can store all model checkpoints there (rather than under my username)?

Let me know!

Kind regards,

Niels
ML Engineer @ HuggingFace
