Add BiT + ViT hybrid #20550
# Copied from transformers.models.vit.modeling_vit.ViTEmbeddings with ViT->ViTHybrid
class ViTHybridEmbeddings(nn.Module):
I had to remove the `Copied from` statement to address https://github.com/huggingface/transformers/pull/20550/files#r1039877563. Let's keep this in mind.
sgugger left a comment
There are still some problems with the inconsistent checkpoint names and with the model type for ViTHybridConfig. Also make sure the actual model repos are in the right places on the Hub.
This is the configuration class to store the configuration of a [`BitModel`]. It is used to instantiate an BiT
model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
defaults will yield a similar configuration to that of the BiT
[google/resnetnv2-50](https://huggingface.co/google/resnetnv2-50) architecture.
Needs to be updated to google/bit-50
VIT_HYBRID_PRETRAINED_MODEL_ARCHIVE_LIST = [
    "google/vit-base-r50-s16-384",
Same, checkpoint needs to be the vit-hybrid one.
Thanks so much @sgugger for your review!
* First draft
* More improvements
* Add backbone, first draft of ViT hybrid
* Add AutoBackbone
* More improvements
* Fix bug
* More improvements
* More improvements
* Convert ViT-hybrid
* More improvements
* add patch bit
* Fix style
* Improve code
* cleaned v1
* more cleaning
* more refactoring
* Improve models, add tests
* Add docs and tests
* Make more tests pass
* Improve default backbone config
* Update model_type
* Fix more tests
* Add more copied from statements
* More improvements
* Add push to hub to conversion scripts
* clean
* more cleanup
* clean
* replace to
* fix
* Update src/transformers/models/bit/configuration_bit.py Co-authored-by: NielsRogge <[email protected]>
* fix base model prefix
* more cleaning
* get rid of stem
* clean
* replace flag
* Update src/transformers/models/bit/configuration_bit.py Co-authored-by: NielsRogge <[email protected]>
* Update src/transformers/models/bit/configuration_bit.py Co-authored-by: NielsRogge <[email protected]>
* add check
* another check
* fix for hybrid vit
* final fix
* update config
* fix class name
* fix `make fix-copies`
* remove `use_activation`
* Update src/transformers/models/bit/configuration_bit.py
* rm unneeded file
* Add BiT image processor
* rm unneeded file
* add doc
* Add image processor to conversion script
* Add ViTHybrid image processor
* Add resources
* Move bit to correct position
* Fix auto mapping
* Rename hybrid to Hybrid
* Fix name in toctree
* Fix READMEs'
* Improve config
* Simplify GroupNormActivation layer
* fix test + make style
* Improve config
* Apply suggestions from code review Co-authored-by: Patrick von Platen <[email protected]>
* remove comment
* remove comment
* replace
* replace
* remove all conv_layer
* refactor norm_layer
* revert x
* add copied from
* last changes + integration tests
* make fixup
* Apply suggestions from code review Co-authored-by: Sylvain Gugger <[email protected]>
* fix name
* fix message
* remove assert and refactor
* refactor + make fixup
* refactor - add + sfety checker
* fix docstring + checkpoint names
* fix merge issues
* fix function name
* fix copies
* Apply suggestions from code review Co-authored-by: Sylvain Gugger <[email protected]>
* fix model checkpoint
* fix doctest output
* vit name on doc
* fix name on doc
* fix small nits
* fixed integration tests
* final changes - slow tests pass

Co-authored-by: Niels Rogge <[email protected]>
Co-authored-by: younesbelkada <[email protected]>
Co-authored-by: Younes Belkada <[email protected]>
Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>
What does this PR do?
This PR adds ViT hybrid to the library. As ViT hybrid uses BiT as its backbone, this PR also adds BiT as a standalone model.
BiT itself is very similar to a ResNetv2, except that it replaces batch norm layers with group norm and uses "weight standardized" convolutional layers.
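For readers unfamiliar with the two ingredients mentioned above, here is a minimal PyTorch sketch of a weight-standardized convolution followed by group norm. It is an illustration only, not the code added in this PR: the names `standardize_weight` and `WeightStandardizedConv2d`, the `eps` value, and the layer sizes are all assumptions chosen for the example.

```python
import torch
from torch import nn
import torch.nn.functional as F


def standardize_weight(weight: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Normalize each output filter to zero mean and unit variance.

    `weight` has shape (out_channels, in_channels, kH, kW); statistics are
    computed over every dimension except the output-channel one.
    """
    mean = weight.mean(dim=(1, 2, 3), keepdim=True)
    var = weight.var(dim=(1, 2, 3), keepdim=True, unbiased=False)
    return (weight - mean) / torch.sqrt(var + eps)


class WeightStandardizedConv2d(nn.Conv2d):
    """Hypothetical Conv2d variant that standardizes its kernel on every forward pass."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weight = standardize_weight(self.weight)
        return F.conv2d(
            x, weight, self.bias, self.stride, self.padding, self.dilation, self.groups
        )


# Group norm takes the place that batch norm would occupy in a plain ResNet block.
# The channel and group counts here are arbitrary illustrative values.
block = nn.Sequential(
    WeightStandardizedConv2d(3, 8, kernel_size=3, padding=1, bias=False),
    nn.GroupNorm(num_groups=4, num_channels=8),
    nn.ReLU(),
)

features = block(torch.randn(2, 3, 16, 16))  # shape: (2, 8, 16, 16)
```

Because group norm computes its statistics per sample rather than across the batch, a block like this behaves the same regardless of batch size, which is one commonly cited reason for preferring it over batch norm.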
To do: