Conversation


@NielsRogge NielsRogge commented Dec 2, 2022

What does this PR do?

This PR adds ViT hybrid to the library. As ViT hybrid uses BiT as a backbone, this PR also adds BiT as a standalone model.

BiT itself is very similar to ResNetv2, except that it replaces the batch norm layers with group norm and uses "weight-standardized" convolutional layers.
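For readers unfamiliar with the change, here is a minimal sketch (not the code added in this PR) of what a weight-standardized convolution paired with group norm looks like; the class name `WSConv2d` is purely illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class WSConv2d(nn.Conv2d):
    """Conv2d that standardizes its kernel (zero mean, unit std per output filter) before convolving."""

    def forward(self, x):
        w = self.weight
        # Normalize each output filter over its (in_channels, kh, kw) entries.
        mean = w.mean(dim=(1, 2, 3), keepdim=True)
        std = w.std(dim=(1, 2, 3), keepdim=True) + 1e-6
        return F.conv2d(x, (w - mean) / std, self.bias, self.stride, self.padding, self.dilation, self.groups)


# BiT-style stem stub: weight-standardized conv followed by GroupNorm instead of BatchNorm.
conv = WSConv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
norm = nn.GroupNorm(num_groups=32, num_channels=64)
out = norm(conv(torch.randn(1, 3, 224, 224)))
```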

To do:

  • add image processors
  • add tests for image processors (cc @amyeroberts: can I directly add test_modeling_image_processor_xxx.py?)
  • transfer all checkpoints
  • add integration tests

Comment on lines 57 to 58
# Copied from transformers.models.vit.modeling_vit.ViTEmbeddings with ViT->ViTHybrid
class ViTHybridEmbeddings(nn.Module):

I had to remove the `Copied from` to address https://github.com/huggingface/transformers/pull/20550/files#r1039877563, so let's keep this in mind.
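For context, the `Copied from` marker is enforced by `utils/check_copies.py` (run via `make fix-copies`): the marked class must match the referenced source exactly after the `ViT->ViTHybrid` rename, so once the embeddings diverge the marker has to go. A rough illustration of the convention, not code from this PR:

```python
import torch.nn as nn


# The check reads ViTEmbeddings, applies the textual rename ViT -> ViTHybrid,
# and fails `make fix-copies` if the block below differs from the result.
# Copied from transformers.models.vit.modeling_vit.ViTEmbeddings with ViT->ViTHybrid
class ViTHybridEmbeddings(nn.Module):
    ...
```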

@younesbelkada younesbelkada requested a review from sgugger December 5, 2022 19:06
@younesbelkada younesbelkada mentioned this pull request Dec 6, 2022

@sgugger sgugger left a comment

Still some problems with the inconsistent checkpoint names and the model type for ViTHybridConfig. Also make sure the actual model repos are in the right places on the Hub.

This is the configuration class to store the configuration of a [`BitModel`]. It is used to instantiate a BiT
model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
defaults will yield a similar configuration to that of the BiT
[google/resnetnv2-50](https://huggingface.co/google/resnetnv2-50) architecture.

Needs to be updated to google/bit-50
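For reference, a minimal sketch of how the corrected docstring reference would be exercised, assuming the final `google/bit-50` repo name from this review:

```python
from transformers import BitConfig, BitModel

# Default configuration mirrors the google/bit-50 architecture the docstring points at.
config = BitConfig()
model = BitModel(config)

# Loading the published checkpoint directly (repo name per the review comment above).
model = BitModel.from_pretrained("google/bit-50")
```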



VIT_HYBRID_PRETRAINED_MODEL_ARCHIVE_LIST = [
"google/vit-base-r50-s16-384",

Same, checkpoint needs to be the vit-hybrid one.
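A sketch of what the corrected list would look like, assuming the ViT hybrid checkpoint ends up under `google/vit-hybrid-base-bit-384` on the Hub:

```python
VIT_HYBRID_PRETRAINED_MODEL_ARCHIVE_LIST = [
    "google/vit-hybrid-base-bit-384",
    # ... any additional ViT hybrid checkpoints transferred to the google org
]
```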


younesbelkada commented Dec 6, 2022

Thanks so much @sgugger for your review!
Everything should now be updated, and the main models are up:

@younesbelkada younesbelkada merged commit d151a8c into huggingface:main Dec 7, 2022
mpierrau pushed a commit to mpierrau/transformers that referenced this pull request Dec 15, 2022
* First draft

* More improvements

* Add backbone, first draft of ViT hybrid

* Add AutoBackbone

* More improvements

* Fix bug

* More improvements

* More improvements

* Convert ViT-hybrid

* More improvements

* add patch bit

* Fix style

* Improve code

* cleaned v1

* more cleaning

* more refactoring

* Improve models, add tests

* Add docs and tests

* Make more tests pass

* Improve default backbone config

* Update model_type

* Fix more tests

* Add more copied from statements

* More improvements

* Add push to hub to conversion scripts

* clean

* more cleanup

* clean

* replace to

* fix

* Update src/transformers/models/bit/configuration_bit.py

Co-authored-by: NielsRogge <[email protected]>

* fix base model prefix

* more cleaning

* get rid of stem

* clean

* replace flag

* Update src/transformers/models/bit/configuration_bit.py

Co-authored-by: NielsRogge <[email protected]>

* Update src/transformers/models/bit/configuration_bit.py

Co-authored-by: NielsRogge <[email protected]>

* add check

* another check

* fix for hybrid vit

* final fix

* update config

* fix class name

* fix `make fix-copies`

* remove `use_activation`

* Update src/transformers/models/bit/configuration_bit.py

* rm unneeded file

* Add BiT image processor

* rm unneeded file

* add doc

* Add image processor to conversion script

* Add ViTHybrid image processor

* Add resources

* Move bit to correct position

* Fix auto mapping

* Rename hybrid to Hybrid

* Fix name in toctree

* Fix READMEs

* Improve config

* Simplify GroupNormActivation layer

* fix test + make style

* Improve config

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <[email protected]>

* remove comment

* remove comment

* replace

* replace

* remove all conv_layer

* refactor norm_layer

* revert x

* add copied from

* last changes + integration tests

* make fixup

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* fix name

* fix message

* remove assert and refactor

* refactor + make fixup

* refactor - add + safety checker

* fix docstring + checkpoint names

* fix merge issues

* fix function name

* fix copies

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* fix model checkpoint

* fix doctest output

* vit name on doc

* fix name on doc

* fix small nits

* fixed integration tests

* final changes - slow tests pass

Co-authored-by: Niels Rogge <[email protected]>
Co-authored-by: younesbelkada <[email protected]>
Co-authored-by: Younes Belkada <[email protected]>
Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>