Conversation

@younesbelkada (Contributor)

What does this PR do?

This PR forces some modules to be initialized in the correct place (i.e. in the _init_weights method).
With more vision models being added, contributors keep copying the practice of initializing some weights outside _init_weights. I think we should centralize weight initialization in the _init_weights method, starting with the most-copied / most-downloaded models.
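For illustration, here is a minimal sketch of the target pattern (the MyEmbeddings / MyPreTrainedModel names are placeholders, not actual transformers classes): parameters are only allocated in __init__, while the random fill lives in _init_weights.

import torch
import torch.nn as nn

class MyEmbeddings(nn.Module):
    def __init__(self, num_positions: int, hidden_size: int):
        super().__init__()
        # only allocate the parameters here, no nn.init.trunc_normal_ call
        self.cls_token = nn.Parameter(torch.zeros(1, 1, hidden_size))
        self.position_embeddings = nn.Parameter(torch.zeros(1, num_positions, hidden_size))

class MyPreTrainedModel:
    initializer_range = 0.02

    def _init_weights(self, module):
        # centralized initialization, mirroring the ViT diff shown further down
        if isinstance(module, MyEmbeddings):
            nn.init.trunc_normal_(module.position_embeddings, mean=0.0, std=self.initializer_range)
            nn.init.trunc_normal_(module.cls_token, mean=0.0, std=self.initializer_range)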

Related:

- initialization on `_init_weights`
- fix copies
@younesbelkada changed the title from "[Vision] Initialize weights on the correct place" to "[Vision] [Refactor] Initialize weights on the correct place" on Dec 16, 2022
@HuggingFaceDocBuilderDev commented Dec 16, 2022

The documentation is not available anymore as the PR was closed or merged.

@NielsRogge (Contributor) left a comment

Thanks for making this cleaner!

@ArthurZucker (Collaborator) left a comment

LGTM, thanks for this! Let's just add a "Copied from" comment for the `__init__` function.

@sgugger (Collaborator) left a comment

Yes, it is way better to do it this way (otherwise using device_map="auto", or more generally _fast_init in from_pretrained, results in those weights not being properly initialized).
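For context, a hedged usage sketch of the loading path mentioned above (it assumes the accelerate package is installed for device_map="auto" and reuses the checkpoint name from this thread):

from transformers import ViTModel

# Loading with device_map="auto" skips the usual random initialization for
# weights found in the checkpoint, so anything missing from the checkpoint
# must be handled correctly by _init_weights.
model = ViTModel.from_pretrained(
    "google/vit-base-patch16-224-in21k",
    device_map="auto",
)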

@younesbelkada merged commit ecd7de3 into huggingface:main on Dec 19, 2022
Comment on lines +454 to +465
elif isinstance(module, ViTEmbeddings):
nn.init.trunc_normal_(
module.position_embeddings,
mean=0.0,
std=self.config.initializer_range,
)

nn.init.trunc_normal_(
module.cls_token,
mean=0.0,
std=self.config.initializer_range,
)
@sayakpaul (Member)

@younesbelkada do you mind filling me in a bit about this refactor? I would greatly appreciate it.

@younesbelkada (Contributor, Author) commented Dec 29, 2022

Hey @sayakpaul!

I just moved the lines above into this place. Before this refactor, the attributes position_embeddings & cls_token were initialized on the fly inside ViTEmbeddings, i.e. whenever we created an instance of ViTEmbeddings.

But this approach is not the one we want to follow: it is preferable to centralize the whole weight-initialization process inside the _init_weights method, so that it is only called when we actually need it. Consider this snippet:

from transformers import ViTModel

model = ViTModel.from_pretrained("google/vit-base-patch16-224-in21k")

I think what we ideally want here is to:
1- Create an instance of ViTModel
2- Load the pre-trained weights directly into it
There is no need to fill the weights with the intended distribution here, since they will be overwritten by the pre-trained weights anyway. Also note that not calling _init_weights speeds up the loading of large models (I believe this is one of the main reasons it is skipped by default in the from_pretrained method).

That is why we always prefer to do it in two stages:
1- initialize each module and its submodules
2- fill the weights of these modules with the correct distribution only if needed
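A rough conceptual sketch of these two stages (only an illustration; load_pretrained and its logic are made up for this example and are not the actual transformers loading code):

def load_pretrained(model, state_dict):
    # stage 1 already happened: model = MyModel(config) created all submodules
    # stage 2a: copy the pre-trained weights that exist in the checkpoint
    missing, _unexpected = model.load_state_dict(state_dict, strict=False)
    # stage 2b: randomly initialize only what the checkpoint did not provide
    for name, module in model.named_modules():
        if any(key == name or key.startswith(name + ".") for key in missing):
            model._init_weights(module)
    return model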
In rare cases you can hit unexpected behaviour when doing everything in stage 1: e.g. before this PR, if you loaded a ViT model with torch_dtype=torch.float16 you would face this error:

RuntimeError: "erfinv_vml_cpu" not implemented for 'Half'

You can reproduce it with this snippet:

import torch
import torch.nn as nn

# mimic what from_pretrained does internally when torch_dtype=torch.float16 is passed
torch.set_default_dtype(torch.float16)

# trunc_normal_ relies on erfinv, which has no float16 kernel on CPU
nn.init.trunc_normal_(
    torch.zeros(1, 1, 2),
    mean=0.0,
    std=0.1,
)

This happens simply because torch.set_default_dtype(torch.float16) is called under the hood when torch_dtype=torch.float16 is passed, and such errors can be very confusing for users!
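After this refactor, the same fp16 loading path should no longer hit that error, since the trunc_normal_ calls only run inside _init_weights (a small sanity-check sketch, assuming the refactored ViT code):

import torch
from transformers import ViTModel

# The pre-trained weights are loaded directly, so no trunc_normal_ call has to
# run on a half-precision tensor during loading.
model = ViTModel.from_pretrained(
    "google/vit-base-patch16-224-in21k",
    torch_dtype=torch.float16,
)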

@sayakpaul (Member)

Beauty! Thanks for being so generous with your explanation!

Contributor

@younesbelkada, thank you for tracking the issue.

silverriver pushed a commit to silverriver/transformers that referenced this pull request Jan 6, 2023
…gface#20803)

* fix nit

- initialization on `_init_weights`
- fix copies

* add copied from