Add `accelerate` support for M2M100 #19912

younesbelkada · 2022-10-26T22:42:29Z

What does this PR do?

This PR adds `accelerate` support to M2M100, which makes it possible to load NLLB models (which use the M2M100 architecture) in 8-bit with `load_in_8bit=True`.
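For reference, a minimal sketch of what the 8-bit loading call could look like. The checkpoint name `facebook/nllb-200-distilled-600M` and the helper names `nllb_8bit_kwargs` and `load_nllb_in_8bit` are assumptions chosen for illustration and are not part of this PR. Actually running the load is assumed to need a CUDA GPU with `bitsandbytes` and `accelerate` installed.

```python
def nllb_8bit_kwargs() -> dict:
    """Keyword arguments for 8-bit loading (hypothetical helper)."""
    # device_map="auto" lets accelerate decide where each module is placed;
    # load_in_8bit=True asks for the 8-bit weights mentioned in the PR.
    return {"device_map": "auto", "load_in_8bit": True}


def load_nllb_in_8bit(checkpoint: str = "facebook/nllb-200-distilled-600M"):
    """Load an NLLB checkpoint in 8-bit (hypothetical helper)."""
    # Imported lazily so that defining this function does not require
    # transformers to be installed.
    from transformers import AutoModelForSeq2SeqLM

    return AutoModelForSeq2SeqLM.from_pretrained(checkpoint, **nllb_8bit_kwargs())
```

Calling `load_nllb_in_8bit()` downloads the checkpoint, so it is left out of the sketch.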

This might contain a breaking change, but I am not sure.
When the model is initialized on the meta device with `accelerate`, the module `self.shared` is initialized and set to the correct device with `set_module_tensor_to_device` three times, since it is shared by 3 modules (base model, encoder, decoder), so it somehow ends up on the meta device.
Manually assigning a new module whose weights are the weights of the shared module should therefore do the trick. But I am wondering whether this is a breaking change, since the shared module of the encoder and decoder won't be "shared" anymore. It should not be a problem at inference time, but it can be problematic when training the model.
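To make the training concern concrete, here is a small sketch with plain `torch.nn.Embedding` modules. It is not the M2M100 code. It contrasts two ways of "assigning a new module with the shared weights": reusing the same `Parameter` object, which stays tied, and copying the values, which does not. Which of the two the PR ends up using is not asserted here, and the sizes and variable names are made up for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)
shared = nn.Embedding(10, 4)

# Variant 1: a new module that reuses the very same Parameter object.
tied = nn.Embedding(10, 4)
tied.weight = shared.weight

# Variant 2: a new module that only receives a copy of the values.
copied = nn.Embedding(10, 4)
with torch.no_grad():
    copied.weight.copy_(shared.weight)

same_before = torch.equal(copied.weight, shared.weight)

# One SGD step that only goes through `shared`.
optimizer = torch.optim.SGD(shared.parameters(), lr=0.1)
loss = shared(torch.tensor([1, 2, 3])).sum()
loss.backward()
optimizer.step()

tied_is_shared = tied.weight is shared.weight
tied_equal_after = torch.equal(tied.weight, shared.weight)
copied_equal_after = torch.equal(copied.weight, shared.weight)
```

Before the step all three modules hold identical values. After it, `tied` still follows `shared` because both point at one `Parameter`, while `copied` has drifted. That is harmless at inference time but matters for training.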

cc @sgugger

Also, I know T5 supports accelerate as well and uses shared embeddings. The only difference I see between the two implementations is that `_keys_to_ignore_on_load_missing` contains the shared weights for T5 but not for M2M100.

HuggingFaceDocBuilderDev · 2022-10-26T22:54:54Z

The documentation is not available anymore as the PR was closed or merged.

sgugger

LGTM, thanks for fixing!

sgugger · 2022-10-27T15:20:42Z

src/transformers/models/m2m_100/modeling_m2m_100.py

+        if embed_pos.device != inputs_embeds.device:
+            embed_pos = embed_pos.to(inputs_embeds.device)


I don't think you need the test: the `to` method already does that check and defaults to a no-op :-)
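A quick way to see this, as a sketch on CPU tensors (the variable names mirror the diff, but the tensors here are made up): when a tensor already has the requested device, `Tensor.to` hands back the same object, so the explicit device comparison adds nothing.

```python
import torch

inputs_embeds = torch.zeros(2, 3)
embed_pos = torch.ones(2, 3)

# Both tensors live on the CPU, so .to() has nothing to move.
moved = embed_pos.to(inputs_embeds.device)

# .to() returned the original tensor rather than a copy.
same_object = moved is embed_pos
```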

sgugger · 2022-10-27T15:20:51Z

src/transformers/models/m2m_100/modeling_m2m_100.py

+        if positions.device != inputs_embeds.device:
+            positions = positions.to(inputs_embeds.device)


add accelerate support for M2M100

f863d35

younesbelkada requested a review from sgugger October 26, 2022 22:42

sgugger approved these changes Oct 27, 2022

View reviewed changes

fix device set nit

208e9e6

younesbelkada force-pushed the add_m2m100_accelerate branch from a2171bf to 208e9e6 Compare October 27, 2022 15:37

younesbelkada merged commit d56d723 into huggingface:main Oct 27, 2022

younesbelkada mentioned this pull request Oct 27, 2022

Add accelerate support for BART-like models #19927

Merged

pszemraj mentioned this pull request Nov 21, 2022

Add accelerate support for LongT5 models #20341

Merged

younesbelkada mentioned this pull request Dec 9, 2022

[ViTHybrid] fix last accelerate slow test #20705

Merged


3 participants
