Skip DeepSpeed ZeRO Stage 3 model initialization when bnb #34395
LysandreJik merged 20 commits into huggingface:main
Conversation
SunMarc
left a comment
Thanks for the PR! Left a few comments.
    if hasattr(config, "quantization_config"):
        vision_config.quantization_config = config.quantization_config
        text_config.quantization_config = config.quantization_config
We don't want to have this in the model config; otherwise, we would have to do it for every model. Also, we shouldn't need it, since we quantize the model from the top level. Maybe we can propagate the quantization_config in from_pretrained?
quantization_config is already passed to the model class via from_pretrained, but the sub-models are instantiated using from_config, which does not receive it. Perhaps we can propagate the quantization information using a context manager.
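The context-manager idea can be sketched as a process-local flag that from_pretrained would set before building sub-models, so that nested from_config calls can consult it. This is a minimal sketch; the names set_quantized_state and is_quantized_state are illustrative assumptions, not the actual transformers API:

```python
from contextlib import contextmanager

# Module-level flag; in a real integration this would live in
# modeling_utils. All names here are hypothetical.
_quantized_state = False

@contextmanager
def set_quantized_state():
    """Mark that the model currently being built is intended to be quantized."""
    global _quantized_state
    _quantized_state = True
    try:
        yield
    finally:
        _quantized_state = False

def is_quantized_state():
    """Queried at init time to decide whether to skip ZeRO-3 partitioned init."""
    return _quantized_state
```

from_pretrained would wrap sub-model construction in set_quantized_state(); any from_config call inside that block then sees the flag and can skip deepspeed.zero.Init.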
src/transformers/modeling_utils.py
Outdated
    is_quantized = hasattr(config, "quantization_config")

    if is_deepspeed_zero3_enabled() and not is_quantized:
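The quoted guard enters DeepSpeed's partitioned initialization only when ZeRO-3 is active and the config carries no quantization_config, so the quantizer sees real (non-partitioned) weights. A minimal, DeepSpeed-free stand-in of that decision, with string return values in place of the real deepspeed.zero.Init() context:

```python
from types import SimpleNamespace

def choose_init(config, zero3_enabled):
    # Mirrors the quoted check: models about to be quantized skip
    # ZeRO-3 partitioned init.
    is_quantized = hasattr(config, "quantization_config")
    if zero3_enabled and not is_quantized:
        return "zero.Init"   # real code: with deepspeed.zero.Init(...)
    return "regular"
```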
Not a huge fan of checking the quantization_config here, since we don't really quantize the model with from_config. However, I'm not sure there is an easier solution. Another option would be to pass an arg in kwargs that we then pop.
If we pass an argument to flag quantization, it would require changes in every composed model.
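For comparison, the rejected kwargs alternative would look roughly like this: every caller, i.e. every composed model, would have to pass the flag explicitly, which is the maintenance cost noted above. The is_quantized keyword and the function name are hypothetical:

```python
def from_config_sketch(config, **kwargs):
    # Pop the flag so it is not forwarded to the model constructor.
    is_quantized = kwargs.pop("is_quantized", False)
    use_zero_init = not is_quantized  # gate for deepspeed.zero.Init()
    return use_zero_init, kwargs
```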
@SunMarc @muellerzr What do you think of this solution?
muellerzr
left a comment
Thanks, I think this solution looks quite nice. cc @ArthurZucker
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
SunMarc
left a comment
Works for me! Thanks for iterating and coming up with this solution!
…e#34395)
* Skip DeepSpeed ZeRO Stage 3 model initialization when it is intended to be quantized.
* Propagate the quantization state using a context manager
* make fixup
What does this PR do?
Skip DeepSpeed ZeRO Stage 3 model initialization when it is intended to be quantized.
Fixes #34378