[training] fixes to the quantization training script and add AdEMAMix optimizer as an option #9806
Conversation
```diff
@@ -1059,7 +1076,7 @@ def get_sigmas(timesteps, n_dim=4, dtype=torch.float32):
     )

     # handle guidance
-    if transformer.config.guidance_embeds:
+    if unwrap_model(transformer).config.guidance_embeds:
```
So that things are compatible with DeepSpeed.
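For context, a minimal sketch of why the unwrap matters under DeepSpeed. The `unwrap_model` helper below mirrors the pattern used in the diffusers training scripts (it assumes an `accelerator` object in scope) and is illustrative rather than copied from this diff:

```python
# Under accelerate + DeepSpeed (or DDP), `transformer` is a wrapped engine/module,
# so attributes like `.config` live on the inner model. Unwrap before reading them.
def unwrap_model(model):
    model = accelerator.unwrap_model(model)  # strips DDP/DeepSpeed wrappers
    if hasattr(model, "_orig_mod"):          # torch.compile adds another wrapper layer
        model = model._orig_mod
    return model
```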
```diff
 vae_scale_factor = 2 ** (len(vae_config_block_out_channels) - 1)

 latent_image_ids = FluxPipeline._prepare_latent_image_ids(
     model_input.shape[0],
-    model_input.shape[2],
-    model_input.shape[3],
+    model_input.shape[2] // 2,
+    model_input.shape[3] // 2,
     accelerator.device,
     weight_dtype,
 )
```
Follows what we do in the Flux LoRA scripts.
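As a side note, a rough sketch of why the `// 2` is needed. The `_pack_latents` call below follows how the Flux LoRA scripts pack latents and is shown here only for illustration:

```python
# Flux packs latents into 2x2 spatial patches before the transformer, so the
# token grid that `_prepare_latent_image_ids` indexes is half the latent
# resolution per spatial dim -- hence passing `shape[2] // 2` and `shape[3] // 2`.
packed = FluxPipeline._pack_latents(
    model_input,
    batch_size=model_input.shape[0],
    num_channels_latents=model_input.shape[1],
    height=model_input.shape[2],
    width=model_input.shape[3],
)
# packed.shape -> (batch, (height // 2) * (width // 2), num_channels * 4)
```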
```python
height=model_input.shape[2] * vae_scale_factor,
width=model_input.shape[3] * vae_scale_factor,
```
Same as above.
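For context, these keyword arguments presumably belong to the `_unpack_latents` call, as in the Flux LoRA scripts; the surrounding lines below are a reconstruction, not part of this diff:

```python
model_pred = FluxPipeline._unpack_latents(
    model_pred,
    height=model_input.shape[2] * vae_scale_factor,
    width=model_input.shape[3] * vae_scale_factor,
    vae_scale_factor=vae_scale_factor,
)
```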
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Cool! Did you get a chance to play with AdEMAMix? Should we consider adding it to the other scripts as well?
examples/research_projects/flux_lora_quantization/train_dreambooth_lora_flux_miniature.py
Testing the memory requirement as we speak. Will report back.
```python
else:
    optimizer_class = bnb.optim.AdEMAMix

optimizer = optimizer_class(params_to_optimize)
```
Should we support `betas` and `weight_decay` here? We could use the existing args like we did for Prodigy, i.e.

```python
optimizer = optimizer_class(
    params_to_optimize,
    betas=(args.adam_beta1, args.adam_beta2),
    weight_decay=args.adam_weight_decay,
)
```
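For reference, this is roughly how the Prodigy branch is wired in the canonical Flux LoRA script (reproduced from memory, so argument names may differ slightly):

```python
optimizer = optimizer_class(
    params_to_optimize,
    betas=(args.adam_beta1, args.adam_beta2),
    beta3=args.prodigy_beta3,
    weight_decay=args.adam_weight_decay,
    eps=args.adam_epsilon,
    decouple=args.prodigy_decouple,
    use_bias_correction=args.prodigy_use_bias_correction,
    safeguard_warmup=args.prodigy_safeguard_warmup,
)
```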
Umm, I deliberately didn't, to keep the separation of concerns very clear. We could maybe revisit this if the community finds the optimizer worth pursuing?
@linoytsaban seems identical to me. Let's maybe wait a bit before we propagate this to the canonical scripts? Meanwhile, can I merge this PR? Internal thread: https://huggingface.slack.com/archives/C04NNCRFYUQ/p1730299681912129
With ...

```python
image = pipeline(
    "a puppy in a pond, yarn art style", num_inference_steps=28, guidance_scale=3.5, height=768
).images[0]
```
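For completeness, a hypothetical setup for the snippet above (the model ID, dtype, and LoRA weights path are assumptions, not taken from this thread):

```python
import torch
from diffusers import FluxPipeline

pipeline = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
# Load the LoRA produced by the training script (path is hypothetical).
pipeline.load_lora_weights("path/to/trained-flux-lora")
```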
@sayakpaul yeah sounds good to me :) let's 🛳️
What does this PR do?

Fixes a couple of things in the quantization training script and adds AdEMAMix as an optimizer option: https://hf.co/papers/2409.03137.
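For readers new to the optimizer, here is a rough sketch of the AdEMAMix update rule as described in the paper. This is illustrative only (the script uses `bnb.optim.AdEMAMix`), it omits the `alpha`/`beta3` warm-up schedulers from the paper, and the function name and default hyperparameters are assumptions:

```python
import torch

def ademamix_step(p, g, state, step, lr=1e-4, betas=(0.9, 0.999, 0.9999),
                  alpha=5.0, eps=1e-8, weight_decay=0.0):
    """One AdEMAMix update for parameter `p` with gradient `g` (sketch only)."""
    beta1, beta2, beta3 = betas
    m1, m2, v = state["m1"], state["m2"], state["v"]  # all initialised to zeros_like(p)
    m1.mul_(beta1).add_(g, alpha=1 - beta1)           # fast EMA of gradients (Adam's m)
    m2.mul_(beta3).add_(g, alpha=1 - beta3)           # slow EMA that remembers "older" gradients
    v.mul_(beta2).addcmul_(g, g, value=1 - beta2)     # second-moment EMA (Adam's v)
    m1_hat = m1 / (1 - beta1 ** step)                 # bias-correct the fast EMA only
    v_hat = v / (1 - beta2 ** step)
    update = (m1_hat + alpha * m2) / (v_hat.sqrt() + eps)
    p.add_(update + weight_decay * p, alpha=-lr)      # decoupled weight decay, as in AdamW
```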