[LoRA] introduce `LoraBaseMixin` to promote reusability. #8670

Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
src/diffusers/loaders/lora.py (Outdated)
def fuse_lora(
    self,
    fuse_unet: bool = True,
    fuse_text_encoder: bool = True,
    lora_scale: float = 1.0,
    safe_fusing: bool = False,
    adapter_names: Optional[List[str]] = None,
):
So that the earlier public methods still work as expected. It also showcases the power of reusability nicely.
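A quick usage sketch of what that looks like from the user's side (the model and LoRA repo IDs below are placeholders, not taken from this PR):

```python
from diffusers import StableDiffusionXLPipeline

# Placeholder checkpoints; any SDXL base model plus a compatible LoRA would do.
pipe = StableDiffusionXLPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0")
pipe.load_lora_weights("some-user/some-sdxl-lora", adapter_name="style")

# The pre-existing pipeline-level arguments keep working as before.
pipe.fuse_lora(fuse_unet=True, fuse_text_encoder=True, lora_scale=0.8)
pipe.unfuse_lora()
```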
@@ -522,115 +547,18 @@ def load_lora_into_text_encoder(
                _pipeline.enable_sequential_cpu_offload()
        # Unsafe code />

    @classmethod
    def load_lora_into_transformer(cls, state_dict, network_alphas, transformer, adapter_name=None, _pipeline=None):
This method is only needed for the `AmusedPipeline`. It's very safe to say that LoRA + Amused isn't used that much, and it is convoluting this class unnecessarily. To address this, I introduced an `AmusedLoaderMixin` in this PR and made the corresponding changes in the `train_amused.py` file. This is the only instance where LoRA + Amused is used in our codebase.
@classmethod
def save_lora_weights(
    cls,
    save_directory: Union[str, os.PathLike],
    unet_lora_layers: Dict[str, Union[torch.nn.Module, torch.Tensor]] = None,
    text_encoder_lora_layers: Dict[str, torch.nn.Module] = None,
    transformer_lora_layers: Dict[str, torch.nn.Module] = None,
No need for this argument to be here.
@Beinsezii could you test with this PR? Thank you in advance.
Thanks for extending the LoRA functionality to make it work better with transformers. Also thanks for describing your approach. I have not yet done a full review, as I wanted to ask one thing for my better understanding:
Would it have been possible to roll the 2nd point into the 1st, i.e. introduce the new functionality inside of […]? If it is important to keep […]
So, for historical reasons, […]

This way, we will be able to deprecate the […]

I am okay with renaming the […]
Thanks a lot for explaining.
Indeed, this is the part that confused me.
Regarding the last point, you mean a subclass without methods, so basically just an alias (+ potentially a deprecation message)? Is this so that user code that relies on […]?

This would probably be better named as […]

No, I meant that […]
No, we cannot remove […]

To keep the scope of this PR relatively manageable, I would prefer to handle the deprecation part in a future PR. I will address the rest of the feedback for this PR and ping you once done. As always, thanks for being thorough! This helps a lot!
Using both the minimal repro code in #8565 and my own app on this branch, I tried about a dozen different combinations of multiple LoRAs, adapter strengths, and [non]fuses. Seems to work effectively the same as the XL/SD pipes now for my cases. All done on AMD gfx1100; other compute devices may vary.
@Beinsezii perfect, thanks for testing!
Changed the title from "[LoRA] introduce LoraUtilsMixin to promote reusability." to "[LoRA] introduce LoraBaseMixin to promote reusability."
    USE_PEFT_BACKEND,
    is_torch_xla_available,
    logging,
    replace_example_docstring,
    scale_lora_layers,
    unscale_lora_layers,
Changes to support the results of `make fix-copies`.
@@ -25,13 +25,17 @@
)

from ...image_processor import PipelineImageInput, VaeImageProcessor
from ...loaders import SD3LoraLoaderMixin
Changes to support the results of `make fix-copies`.
pipeline_class = StableDiffusion3Pipeline

def get_dummy_components(self):
No need, as we more thoroughly cover all kinds of tests in `PeftLoraLoaderMixinTests` already.
@BenjaminBossan please take it away for review.
Thanks for refactoring the LoRA mixin classes to be easier to extend to new pipelines in the future.
This one is honestly quite difficult for me to review, as there have been a lot of changes but mostly they're about moving things around to better organize the code. Therefore, I haven't checked line by line but rather tried to understand the overall direction of the change.
When it comes to correctness, the tests should cover this. There have been some changes to the tests at the same time, so it's not completely clear to me if the tests still cover the exact same cases (or strictly more) as previously. Maybe in the future it would make sense to split the refactor of the code from the refactor of the tests. That way, we can have more confidence that the refactor of the code does not break anything.
Overall, I think the direction of the change is a good one, so from my point of view, this looks good.
@classmethod
def _best_guess_weight_name(
    cls, pretrained_model_name_or_path_or_dict, file_extension=".safetensors", local_files_only=False

def fuse_lora(
Does this method only exist to map `fuse_unet` to `fuse_denoiser`? If yes, this could be added as a comment. Same for other similar methods.
Done.
I don't think this method is needed; we can just map the argument from the method in `LoraBaseMixin`. We should have this class for loading only and `LoraBaseMixin` for the other methods.
@yiyixuxu I hear your point, but I feel relatively strongly about having this method act as an interface (as explained in the comments). Otherwise, the `fuse_lora()` method would need to have `fuse_unet`, `fuse_transformer` arguments, and we may have to support other arguments based on the kind of diffusion backbone newly introduced in the literature (a Mamba, for example).

With the current implementation, we have a nice logical segregation of the concepts through the `fuse_denoiser` argument. The denoiser here can be any model architecture in practice, as long as there's compatibility and support.

Furthermore, when users call `fuse_lora()` on an SD3 pipeline, they would also see `fuse_unet` in the docstrings or in their IDEs. This can be confusing for them.

So, keeping all of this in mind, I would prefer the current implementation.
We can start to deprecate `fuse_unet`, `fuse_transformer`, etc. and only use `fuse_denoiser` from this point on, no? When `fuse_unet` or `fuse_transformer` is passed in `**kwargs`, we can map it to `fuse_denoiser`.
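A minimal sketch of the suggested mapping (the stand-in class name, warning text, and exact wiring are assumptions for illustration, not the code that ended up in the PR):

```python
import warnings


class SD3LoraLoaderMixin:  # illustrative stand-in for the pipeline-level mixin
    def fuse_lora(self, fuse_denoiser: bool = True, fuse_text_encoder: bool = True, **kwargs):
        # Map the legacy, backbone-specific flags onto the unified argument.
        for legacy in ("fuse_unet", "fuse_transformer"):
            if legacy in kwargs:
                warnings.warn(
                    f"`{legacy}` is deprecated. Please use `fuse_denoiser` instead.",
                    FutureWarning,
                )
                fuse_denoiser = kwargs.pop(legacy)
        # ... actual fusing of the denoiser / text-encoder LoRA layers happens here ...
```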
Yes that makes perfect sense to me. Will do that :)
b404a32 should have taken care of it.
logger = logging.get_logger(__name__)


class SD3TransformerLoadersMixin:
Is this basically on the same level as `LoraLoaderMixin`, but it doesn't make sense for it to inherit from `LoraBaseMixin`?
No, this is analogous to https://github.com/huggingface/diffusers/blob/main/src/diffusers/loaders/unet.py, which implements the LoRA functionalities at the model level. `LoraBaseMixin` is for pipelines. Hence I stated in the PR description:

> Currently, we re-implement some methods in both UNet2DLoaderMixin and SD3TransformerLoaderMixin with the "# Copied from ..." mechanism. In a future PR, we can consider introducing a ModelLoaderMixin so that we can share methods like set_adapter() on the ModelMixin level. This should be relatively easy to do.
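To illustrate the model-level vs. pipeline-level split (the method names below are illustrative placeholders; only the two class names come from the discussion):

```python
class SD3TransformerLoadersMixin:
    """Mixed into the transformer *model*: handles LoRA layers on this module only."""

    def load_lora_adapter_into_self(self, state_dict, adapter_name=None):
        # placeholder: inject LoRA layers into this single model, e.g. via PEFT
        ...


class LoraBaseMixin:
    """Mixed into *pipelines*: orchestrates LoRA across all components (denoiser, text encoders)."""

    def set_adapters(self, adapter_names, adapter_weights=None):
        # placeholder: forward the call to each LoRA-capable component of the pipeline
        ...
```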
Fair concern, but the refactoring here genuinely affects the tests too, so I couldn't find a better way out :( For example, we moved […] The latter made more sense to me. Hopefully that helps clarify why the test-related changes had to be reflected here.
text_encoder_module.lora_magnitude_vector[
    adapter_name
] = text_encoder_module.lora_magnitude_vector[adapter_name].to(device)


class StableDiffusionXLLoraLoaderMixin(LoraLoaderMixin):
No need to inherit from `LoraLoaderMixin`, no? It defines its own `load_lora_weights` and `save_lora_weights`.

We are not crazy about inheritance, but with this we have:

pipeline -> StableDiffusionXLLoraLoaderMixin -> LoraLoaderMixin -> LoraBaseMixin

I think it is a bit too much. Can we try to make it work with something like `XXPipeline(StableDiffusionXLLoraLoaderMixin, LoraBaseMixin)`?
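To make the two layouts concrete, here is a small, self-contained sketch with stand-in classes (none of this is the actual diffusers code; the `Deep`/`Flat` names are invented for the comparison):

```python
# Stand-ins so the two inheritance layouts can be compared side by side.
class LoraBaseMixin: ...
class LoraLoaderMixin(LoraBaseMixin): ...


# Layout currently in the PR (deeper chain):
class StableDiffusionXLLoraLoaderMixin(LoraLoaderMixin): ...
class DeepPipeline(StableDiffusionXLLoraLoaderMixin): ...


# Flatter layout suggested above: the SDXL mixin no longer inherits the SD loader,
# and the pipeline mixes in the base class directly.
class FlatSDXLLoraLoaderMixin: ...
class FlatPipeline(FlatSDXLLoraLoaderMixin, LoraBaseMixin): ...


print([c.__name__ for c in DeepPipeline.__mro__])
# ['DeepPipeline', 'StableDiffusionXLLoraLoaderMixin', 'LoraLoaderMixin', 'LoraBaseMixin', 'object']
print([c.__name__ for c in FlatPipeline.__mro__])
# ['FlatPipeline', 'FlatSDXLLoraLoaderMixin', 'LoraBaseMixin', 'object']
```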
It shares the `load_lora_into_unet()` and `load_lora_into_text_encoder()` methods from `LoraLoaderMixin`, which include the logic for loading Kohya and other popular non-diffusers LoRA checkpoints. If we proceed with the option you suggested, we would need copies of those methods in `StableDiffusionXLLoraLoaderMixin`.
Can we move these shared methods into `LoraBaseMixin` then?

XXPipeline -> XXLoraLoaderMixin -> LoraBaseMixin
Well, `load_lora_into_unet()` isn't applicable to Transformer-based diffusion pipelines. Plus, `StableDiffusionXLLoraLoaderMixin` also shares the `lora_state_dict()` method from `LoraLoaderMixin`, which is again very specific to the SD family and not shared by other classes of pipelines such as SD3.
OK, let's copy the method then.

> It shares the load_lora_into_unet() and load_lora_into_text_encoder() methods from LoraLoaderMixin which includes logic for loading Kohya and other popular non-diffusers LoRA checkpoints.

I only found one `load_lora_into_text_encoder` method on `LoraBaseMixin`, no? So the only thing we need is to copy `load_lora_into_unet` to `StableDiffusionXLLoraLoaderMixin`.

Also, does it make sense to rename `LoraLoaderMixin` to `StableDiffusionLoraLoaderMixin`?
On it.
Done on ad532a0. LMK.
…)" This reverts commit a2071a1.
* introduce to promote reusability. * up * add more tests * up * remove comments. * fix fuse_nan test * clarify the scope of fuse_lora and unfuse_lora * remove space
What does this PR do?
TL;DR: Introduce a `LoraBaseMixin` class to reuse LoRA handling methods like `set_adapters()` across different pipelines as much as possible.

Long description
Currently, we have a `LoraLoaderMixin` class that implements several crucial methods for dealing with LoRAs. These methods include `fuse_lora()`, `unfuse_lora()`, `set_adapters()`, `unload_lora()`, `disable_lora()`, and so on.

However, this class is very rigid in the sense that it only applies to the Stable Diffusion family of models. This is because it relies on the `unet` component of an underlying pipeline. But we have started to see that Transformers are more and more used as the denoiser. Some popular examples include SD3, Hunyuan DiT, and PixArt-{Sigma,Alpha}. So, we cannot make use of the above methods directly for these pipelines.

There are two ways to deal with this:

1. Have copies of the existing loader class with the `unet` component replaced with `transformer`.
2. Introduce a base class (`LoraBaseMixin`) that implements the shared LoRA handling methods and can be reused across pipelines.

The PR takes the latter approach.
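A minimal sketch of the base-class idea (the `_lora_loadable_modules` attribute and the dispatch logic here are assumptions for illustration, not the actual implementation in this PR):

```python
class LoraBaseMixin:
    # Each pipeline-specific subclass would declare its own LoRA-capable components.
    _lora_loadable_modules = ["unet", "transformer", "text_encoder"]

    def set_adapters(self, adapter_names, adapter_weights=None):
        if isinstance(adapter_names, str):
            adapter_names = [adapter_names]
        for module_name in self._lora_loadable_modules:
            component = getattr(self, module_name, None)
            if component is None:
                continue  # e.g. SD1.5 has no `transformer`, SD3 has no `unet`
            # placeholder for the PEFT-backed call that activates the adapters
            if hasattr(component, "set_adapters"):
                component.set_adapters(adapter_names, adapter_weights)
```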
If we just factor in the code changes, we'd notice that this PR substantially reduces the LoCs, gracefully promotes reusability, and reduces the friction to incorporate a new pipeline that uses a non-UNet denoiser backbone.
Currently, we re-implement some methods in both `UNet2DLoaderMixin` and `SD3TransformerLoaderMixin` with the "# Copied from ..." mechanism. In a future PR, we can consider introducing a `ModelLoaderMixin` so that we can share methods like `set_adapter()` on the `ModelMixin` level. This should be relatively easy to do.

This PR will also make it easy to add LoRA support for other pipelines such as Hunyuan DiT, PixArt Sigma, etc. Broadly, all we have to do is implement the following methods (see the sketch below):
* `load_lora_into_denoiser()`, where the denoiser can be a UNet or a Transformer
* `load_lora_weights()`
* […] `PeftLoaderMixin` […]
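A rough, hypothetical sketch of those steps for a new transformer-based pipeline (all names here other than `LoraBaseMixin` are invented for illustration, and the helpers are assumed, not taken from the PR):

```python
class LoraBaseMixin:  # stand-in for the base class introduced in this PR
    ...


class NewDiTLoraLoaderMixin(LoraBaseMixin):
    def load_lora_weights(self, pretrained_model_name_or_path_or_dict, adapter_name=None, **kwargs):
        # 1) resolve the checkpoint into a flat LoRA state dict (helper assumed)
        state_dict = self.lora_state_dict(pretrained_model_name_or_path_or_dict, **kwargs)
        # 2) inject the LoRA layers into the denoiser; here the denoiser is a Transformer,
        #    for SD-style pipelines it would be a UNet.
        self.load_lora_into_denoiser(state_dict, denoiser=self.transformer, adapter_name=adapter_name)

    @classmethod
    def load_lora_into_denoiser(cls, state_dict, denoiser, adapter_name=None):
        # placeholder: attach the LoRA layers (e.g. via PEFT) to `denoiser`
        ...
```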
I am going to request a review from @BenjaminBossan first. Once things look good, I will request reviews from other maintainers.