
Flex.1 Alpha LoRA/Finetuning #3056

Open
stepfunction83 opened this issue Jan 21, 2025 · 14 comments
@stepfunction83

stepfunction83 commented Jan 21, 2025

I think this would be a good place to discuss finetuning the new Flex.1 Alpha model created by Ostris: https://huggingface.co/ostris/Flex.1-alpha

Initial tests I've run training LoRAs with ai-toolkit are extremely promising: LoRAs train much more smoothly than with Flux.1 Dev.

Currently, I believe we can train this in Kohya similarly to how the un-distilled versions of Flux have been trained: by treating them as Flux Schnell to bypass the guidance mechanism. Until this is built in, you can force it by temporarily changing line 62 of library/flux_utils.py from:

```python
is_schnell = not ("guidance_in.in_layer.bias" in keys or "time_text_embed.guidance_embedder.linear_1.bias" in keys)
```

to:

```python
is_schnell = True
```

This lets the finetuning/LoRA process begin a training run. I'm currently doing a test run of this and will post about how it goes. Obviously, it hasn't been out particularly long, but so far I have been able to start a finetuning run and the loss seems to be decreasing.
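For context, the check that this hack short-circuits can be sketched as a tiny helper (a simplified reading of the detection logic above; the function and constant names here are mine, not sd-scripts'):

```python
# Simplified sketch of the Schnell-detection logic in library/flux_utils.py.
# A checkpoint is treated as Schnell (no guidance embedder) when neither
# guidance key appears in its state dict.
GUIDANCE_KEYS = (
    "guidance_in.in_layer.bias",
    "time_text_embed.guidance_embedder.linear_1.bias",
)

def is_schnell_checkpoint(keys) -> bool:
    """Return True if the state-dict keys indicate a Schnell-style model."""
    return not any(k in keys for k in GUIDANCE_KEYS)
```

Flex.1 Alpha still ships a guidance embedder, so this check would return False for it; hard-coding `is_schnell = True` is what forces the Schnell code path anyway.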

Due to the model's smaller size, it can fit entirely on a 24GB card with the fused backward pass and no block swap, resulting in faster training iterations (average of 2.54s/it on my 4090 when using an even mix of 512/768/1024 resolution images). I used exactly the same config that I use for a normal Flux run, only swapping out the model file.

Samples are garbled, as they were with the undistilled versions, so I expect there will need to be some fixes there, but they're not beyond recognition.

@stepfunction83

My initial attempt with a LR of 1e-5 overtrained rapidly. A second attempt with a LR of 2e-6 seems to be more stable so far.

@stepfunction83

Before anyone else tries this: it seems to break the guidance module that Ostris created. Some more work will be needed to explicitly exclude it from training.

@CodeAlexx

I'm just using standard Flux settings with block swapping and the rest, at a 1.8e-5 LR. I'm 24k steps in already and samples are decent. The dataset is all 1024x1024 images, no multi-resolution mixing. My 3090 does 7.89 it's.. I will try your method later when I get to 50k steps.

@AfterHAL

> My 3090 does 7.89 it's..

@CodeAlexx: Are you saying 7.89 seconds per iteration?

@stepfunction83

> My 3090 does 7.89 it's..
>
> Are you saying 7.89 seconds per iteration?

That sounds about right given it's all 1024-resolution images. I'm using a 512/768/1024 blend on a 4090 for the 2.5s/it times.

@CodeAlexx

Yes, near 8 seconds per step. How are your samples with your hack?

@stepfunction83

The samples look decent, but if you try to use the result in Comfy, you'll find that guidance no longer works. The training process is likely updating that part of the network as well, rather than excluding it.
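One straightforward way to exclude that part of the network would be to freeze any parameter under the guidance embedder's name prefix before training. This is a hypothetical sketch, not the actual fix that landed; `freeze_guidance` is my name, and the real prefix in Flex may differ from the `guidance_in` prefix seen in the Flux state-dict keys:

```python
import torch.nn as nn

def freeze_guidance(model: nn.Module, prefix: str = "guidance_in") -> int:
    """Disable gradients for guidance-embedder parameters.

    Sketch only: assumes guidance parameters are identifiable by a
    shared name prefix, as in the Flux state-dict keys quoted earlier.
    Returns the number of parameters frozen.
    """
    frozen = 0
    for name, param in model.named_parameters():
        if name.startswith(prefix):
            param.requires_grad = False
            frozen += 1
    return frozen
```

With this, the optimizer never receives gradients for the guidance module, so its weights survive the run untouched.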

@stepfunction83

kohya-ss/sd-scripts#1891 (comment)

If anyone wants to play with this, I've created a minimal working example here:

https://github.com/stepfunction83/sd-scripts/tree/sd3

This commit brute-forces in the relevant code snippets from ai-toolkit:

kohya-ss/sd-scripts@b203e31

I was able to quickly train a 1000-step finetune of Flex and test it in Comfy to validate that the training takes effect and the guidance module is not destroyed in the process.

Additionally, sampling was corrected and now works as expected.

You can replace the default sd-scripts installation that comes with Kohya with this one and replace the Flux model file with the Flex version.

@CodeAlexx

THANK YOU!! I'm new to git and how to use it. Can I just download the three changed files and replace them?

@stepfunction83

stepfunction83 commented Jan 23, 2025

Make sure to pass the --bypass_flux_guidance parameter with the latest commit, and yes, you can just replace the respective files with the ones from the forked version.
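For anyone following along, a launch command might look like the following. This is illustrative only: the script and the other flags are the standard sd-scripts Flux ones, `--bypass_flux_guidance` is the new parameter from the fork, and the paths and config file are placeholders for your own setup:

```shell
# Illustrative sketch: a standard sd-scripts Flux training run pointed at the
# Flex.1 Alpha checkpoint, with the fork's guidance-bypass flag added.
accelerate launch flux_train_network.py \
  --pretrained_model_name_or_path /path/to/Flex.1-alpha.safetensors \
  --bypass_flux_guidance \
  --config_file my_flux_config.toml
```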

@CodeAlexx

thank you sooooo much!

@stepfunction83

Yep, let me know how your experience goes. I'll submit a PR once I get it in a slightly better state.

@CodeAlexx

I will. I won't use it for LoRA, but for finetuning with no block swaps.

@stepfunction83

stepfunction83 commented Jan 23, 2025 via email
