[SD-XL] Flax implementation #4136
Awesome @mar-muel, many thanks! We'll take a look soon!
for tokenizer in enumerate(tokenizers):
    text_inputs = self.tokenizer(
Suggested change:
- for tokenizer in enumerate(tokenizers):
-     text_inputs = self.tokenizer(
+ for tokenizer in tokenizers:
+     text_inputs = tokenizer(
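For anyone following along, here is a minimal sketch of why the suggestion matters. `enumerate` yields `(index, item)` tuples, so the loop variable in the original line is a tuple rather than a tokenizer, and the loop body ignores it anyway by calling `self.tokenizer`. The two tokenizers below are hypothetical stand-in functions, not the real tokenizer objects used by the pipeline.

```python
# Hypothetical stand-ins for the two SD-XL tokenizers (assumption: plain
# functions are enough to illustrate the loop; these are not real tokenizers).
def tokenizer_a(prompt):
    return [f"a:{word}" for word in prompt.split()]


def tokenizer_b(prompt):
    return [f"b:{word}" for word in prompt.split()]


tokenizers = [tokenizer_a, tokenizer_b]

# With enumerate, each loop item is an (index, tokenizer) tuple, not a callable.
items = list(enumerate(tokenizers))
first_item_is_tuple = isinstance(items[0], tuple)


def tokenize_with_all(prompt, tokenizers):
    """Call every tokenizer in turn, as in the suggested change."""
    outputs = []
    for tokenizer in tokenizers:
        outputs.append(tokenizer(prompt))
    return outputs


outputs = tokenize_with_all("a stack of rocks", tokenizers)
```

Iterating over `tokenizers` directly gives one output per tokenizer, which is presumably what the loop was meant to do.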
Fixed prompt embedding shapes so they work in parallel mode. Assuming we always have both text encoders for now, for simplicity.
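A rough sketch of the shapes involved, not the actual change in this commit. It assumes the hidden states of the two text encoders (hidden sizes 768 and 1280) are concatenated along the feature axis and that the batch is then split across devices with a reshape similar in spirit to `flax.training.common_utils.shard`. The device count, batch size, and the local `shard` helper are illustrative assumptions, and NumPy zeros stand in for real encoder outputs.

```python
import numpy as np

num_devices = 2                 # assumed device count, for illustration only
batch_size = 4                  # global batch, must be divisible by num_devices
seq_len = 77                    # CLIP context length
hidden_1, hidden_2 = 768, 1280  # hidden sizes of the two SD-XL text encoders

# Placeholder hidden states standing in for the two text encoder outputs.
embeds_1 = np.zeros((batch_size, seq_len, hidden_1), dtype=np.float32)
embeds_2 = np.zeros((batch_size, seq_len, hidden_2), dtype=np.float32)

# Concatenate along the feature axis: (batch, seq_len, 768 + 1280).
prompt_embeds = np.concatenate([embeds_1, embeds_2], axis=-1)


def shard(x, num_devices):
    """Hypothetical helper: split the batch axis into (devices, per_device, ...)."""
    return x.reshape((num_devices, -1) + x.shape[1:])


sharded_embeds = shard(prompt_embeds, num_devices)
```

Under these assumptions each device sees a `(2, 77, 2048)` slice, which is the kind of per-device shape a `pmap`-ed call expects.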
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
What does this PR do?
Adds Flax implementation for SD-XL pipeline 🚀
Related issue: #4007
This is still WIP unfortunately 😅
- […] `from_pt=True`; however, this might currently not work out of the box for `text_encoder_2` (as transformers currently doesn't support `CLIPTextModelWithProjection` for Flax). I've pushed my version of the converted Flax weights to this repo for now: https://huggingface.co/nyxai-lab/sdxl-0.9-flax/tree/main
- It seems the text encoders currently yield correct results for hidden states, but I'm still getting different results for the pooled layers. EDIT: Upon checking again, the outputs seem close enough.

Hope this can serve as a starting point :)
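For the hidden-state and pooled-output comparison mentioned above, a check along these lines is one way to judge "close enough". This is only a sketch: the `report_difference` helper, the tolerance of `1e-3`, and the arrays are all illustrative assumptions, and the arrays are placeholders rather than real model outputs.

```python
import numpy as np


def report_difference(reference, candidate, atol=1e-3):
    """Return (max absolute difference, whether arrays match within atol)."""
    reference = np.asarray(reference, dtype=np.float32)
    candidate = np.asarray(candidate, dtype=np.float32)
    max_abs_diff = float(np.max(np.abs(reference - candidate)))
    return max_abs_diff, bool(np.allclose(reference, candidate, atol=atol))


# Placeholder values standing in for PyTorch and Flax pooled outputs.
pt_pooled = np.array([[0.10, -0.20, 0.30]])
flax_pooled = pt_pooled + 5e-4

max_diff, close_enough = report_difference(pt_pooled, flax_pooled)
```

In practice the two inputs would be the same layer's output from the PyTorch and Flax text encoders for an identical prompt, and the tolerance would depend on the dtype in use.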
EDIT: Adding some outputs here (based on this PR):
a beautiful stack of rocks sitting on top of a beach, a picture, red black white golden colors, chakras, packshot, stock photo
photo of a rhino dressed suit and tie sitting at a table in a bar with a bar stools, award winning photography, Elke vogelsang