[IP Adapters] introduce ip_adapter_image_embeds in the SD pipeline call
#6868
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
yiyixuxu
left a comment
thanks so much for adding this so quickly:) it's looking great!
I left a comment for the unload_ip_adapter portion
Thank you @sayakpaul. As I understand it, we need to pass the image embeddings for each IP Adapter, which is cool: being able to mix an image, a list of images, or embeddings for each IP Adapter is exactly what I was looking for. The only other issue left on my list (maybe we can discuss it later, not in this PR) is that diffusers is the only library/app that does the resampling in the forward pass of the UNet instead of when computing the image embeddings. This prevents us from using embeddings from other apps or libraries, and vice versa. It would be good to do it here, but I remember @yiyixuxu telling me that you were thinking of taking the image projection outside of the UNet, so maybe we can discuss it then.
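To make the "one entry per IP Adapter" idea concrete, here is a minimal pure-Python sketch. It is not the diffusers implementation: `prepare_embeds`, `toy_encode`, and the list-of-floats `Embedding` type are hypothetical stand-ins, and the rule that precomputed embeddings take priority over images is an assumption made for illustration.

```python
from typing import Callable, List, Optional, Sequence

Embedding = List[float]  # hypothetical stand-in for a tensor


def toy_encode(image: str) -> Embedding:
    # Hypothetical encoder: maps an "image" (here just a string) to a vector.
    return [float(len(image))]


def prepare_embeds(
    images: Optional[Sequence[str]],
    image_embeds: Optional[Sequence[Embedding]],
    num_adapters: int,
    encode: Callable[[str], Embedding] = toy_encode,
) -> List[Embedding]:
    """Return one embedding per IP Adapter.

    Assumption: precomputed embeddings are used as-is when given;
    otherwise each image is encoded.
    """
    if image_embeds is not None:
        result = [list(e) for e in image_embeds]
    elif images is not None:
        result = [encode(img) for img in images]
    else:
        raise ValueError("Provide either images or image_embeds.")
    if len(result) != num_adapters:
        raise ValueError(
            f"Expected {num_adapters} entries (one per IP Adapter), got {len(result)}."
        )
    return result
```

With a single adapter, `prepare_embeds(["bear"], None, 1)` encodes the image, while `prepare_embeds(None, [[0.5, 0.25]], 1)` skips encoding and passes the saved embedding through.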
I don't understand it. What's resampling in this context? Prefer taking references to the |
Oh sorry, I meant the image projection, which is what is done here in diffusers:
Oh in that case, that warrants a separate PR / discussion.
@asomoza The short answer is:
yiyixuxu
left a comment
looks good to me:) thank you
can we look into that failing test and make sure it is unrelated here?
…call (huggingface#6868)
* add: support for passing ip adapter image embeddings
* debugging
* make feature_extractor unloading conditioned on safety_checker
* better condition
* type annotation
* index to look into value slices
* more debugging
* debugging
* serialize embeddings dict
* better conditioning
* remove unnecessary prints.
* Update src/diffusers/loaders/ip_adapter.py Co-authored-by: YiYi Xu <[email protected]>
* make fix-copies and styling.
* styling and further copy fixing.
* fix: check_inputs call in controlnet sdxl img2img pipeline
---------
Co-authored-by: YiYi Xu <[email protected]>
What does this PR do?
As per the discussion of #6830.
Testing script:
Here's our cute bear:
We could introduce static methods, namely _encode_ip_adapter_image() and _prepare_ip_adapter_image_embeds(), and delegate the current calls of encode_image() and prepare_ip_adapter_image_embeds() to them, respectively. This way, users should not have to call encode_image() and prepare_ip_adapter_image_embeds() explicitly as shown above.
So the flow would be like:
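As a rough illustration of the delegation pattern only, the sketch below uses a hypothetical `ToyPipeline` class. It is not the diffusers pipeline: the method bodies, the string "images", and the `num_images_per_prompt` handling are assumptions chosen so the example runs on its own.

```python
from typing import List, Sequence

Embedding = List[float]  # hypothetical stand-in for a tensor


class ToyPipeline:
    """Hypothetical stand-in for a pipeline; not the diffusers class."""

    @staticmethod
    def _encode_ip_adapter_image(image: str) -> Embedding:
        # Toy "encoder": needs no pipeline state, so it can be static.
        return [float(ord(c)) for c in image]

    @staticmethod
    def _prepare_ip_adapter_image_embeds(
        embeds: Sequence[Embedding], num_images_per_prompt: int
    ) -> List[Embedding]:
        # Repeat each embedding once per requested output image.
        return [list(e) for e in embeds for _ in range(num_images_per_prompt)]

    def encode_image(self, image: str) -> Embedding:
        # The instance method delegates to the static helper.
        return self._encode_ip_adapter_image(image)

    def prepare_ip_adapter_image_embeds(
        self, images: Sequence[str], num_images_per_prompt: int = 1
    ) -> List[Embedding]:
        embeds = [self.encode_image(img) for img in images]
        return self._prepare_ip_adapter_image_embeds(embeds, num_images_per_prompt)
```

Because the helpers are static, embeddings could in principle be computed without building a full pipeline instance, e.g. `ToyPipeline._encode_ip_adapter_image("ab")`, while existing instance-level calls keep working unchanged.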