
Text-to-image with StableDiffusion and gpu acceleration in node.js #121

Closed
wanted to merge 5 commits

Conversation

dakenf commented May 23, 2023

I've used my patched onnxruntime-node to support GPU; however, I can leave it with the standard one and add instructions on how to change it. I will soon open a pull request in the official runtime (I need some time to work out their build system, as it's a very big monorepo).

Added support for SD2.1 and 1.X, updated CLIP tokenizer to support the model

Added various tensor functions

In Node it now returns the filename for model files, as otherwise big model files would be loaded into CPU memory and sent to onnxruntime. Basically, there would be two copies of each model in memory. Now the runtime loads them by filename internally.

Let me know if any refactoring or improvements are needed.

xenova (Collaborator) commented May 23, 2023

WOW! 🔥 This is HUGE!

I've used my patched onnxruntime-node to support GPU; however, I can leave it with the standard one and add instructions on how to change it. I will soon open a pull request in the official runtime (I need some time to work out their build system, as it's a very big monorepo).

It would be ideal if onnxruntime would support it officially. It seems like a similar problem to their WebGPU backend: I've been testing and playing around with it for the past month or so, but since it's not officially released yet, I'm hesitant to make a full release with it. I think the best approach (as you mentioned) will be to give users instructions showing how to "activate" this functionality, and then, once it's officially supported, include it directly.

Added support for SD2.1 and 1.X, updated CLIP tokenizer to support the model
Added various tensor functions

Perfect!

In Node it now returns the filename for model files, as otherwise big model files would be loaded into CPU memory and sent to onnxruntime. Basically, there would be two copies of each model in memory. Now the runtime loads them by filename internally.

tbh I didn't know you could load by file name. Of course, I originally only designed it for the web, so I must have missed that there was a different API. In fact, that's probably something you can pull out of this PR and submit as a new one (if it affects Node users now)?

kungfooman (Contributor) commented May 23, 2023

Basically, there would be two copies of each model in memory. Now the runtime loads them by filename internally.

So your ORT PR will load an ONNX model converted from ckpt/safetensors?

EDIT: found it already: https://huggingface.co/aislamov/stable-diffusion-2-1-base-onnx

It would be nice to have a few SD models already converted on Hugging Face, so most people don't need to convert them themselves.

Astonishing work, thank you!

dakenf (Author) commented May 23, 2023

It would be ideal if onnxruntime would support it officially.

Yeah, I'm going to open a pull request there in a few days to make it work out of the box. I also want to add an API to get input types, because SD 1.x takes an int as the timestep input and 2.x takes a float. There is no way to check this from JS, and right now users must explicitly specify which SD version they are loading.
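The version-dependent dtype choice can be sketched in plain JS. This is a hypothetical helper (not the actual transformers.js or onnxruntime API); the returned spec would then be used to build the timestep tensor, e.g. `new Tensor(spec.type, spec.data, [1])`:

```javascript
// Pick the dtype for the UNet's timestep input based on the SD version:
// 1.x models expect an int64 scalar, 2.x models expect a float32 scalar.
// Helper name and return shape are hypothetical.
function timestepInput(sdVersion, t) {
  if (sdVersion.startsWith('1.')) {
    // int64 tensors are backed by BigInt64Array in JS.
    return { type: 'int64', data: BigInt64Array.from([BigInt(Math.round(t))]) };
  }
  return { type: 'float32', data: Float32Array.from([t]) };
}

console.log(timestepInput('1.5', 981).type); // 'int64'
console.log(timestepInput('2.1', 981).type); // 'float32'
```

An API that exposed the model's declared input types would remove the need for this user-supplied version hint entirely.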

tbh I didn't know you could load by file name. Of course, I originally only designed it for the web, so I must have missed that there was a different API. In fact, that's probably something you can pull out of this PR and submit as a new one (if it affects Node users now)?

OK, there are just a few changes needed for it.

So your ORT PR will load an ONNX model converted from ckpt/safetensors?
EDIT: found it already: https://huggingface.co/aislamov/stable-diffusion-2-1-base-onnx
It would be nice to have a few SD models already converted on Hugging Face, so most people don't need to convert them themselves.

There's also an official CompVis/stable-diffusion-v1-4 with an ONNX revision. I think I should make a separate doc page with examples and available models, and mention this tool for SD-to-ONNX conversion: https://github.com/Amblyopius/Stable-Diffusion-ONNX-FP16

For 1.x I'll need to make a safety checker implementation (2.x does not require it).

I guess I can also convert some popular ones and upload them to the HF Hub; I just need to find out which ones.

xenova (Collaborator) commented May 23, 2023

I guess I can also convert some popular ones and upload them to the HF Hub; I just need to find out which ones.

It would probably be best if it works with exports from Hugging Face's optimum library. All other models are assumed to be exported from optimum.

On another note, it might be good to split the diffusion models out into a separate diffusers.js library 👀 (in the same way that HF's diffusers library is separate from transformers). There will be dependencies between the two (from diffusers to transformers) to handle processors and tokenizers.

dakenf (Author) commented May 23, 2023

On another note, it might be good to split the diffusion models out into a separate diffusers.js library 👀 (in the same way that HF's diffusers library is separate from transformers). There will be dependencies between the two (from diffusers to transformers) to handle processors and tokenizers.

Sounds good. I'll also check out optimum this week.

dakenf (Author) commented May 23, 2023

I've created a pull request in onnxruntime; let's see how it goes: microsoft/onnxruntime#16050
In the meantime, I'll update my patched runtime to register only the "cuda" and "dml" backends, so it can be used with a simple import of "onnxruntime-node-gpu" and by passing the desired executionProviders when creating a session.
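The provider-selection side of this can be sketched as a small fallback helper. This is a hypothetical function (the 'cuda'/'dml'/'cpu' names follow the registration described above; nothing here is the actual onnxruntime-node-gpu API):

```javascript
// Given the providers the patched runtime registered, return a preference
// list for session creation, falling back to CPU if no GPU backend exists.
// Helper name and signature are hypothetical.
function pickExecutionProviders(available, preferred = ['cuda', 'dml', 'cpu']) {
  const chosen = preferred.filter((ep) => available.includes(ep));
  return chosen.length > 0 ? chosen : ['cpu'];
}

// The result would then be passed as session options, e.g.:
// const session = await InferenceSession.create(modelPath, {
//   executionProviders: pickExecutionProviders(['dml', 'cpu']),
// });
console.log(pickExecutionProviders(['dml', 'cpu'])); // [ 'dml', 'cpu' ]
console.log(pickExecutionProviders([]));             // [ 'cpu' ]
```

Keeping CPU last in the preference list means sessions still work on machines without CUDA or DirectML.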

I will create a diffusers.js repo to host the SD code, and open some other PRs to add the required math functions to the Tensor class.

So I guess this PR can be closed for now.

dakenf (Author) commented Jul 21, 2023

Closing this one, as the changes are unnecessary, and I'm going to release diffusers.js after implementing the 5 outstanding WebGPU kernels required for StableDiffusion. Hopefully after that it will be fast enough for consumer use.

@dakenf dakenf closed this Jul 21, 2023
@kungfooman kungfooman mentioned this pull request Aug 29, 2024