
Text-to-image with StableDiffusion and gpu acceleration in node.js #121

Closed
wanted to merge 5 commits

Conversation

dakenf commented May 23, 2023

I've used my patched onnxruntime-node to support GPU; however, I can leave it with the standard one and add instructions on how to change it. I will soon open a pull request in the official runtime (I need some time to work out their build system, as it's a very big monorepo).

Added support for SD2.1 and 1.X, updated CLIP tokenizer to support the model

Added various tensor functions

In Node it now returns the filename for model files, as otherwise big model files would be loaded into CPU memory and sent to onnxruntime. Basically, there would be two copies of each model in memory. Now the runtime loads them by filename internally.

Let me know if any refactoring or improvements are needed.

xenova (Collaborator) commented May 23, 2023

WOW! 🔥 This is HUGE!

I've used my patched onnxruntime-node to support GPU; however, I can leave it with the standard one and add instructions on how to change it. I will soon open a pull request in the official runtime (I need some time to work out their build system, as it's a very big monorepo).

It would be ideal if onnxruntime would support it officially. It seems like a similar problem to their WebGPU backend: I've been testing and playing around with it for the past month or so, but since it's not officially released yet, I'm hesitant to make a full release with it. I think the best approach (as you mentioned) will be to give users instructions showing how to "activate" this functionality, and then, once it's officially supported, include it directly.

Added support for SD2.1 and 1.X, updated CLIP tokenizer to support the model
Added various tensor functions

Perfect!

In Node it now returns the filename for model files, as otherwise big model files would be loaded into CPU memory and sent to onnxruntime. Basically, there would be two copies of each model in memory. Now the runtime loads them by filename internally.

tbh I didn't know you could load by file name. Of course, I originally only designed it for the web, so I must have missed that there was a different API. In fact, that's probably something you can pull out of this PR and submit as a new one (if it affects Node users now)?

kungfooman (Contributor) commented May 23, 2023

Basically, there would be two copies of each model in memory. Now the runtime loads them by filename internally.

So your ORT PR will load an ONNX model converted from ckpt/safetensors?

EDIT: found it already: https://huggingface.co/aislamov/stable-diffusion-2-1-base-onnx

It would be nice to have a few SD models already converted on Hugging Face, so most people don't need to convert them themselves.

Astonishing work, thank you!

dakenf (Author) commented May 23, 2023

It would be ideal if onnxruntime would support it officially.

Yeah, I'm going to open a pull request there in a few days to make it work out of the box. I also want to add an API to get input types, because SD 1.x takes an int as the timestep input and 2.x takes a float. There is no way to check this from JS, and right now users must explicitly specify which SD version they are loading.
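The version-dependent dtype choice can be sketched in plain JS. This is a hypothetical helper (not the actual transformers.js or onnxruntime API); the returned spec would then be used to build the timestep tensor, e.g. `new Tensor(spec.type, spec.data, [1])`:

```javascript
// Pick the dtype for the UNet's timestep input based on the SD version:
// 1.x models expect an int64 scalar, 2.x models expect a float32 scalar.
// Helper name and return shape are hypothetical.
function timestepInput(sdVersion, t) {
  if (sdVersion.startsWith('1.')) {
    // int64 tensors are backed by BigInt64Array in JS.
    return { type: 'int64', data: BigInt64Array.from([BigInt(Math.round(t))]) };
  }
  return { type: 'float32', data: Float32Array.from([t]) };
}

console.log(timestepInput('1.5', 981).type); // 'int64'
console.log(timestepInput('2.1', 981).type); // 'float32'
```

An API that exposed the model's declared input types would remove the need for this user-supplied version hint entirely.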

tbh I didn't know you could load by file name. Of course, I originally only designed it for the web, so I must have missed that there was a different API. In fact, that's probably something you can pull out of this PR and submit as a new one (if it affects Node users now)?

OK, there are just a few changes needed for it.

So your ORT PR will load an ONNX model converted from ckpt/safetensors?
EDIT: found it already: https://huggingface.co/aislamov/stable-diffusion-2-1-base-onnx
It would be nice to have a few SD models already converted on Hugging Face, so most people don't need to convert them themselves.

There's also an official CompVis/stable-diffusion-v1-4 with an ONNX revision. I think I should make a separate doc page with examples and available models, and mention this tool for SD-to-ONNX conversion: https://github.com/Amblyopius/Stable-Diffusion-ONNX-FP16

For 1.x I'll need to make a safety checker implementation (2.x does not require it).

I guess I can also convert some popular ones and upload them to the HF Hub; I just need to find out which ones.

xenova (Collaborator) commented May 23, 2023

I guess I can also convert some popular ones and upload them to the HF Hub; I just need to find out which ones.

It would probably be best if it works with exports from Hugging Face's optimum library. All other models are assumed to be exported from optimum.

On another note, it might be good to split the diffusion models out into a separate diffusers.js library 👀 (in the same way that HF's diffusers library is separate from transformers). There will be dependencies between the two (from diffusers to transformers) to handle processors and tokenizers.

dakenf (Author) commented May 23, 2023

On another note, it might be good to split the diffusion models out into a separate diffusers.js library 👀 (in the same way that HF's diffusers library is separate from transformers). There will be dependencies between the two (from diffusers to transformers) to handle processors and tokenizers.

Sounds good. I'll also check out optimum this week.

dakenf (Author) commented May 23, 2023

I've created a pull request in onnxruntime; let's see how it goes: microsoft/onnxruntime#16050
In the meantime, I'll update my patched runtime to register only the "cuda" and "dml" backends, so it can be used with a simple import of "onnxruntime-node-gpu" and by passing the desired executionProviders when creating a session.
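The provider-selection side of this can be sketched as a small fallback helper. This is a hypothetical function (the 'cuda'/'dml'/'cpu' names follow the registration described above; nothing here is the actual onnxruntime-node-gpu API):

```javascript
// Given the providers the patched runtime registered, return a preference
// list for session creation, falling back to CPU if no GPU backend exists.
// Helper name and signature are hypothetical.
function pickExecutionProviders(available, preferred = ['cuda', 'dml', 'cpu']) {
  const chosen = preferred.filter((ep) => available.includes(ep));
  return chosen.length > 0 ? chosen : ['cpu'];
}

// The result would then be passed as session options, e.g.:
// const session = await InferenceSession.create(modelPath, {
//   executionProviders: pickExecutionProviders(['dml', 'cpu']),
// });
console.log(pickExecutionProviders(['dml', 'cpu'])); // [ 'dml', 'cpu' ]
console.log(pickExecutionProviders([]));             // [ 'cpu' ]
```

Keeping CPU last in the preference list means sessions still work on machines without CUDA or DirectML.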

I will create a diffusers.js repo to host the SD code, and open some other PRs to add the required math functions to the Tensor class.

So I guess this PR can be closed for now.

dakenf (Author) commented Jul 21, 2023

Closing this one, as the changes are unnecessary, and I'm going to release diffusers.js after implementing the 5 outstanding WebGPU kernels required for StableDiffusion. Hopefully after that it will be fast enough for consumer use.

@dakenf dakenf closed this Jul 21, 2023
@kungfooman kungfooman mentioned this pull request Aug 29, 2024