This work is an OCaml port of the diffusers-rs library, which is written in Rust.
- Download the weights and save them in the `data` directory. For instructions on how to do this, refer to diffusers-rs-docs.
- Once you have the `.ot` files, you are ready to run Stable Diffusion.
- I have found that placing unet and clip on the GPU and vae on the CPU works best for my GPU (see the device-placement sketch after this list). Here is the command that does that:
dune exec stable_diffusion -- generate "lighthouse at dark" "vae" "data/pytorch_model.ot" "data/vae.ot" "data/unet.ot"
- To run all models on the CPU:
dune exec stable_diffusion -- generate "lighthouse at dark" "all" "data/pytorch_model.ot" "data/vae.ot" "data/unet.ot"
- To generate more than one sample, use the `num_samples` parameter:
dune exec stable_diffusion -- generate "lighthouse at dark" "all" "data/pytorch_model.ot" "data/vae.ot" "data/unet.ot" --num_samples=2
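Internally, the device argument above just decides which device each model component lives on. Here is a minimal sketch of that kind of per-component placement, assuming ocaml-torch's `Torch.Device` module; `device_for` and `cpu_components` are illustrative names, not this repo's exact API.

```ocaml
open Torch

(* Map a component name to a device, keeping the components listed in
   [cpu_components] (or everything, when "all" is given) on the CPU. *)
let device_for ~cpu_components component =
  if List.mem "all" cpu_components || List.mem component cpu_components
  then Device.Cpu
  else Device.cuda_if_available ()

let () =
  (* With the "vae" argument: unet and clip land on the GPU (when one is
     available) while vae stays on the CPU. *)
  let cpu_components = [ "vae" ] in
  ignore (device_for ~cpu_components "unet" : Device.t);
  ignore (device_for ~cpu_components "vae" : Device.t)
```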
It takes about 27 seconds to generate an image. Measurements were done on a 6-core, 12-thread Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz. Running all the steps on the CPU takes a little more than three minutes. With vae placed on the CPU and unet and clip on the GPU:
real 0m25.110s
user 0m39.271s
sys 0m11.518s
To run image-to-image generation on an input image:
dune exec img2img -- img2img media/in_img2img.jpg
- Input image
- Output image
With vae placed on the CPU and unet and clip on the GPU:
real 0m15.628s
user 0m34.571s
sys 0m5.833s
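For context, img2img does not start from pure noise: the input image is encoded by the vae and then noised to an intermediate timestep before denoising begins, using the standard DDPM forward process x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. A minimal sketch of that noising step, assuming ocaml-torch's `Torch` module; `noise_to_timestep` and `alpha_bar_t` are illustrative names, not this repo's exact code.

```ocaml
open Torch

(* Noise the encoded image latents [x0] to the timestep whose cumulative
   alpha product is [alpha_bar_t], following the DDPM forward process. *)
let noise_to_timestep ~alpha_bar_t x0 =
  let eps = Tensor.randn_like x0 in
  Tensor.(
    (f (Float.sqrt alpha_bar_t) * x0)
    + (f (Float.sqrt (1. -. alpha_bar_t)) * eps))
```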
To run inpainting with an input image and a mask:
dune exec inpaint -- generate media/sd_input.png media/sd_mask.png --cpu="vae" --prompt="Face of a panda, high resolution, sitting on a park bench"
- Prompt: Face of a panda, high resolution, sitting on a park bench
With vae placed on the CPU and unet and clip on the GPU:
real 0m28.055s
user 0m51.600s
sys 0m13.851s
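For context, inpainting keeps the unmasked part of the image fixed while denoising only the masked region. A common way to do this is to blend, at each step, the freshly denoised latents with the original image's (noised) latents according to the mask. A minimal sketch under that assumption; `blend_latents` is an illustrative name, not this repo's exact API.

```ocaml
open Torch

(* Blend per-step latents: masked regions (mask = 1) take the denoised
   latents, unmasked regions are restored from the original image's latents. *)
let blend_latents ~mask ~denoised ~original =
  Tensor.((mask * denoised) + ((f 1. - mask) * original))
```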
Only FP32 weights are supported.