Skip to content

hyscaler/Stable-Diffusion-XL

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 

Repository files navigation

How to Set Up and Use Stable Diffusion XL Locally

Introduction

Stable Diffusion XL is a cutting-edge AI model designed for generating high-resolution images from textual descriptions. This model, developed by Stability AI, leverages the power of deep learning to transform text prompts into vivid and detailed images, offering new horizons in the field of digital art and design. Refer to the Stable Diffusion XL for more details.

Requirements

To get started, ensure you have:

  • Python version 3.10.6
  • A suitable development environment

Step-by-Step Guide

1. Setup Python and Pip

Python is the backbone of this project, and Pip is the package manager used to install necessary libraries. First, ensure you have Python installed. Then, install or upgrade Pip using:

python3 -m pip install --user --upgrade pip

This command ensures you have the latest version of Pip installed.

2. Creating a Virtual Environment

A virtual environment in Python is a self-contained directory that contains a Python installation for a particular project. This approach avoids conflicts between project dependencies. Create and activate a virtual environment with:

python3 -m venv .venv
source .venv/bin/activate

3. Creating a Hugging Face Account

To access pre-trained models, including Stable Diffusion XL, sign up for a Hugging Face account. This platform hosts various machine learning models and offers community and professional support.

4. Installing Required Packages

Install Diffusers and Gradio using Pip. Diffusers is a Hugging Face library offering pre-trained diffusion models for various tasks. Gradio is an open-source library for creating user-friendly web interfaces for machine learning models.

pip install diffusers gradio torch transformers accelerate

5. Initial Code Setup

Write the following code in a file named main.py. This script sets up a basic Gradio interface to interact with the Stable Diffusion XL model. In this we are using Stable Diffusion XL Base model stabilityai/stable-diffusion-xl-base-1.0 which is provided by Stability AI. Source code is available at https://github.com/Stability-AI/generative-models . And to using the model we are using StableDiffusionXLPipeline from diffusers for text-to-image generation.

import gradio as gr
from diffusers import StableDiffusionXLPipeline
import torch

# Load the model with specified parameters
base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True, 
)
base.to("cuda")

def query(payload):
    # Generate image based on the text prompt
    image = base(prompt=payload).images[0]
    return image

# Setup Gradio interface
demo = gr.Interface(fn=query, inputs="text", outputs="image")

if __name__ == "__main__":
    demo.launch(show_api=False)

To run this script, use:

python3 main.py

This will provide you with a URL. When you open this URL, it will take you to the Gradio interface where you can input your prompts.

[Will Insert Screenshot of Gradio Interface for Basic Setup Here]

Base Model Generation Base Model Generation

6. Generating Multiple Images

For generating multiple images, modify the script as follows.

import gradio as gr
from diffusers import StableDiffusionXLPipeline
import torch

# Load the model
base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True, 
)
base.to("cuda")

def query(payload):
    # Generate multiple images based on the text prompt
    image = base(prompt=payload, num_images_per_prompt=2).images
    return image

# Setup Gradio interface for multiple images
demo = gr.Interface(fn=query, inputs="text", outputs="gallery")

if __name__ == "__main__":
    demo.launch(show_api=False)

[Will Insert Screenshot of Gradio Interface for Multiple Images Here]

Multiple Image Generation 2 Multiple Image Generation

7. Using Base and Refiner Models

For refined images, modify your script to include the base stabilityai/stable-diffusion-xl-base-1.0 and refiner stabilityai/stable-diffusion-xl-refiner-1.0 models. As we now using 2 models at a time so we need to use DiffusionPipeline. which support the bundling of multiple independently-trained models, schedulers, and processors into a single end-to-end class.

import gradio as gr
from diffusers import DiffusionPipeline
import torch

# Load base and refiner models with specified parameters
base = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
)
base.to("cuda")
refiner = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    text_encoder_2=base.text_encoder_2,
    vae=base.vae,
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
)
refiner.to("cuda")

# Define processing steps
n_steps = 40
high_noise_frac = 0.8

def query(payload):
    # Generate and refine image based on the text prompt
    image = base(prompt=payload, num_inference_steps=n_steps, denoising_end=high_noise_frac, output_type="latent").images
    image = refiner(prompt=payload, num_inference_steps=n_steps, denoising_start=high_noise_frac, image=image).images[0]
    return image

# Setup Gradio interface for refined images
demo = gr.Interface(fn=query, inputs="text", outputs="image")

if __name__ == "__main__":
    demo.launch(show_api=False)

Screenshots

Base and Refiner model Image Base and Refiner model Image 2

Conclusion

By following these steps, you can set up and use Stable Diffusion XL on your local machine. Whether you're a hobbyist or a professional, this tool offers an exciting way to explore AI-powered image generation.

Refer to the Hugging Face documentation for more advanced usage.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages