Closed

Flux.1 #1331

33 commits
9991f09
upgrade diffusers
Sep 14, 2024
9bbcc1b
replace schduler
Sep 18, 2024
cb2aaf0
update wkld entrypoint
Sep 18, 2024
4fcf181
rem demo wkld entrypoint
Sep 18, 2024
8e0a02f
add warp in hpu graph
Sep 23, 2024
154101e
upgrade diffusers
Sep 14, 2024
8759264
replace schduler
Sep 18, 2024
16848b3
update wkld entrypoint
Sep 18, 2024
073b6d0
rem demo wkld entrypoint
Sep 18, 2024
60c5de3
add warp in hpu graph
Sep 23, 2024
e187930
Add fp8 to flux and fix timing
dsocek Sep 26, 2024
66098ff
Enable batching for flux inference
dsocek Sep 26, 2024
0615ce1
update diffusers to adopt rope changes
Sep 27, 2024
97a6dd5
fix import error
Sep 27, 2024
f3f469c
fix readme conflict
Sep 27, 2024
9aefc5f
fix time clac drift
Sep 27, 2024
1bc593a
fix import error in lazy mode
Sep 27, 2024
3a53c95
Add hybrid fp8 and bf16 denoising to flux
dsocek Sep 27, 2024
0b80a6a
use default scheduler from upstream diffusers
Sep 29, 2024
f267958
fix import error
Sep 29, 2024
9bcc65f
Fix timing issue with batching
dsocek Oct 1, 2024
971ca4d
fix conflicts
Oct 8, 2024
1144815
Add FusedSDPA
splotnikv Oct 9, 2024
8145d20
Merge branch 'dsocek/flux' into kim/flux
Oct 10, 2024
479dc96
use latest attn rope
Oct 10, 2024
18c5960
fix scheduler
Oct 10, 2024
eee48c7
add OenFLUX.1
Oct 11, 2024
c299caa
fix errors
Oct 14, 2024
c0d391e
rem quant files
Oct 14, 2024
b5aee78
rem tmp tests files
Oct 14, 2024
44a48c7
rem text_ids image_ids from split into batches
Oct 14, 2024
6bd7351
fix guidance nan boolean tensor ambiguous err
huijuanzh Oct 22, 2024
40d12e7
upgrade diffusers to 0.31.0 relese version inrequ
Oct 24, 2024
55 changes: 49 additions & 6 deletions examples/stable-diffusion/README.md
@@ -28,12 +28,12 @@ First, you should install the requirements:
pip install -r requirements.txt
```


## Text-to-image Generation

### Single Prompt

Here is how to generate images with one prompt:

```bash
python text_to_image_generation.py \
--model_name_or_path CompVis/stable-diffusion-v1-4 \
@@ -51,10 +51,10 @@ python text_to_image_generation.py \
> The first batch of images entails a performance penalty. All subsequent batches will be generated much faster.
> You can enable this mode with `--use_hpu_graphs`.


### Multiple Prompts

Here is how to generate images with several prompts:

```bash
python text_to_image_generation.py \
--model_name_or_path CompVis/stable-diffusion-v1-4 \
@@ -69,7 +69,9 @@ python text_to_image_generation.py \
```

### Distributed inference with multiple HPUs

Here is how to generate images with two prompts on two HPUs:

```bash
python ../gaudi_spawn.py \
--world_size 2 text_to_image_generation.py \
@@ -109,10 +111,10 @@ python text_to_image_generation.py \
```

> There are two different checkpoints for Stable Diffusion 2:
>
> - use [stabilityai/stable-diffusion-2-1](https://huggingface.co/stabilityai/stable-diffusion-2-1) for generating 768x768 images
> - use [stabilityai/stable-diffusion-2-1-base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base) for generating 512x512 images


### Latent Diffusion Model for 3D (LDM3D)

[LDM3D](https://arxiv.org/abs/2305.10853) generates both image and depth map data from a given text prompt, allowing users to generate RGBD images from text prompts.
@@ -135,7 +137,9 @@ python text_to_image_generation.py \
--ldm3d \
--bf16
```

Here is how to generate images and depth maps with two prompts on two HPUs:

```bash
python ../gaudi_spawn.py \
--world_size 2 text_to_image_generation.py \
@@ -154,6 +158,7 @@ python ../gaudi_spawn.py \
```

> There are three different checkpoints for LDM3D:
>
> - use [original checkpoint](https://huggingface.co/Intel/ldm3d) to generate outputs from the paper
> - use [the latest checkpoint](https://huggingface.co/Intel/ldm3d-4c) for generating improved results
> - use [the pano checkpoint](https://huggingface.co/Intel/ldm3d-pano) to generate panoramic view
@@ -163,6 +168,7 @@ python ../gaudi_spawn.py \
Stable Diffusion XL was proposed in [SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis](https://arxiv.org/pdf/2307.01952.pdf) by the Stability AI team.

Here is how to generate SDXL images with a single prompt:

```bash
python text_to_image_generation.py \
--model_name_or_path stabilityai/stable-diffusion-xl-base-1.0 \
@@ -182,6 +188,7 @@ python text_to_image_generation.py \
> You can enable this mode with `--use_hpu_graphs`.

Here is how to generate SDXL images with several prompts:

```bash
python text_to_image_generation.py \
--model_name_or_path stabilityai/stable-diffusion-xl-base-1.0 \
@@ -199,6 +206,7 @@ python text_to_image_generation.py \
SDXL combines a second text encoder (OpenCLIP ViT-bigG/14) with the original text encoder to significantly
increase the number of parameters. Here is how to generate images with several prompts for both `prompt`
and `prompt_2` (2nd text encoder), as well as their negative prompts:

```bash
python text_to_image_generation.py \
--model_name_or_path stabilityai/stable-diffusion-xl-base-1.0 \
@@ -217,6 +225,7 @@ python text_to_image_generation.py \
```

Here is how to generate SDXL images with two prompts on two HPUs:

```bash
python ../gaudi_spawn.py \
--world_size 2 text_to_image_generation.py \
@@ -235,14 +244,17 @@ python ../gaudi_spawn.py \
--bf16 \
--distributed
```

> HPU graphs are recommended when generating images by batches to get the fastest possible generations.
> The first batch of images entails a performance penalty. All subsequent batches will be generated much faster.
> You can enable this mode with `--use_hpu_graphs`.

### SDXL-Turbo

SDXL-Turbo is a distilled version of SDXL 1.0, trained for real-time synthesis.

Here is how to generate images with multiple prompts:

```bash
python text_to_image_generation.py \
--model_name_or_path stabilityai/sdxl-turbo \
@@ -275,11 +287,13 @@ Before running SD3 pipeline, you need to:

1. Agree to the Terms and Conditions for using SD3 model at [HuggingFace model page](https://huggingface.co/stabilityai/stable-diffusion-3-medium)
2. Authenticate with HuggingFace using your HF Token. For authentication, run:

```bash
huggingface-cli login
```

Here is how to generate SD3 images with a single prompt:

```bash
PT_HPU_MAX_COMPOUND_OP_SIZE=1 \
python text_to_image_generation.py \
@@ -299,12 +313,32 @@ python text_to_image_generation.py \
> For improved performance of the SD3 pipeline on Gaudi, it is recommended to configure the environment
> by setting PT_HPU_MAX_COMPOUND_OP_SIZE to 1.

### FLUX.1

FLUX.1 was introduced by Black Forest Labs [here](https://blackforestlabs.ai/announcing-black-forest-labs/).

```bash
python text_to_image_generation.py \
--model_name_or_path black-forest-labs/FLUX.1-schnell \
--prompts "A cat holding a sign that says hello world" \
--num_images_per_prompt 10 \
--batch_size 1 \
--num_inference_steps 28 \
--image_save_dir /tmp/flux_1_images \
--scheduler flow_match_euler_discrete \
--use_habana \
--use_hpu_graphs \
--gaudi_config Habana/stable-diffusion \
--bf16
```
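
As a rough guide to what the flags above imply, the number of generation batches can be estimated from the prompt count, `--num_images_per_prompt`, and `--batch_size`. The sketch below assumes images are grouped into batches of `batch_size`, with a final partial batch rounded up. `count_batches` is a hypothetical helper for illustration, not part of the example script.

```python
import math


def count_batches(num_prompts: int, num_images_per_prompt: int, batch_size: int) -> int:
    """Estimate how many batches a run produces (assumed ceil-division grouping)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    total_images = num_prompts * num_images_per_prompt
    return math.ceil(total_images / batch_size)


# The FLUX.1 command above: 1 prompt, 10 images per prompt, batch size 1.
print(count_batches(1, 10, 1))  # 10
```

Under that assumption, the command above runs ten batches, and per the HPU graphs notes elsewhere in this README only the first batch should carry the warm-up penalty.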

## ControlNet

ControlNet was introduced in [Adding Conditional Control to Text-to-Image Diffusion Models ](https://huggingface.co/papers/2302.05543) by Lvmin Zhang and Maneesh Agrawala.
ControlNet was introduced in [Adding Conditional Control to Text-to-Image Diffusion Models](https://huggingface.co/papers/2302.05543) by Lvmin Zhang and Maneesh Agrawala.
It is a type of model for controlling StableDiffusion by conditioning the model with an additional input image.

Here is how to generate images conditioned by canny edge model:

```bash
python text_to_image_generation.py \
--model_name_or_path CompVis/stable-diffusion-v1-4 \
@@ -321,6 +355,7 @@ python text_to_image_generation.py \
```

Here is how to generate images conditioned by canny edge model and with multiple prompts:

```bash
python text_to_image_generation.py \
--model_name_or_path CompVis/stable-diffusion-v1-4 \
@@ -337,6 +372,7 @@ python text_to_image_generation.py \
```

Here is how to generate images conditioned by canny edge model and with two prompts on two HPUs:

```bash
python ../gaudi_spawn.py \
--world_size 2 text_to_image_generation.py \
@@ -355,6 +391,7 @@ python ../gaudi_spawn.py \
```

Here is how to generate images conditioned by open pose model:

```bash
python text_to_image_generation.py \
--model_name_or_path CompVis/stable-diffusion-v1-4 \
@@ -372,6 +409,7 @@ python text_to_image_generation.py \
```

Here is how to generate images conditioned by canny edge model using Stable Diffusion 2:

```bash
python text_to_image_generation.py \
--model_name_or_path stabilityai/stable-diffusion-2-1 \
@@ -395,6 +433,7 @@ Inpainting replaces or edits specific areas of an image. For more details,
please refer to [Hugging Face Diffusers doc](https://huggingface.co/docs/diffusers/en/using-diffusers/inpaint).

### Stable Diffusion Inpainting

```bash
python text_to_image_generation.py \
--model_name_or_path stabilityai/stable-diffusion-2-inpainting \
@@ -412,6 +451,7 @@ python text_to_image_generation.py \
```

### Stable Diffusion XL Inpainting

```bash
python text_to_image_generation.py \
--model_name_or_path diffusers/stable-diffusion-xl-1.0-inpainting-0.1 \
@@ -457,10 +497,10 @@ python image_to_image_generation.py \
> The first batch of images entails a performance penalty. All subsequent batches will be generated much faster.
> You can enable this mode with `--use_hpu_graphs`.


### Multiple Prompts

Here is how to generate images with several prompts and one image.

```bash
python image_to_image_generation.py \
--model_name_or_path "timbrooks/instruct-pix2pix" \
@@ -482,10 +522,10 @@ python image_to_image_generation.py \
> The first batch of images entails a performance penalty. All subsequent batches will be generated much faster.
> You can enable this mode with `--use_hpu_graphs`.


### Stable Diffusion XL Refiner

Here is how to generate SDXL images with a single prompt and one image:

```bash
python image_to_image_generation.py \
--model_name_or_path "stabilityai/stable-diffusion-xl-refiner-1.0" \
@@ -505,6 +545,7 @@ python image_to_image_generation.py \
### Stable Diffusion Image Variations

Here is how to generate images from a single input image. This pipeline does not accept prompt input:

```bash
python image_to_image_generation.py \
--model_name_or_path "lambdalabs/sd-image-variations-diffusers" \
@@ -625,6 +666,7 @@ Script `image_to_video_generation.py` showcases how to perform image-to-video ge
### Single Image Prompt

Here is how to generate video with one image prompt:

```bash
PT_HPU_MAX_COMPOUND_OP_SIZE=1 \
python image_to_video_generation.py \
@@ -645,6 +687,7 @@ python image_to_video_generation.py \
### Multiple Image Prompts

Here is how to generate videos with several image prompts:

```bash
PT_HPU_MAX_COMPOUND_OP_SIZE=1 \
python image_to_video_generation.py \
100 changes: 100 additions & 0 deletions examples/stable-diffusion/prompts_100.txt
@@ -0,0 +1,100 @@
A women playing tennis on a blue tennis court.
Two surfers in wetsuits carrying surfboards along the beach.
People are flying their kites in a large field.
A statue that is in front of a building.
A man attempting to do a skateboard trick on an outdoor halfpipe.
A cream bathroom with red accents and open window.
A woman in a white dress and a man in gray stand near a cake on a white table under a white canopy.
A tour bus downtown with yoga ads all over it.
Three people riding horses on a beach next to the ocean..
A bear lying in its den on a pile of wood.
A herd of cattle is feeding at the river's edge.
there is a male snowboarder that is in the air
A statue of an elephant with tattoos and a target drawing
a woman is standing over a white cake
A grey motorcycle parked in a tropical setting.
Two dogs playing in the grass with a frisbee.
A group of people standing around a table full of food.
A person holding a piece of broccoli with an insect on it.
A man in a gray suit and a red tie.
Someone with skis on his back walking up a snow covered mountain.
a woman in a blue shirt holding a pair of large scissors
A fluffy white cat has a frowning look on it's face.
A bus that is sitting on the street.
Sinks in the washroom that is public and white.
this lady is walking along the shore on a beach
A woman helping a man to do his tie.
A tray with coffee and a pastry on it.
A man with a bald head and a bear wearing a bow tie.
A cat eating a birthday cake on top of the table.
A man at a party talking on a cell phone.
Three commuter buses sitting outside of a building.
Food truck with customers ordering them with friends.
A couple of bikes in front of a small stone wall.
Someone is enjoying a small slice of pie.
there is a small tv and coffee table in the living room
A plate with two sandwiches, cup and knife on the table
A group of people venturing out on a horseback ride.
A horse walking through a grassy field while two cows eat hay.
A bathroom scene with focus on the toilet.
A man on a field swinging a baseball bat.
Some people are standing on a crowd crowded sidewalk
A pregnant women taking a picture of herself in the mirror.
A line of police offices riding horses down a street.
A buffet styled restaurant without self service but a server.
A tview of a living room with fold out bed.
A surfboard advertising offerings as people check them out.
a bath room with a sink a mirror and towel racks
A desk with a computer on in and a key board
A large clock mounted on the wall of a stone building
Three men sitting around a table with wine on it.
Man with glasses and a mustache standing in front of a door.
A woman in a black dress holding a racquet.
This hotel room has a king size bed.
A severely injured man hooked up to machines in the hospital
A black and white picture of an old store.
Fingers keep a meatball sub from falling apart.
A plate of fish covered in marinara, cheese, carrots, a fork, next to bread.
A anal filled with boats and the street above it filled with people under umbrellas.
A woman standing on a tennis court holding a racquet.
An elephant is standing next to a tree and a fence.
People in a street with birds all over.
THERE ARE PEOPLE THAT ARE STANDING IN THE GRASS
a man standing by a fence while throwing a frisbee
A pot that is on the stove with some food in it.
A stuffed animal is inside of a microwave.
Group of parents watching small children on a baseball field.
The kite is flying high in the air
This cat is playing on a fuzzy white blanket.
A lone giraffe at a zoo with trees behind it.
A train engine carrying carts down a track.
a young man brushes his teeth in the bathroom
Two children playing baseball in red uniforms and hats.
Men lined up an a runway in a desert greet an arriving jet plane.
The American flag flies next to the clock tower on a snowy day.
A man takes a selfie of himself in the mirror.
A baseball game where a player is running to 3rd base.
A man and a woman standing in front of a bus.
A plate of dessert sitting beside a drink in a cafe.
A man on the couch is petting the dog
A cat is sleeping with a remote control on a couch.
A European fighter jet flying above the tree tops.
Snowman's head has a carrot for a nose and lemon slices for eyes.
A man standing on a tennis court holding a tennis racquet.
Two large green and white jumbo jet planes on the tarmac.
The bedroom is is decorated in various zebra prints.
Dog displaying skills near disc in open grassy area.
A toilet facility in a stone cell on a plank floor.
A picture done by Independent Expression Photography of a girl posing in an empty road sitting on her suit cases.
Pair of colorful stuffed bears hanging on line in backyard.
a large train is on the track going by the ocean
There is a pizza with olives, peppers, meat, and cheese on the table.
A man standing in front of microphones.
A woman that is holding a book sitting on a bed.
A man showing a ring at a formal event.
A concrete building with towers, a steep in the middle and a clock underneath.
A white dog standing on top of a wooden bench.
A woman holds a plate with rainbow cake.
A women who is taking a picture of her food.
A man with long hair and in a towel holding a toothbrush.
a woman holding a tennis racket in the air
5 changes: 5 additions & 0 deletions examples/stable-diffusion/prompts_5.txt
@@ -0,0 +1,5 @@
A women playing tennis on a blue tennis court.
Two surfers in wetsuits carrying surfboards along the beach.
People are flying their kites in a large field.
A statue that is in front of a building.
A man attempting to do a skateboard trick on an outdoor halfpipe.
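
Both prompt files hold one caption per line. How the example scripts consume them is not shown in this diff, so the following is only a sketch of one way to load such a file into a list of prompts. `load_prompts` is a hypothetical helper, not a function from the repository.

```python
from pathlib import Path
from typing import List


def load_prompts(path: str) -> List[str]:
    """Read one prompt per line, skipping blank lines and trimming whitespace."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]
```

Called on `prompts_5.txt`, this would return the five captions above as a list of strings.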
5 changes: 5 additions & 0 deletions examples/stable-diffusion/quantize/measure_config.json
@@ -0,0 +1,5 @@
{
"method": "HOOKS",
"mode": "MEASURE",
"dump_stats_path": "quantize/measure_all/fp8"
}
6 changes: 6 additions & 0 deletions examples/stable-diffusion/quantize/quant_config.json
@@ -0,0 +1,6 @@
{
"method": "HOOKS",
"mode": "QUANTIZE",
"scale_method": "maxabs_hw_opt_weight",
"dump_stats_path": "quantize/measure_all/fp8"
}
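
The measurement and quantization configs point at the same `dump_stats_path`, which suggests a two-pass flow: a `MEASURE` run writes calibration statistics and a later `QUANTIZE` run reads them. Assuming that reading is correct, a small consistency check like the one below could catch a mismatched path before a long run. `check_config_pair` is a hypothetical helper, and the two dictionaries simply mirror `measure_config.json` and `quant_config.json`.

```python
import json
from typing import List

MEASURE_CONFIG = json.loads("""
{"method": "HOOKS", "mode": "MEASURE", "dump_stats_path": "quantize/measure_all/fp8"}
""")
QUANT_CONFIG = json.loads("""
{"method": "HOOKS", "mode": "QUANTIZE", "scale_method": "maxabs_hw_opt_weight",
 "dump_stats_path": "quantize/measure_all/fp8"}
""")


def check_config_pair(measure: dict, quantize: dict) -> List[str]:
    """Return a list of problems; an empty list means the pair looks consistent."""
    problems = []
    if measure.get("mode") != "MEASURE":
        problems.append("first config is not in MEASURE mode")
    if quantize.get("mode") != "QUANTIZE":
        problems.append("second config is not in QUANTIZE mode")
    # Assumption: the QUANTIZE pass reads what the MEASURE pass dumped.
    if measure.get("dump_stats_path") != quantize.get("dump_stats_path"):
        problems.append("dump_stats_path differs between the two configs")
    return problems


print(check_config_pair(MEASURE_CONFIG, QUANT_CONFIG))  # []
```

By the same check, a quantization config whose `dump_stats_path` is `quantize/measure_all_500/fp8` would be flagged when paired with `measure_config.json`, since that measurement config dumps to a different directory.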
6 changes: 6 additions & 0 deletions examples/stable-diffusion/quantize/quant_config_500.json
@@ -0,0 +1,6 @@
{
"method": "HOOKS",
"mode": "QUANTIZE",
"scale_method": "maxabs_hw_opt_weight",
"dump_stats_path": "quantize/measure_all_500/fp8"
}
7 changes: 7 additions & 0 deletions examples/stable-diffusion/quantize/quant_config_bmm.json
@@ -0,0 +1,7 @@
{
"method": "HOOKS",
"mode": "QUANTIZE",
"scale_method": "maxabs_hw_opt_weight",
"dump_stats_path": "quantize/measure_all/fp8",
"blocklist": {"types": ["Linear", "Conv2d", "LoRACompatibleLinear", "LoRACompatibleConv"]}
}
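
`quant_config_bmm.json` differs from `quant_config.json` only by its `blocklist`. Assuming the blocklist excludes the listed module types from quantization (which the `bmm` file name hints at), the effect can be sketched as a simple filter. The module type names in `example_types` are illustrative, and `types_not_blocked` is a hypothetical helper rather than the quantization toolkit's API.

```python
from typing import List

BLOCKLIST = {"types": ["Linear", "Conv2d", "LoRACompatibleLinear", "LoRACompatibleConv"]}


def types_not_blocked(module_types: List[str], blocklist: dict) -> List[str]:
    """Return the module types that the blocklist does not exclude, in input order."""
    blocked = set(blocklist.get("types", []))
    return [t for t in module_types if t not in blocked]


# Illustrative type names only; the real set depends on the model.
example_types = ["Linear", "Conv2d", "Matmul"]
print(types_not_blocked(example_types, BLOCKLIST))  # ['Matmul']
```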