huggingface · KimYannn · Sep 14, 2024 · Sep 18, 2024 · Sep 18, 2024 · Sep 18, 2024
@@ -28,12 +28,12 @@ First, you should install the requirements:
 pip install -r requirements.txt
 ```
 
-
 ## Text-to-image Generation
 
 ### Single Prompt
 
 Here is how to generate images with one prompt:
+
 ```bash
 python text_to_image_generation.py \
     --model_name_or_path CompVis/stable-diffusion-v1-4 \
@@ -51,10 +51,10 @@ python text_to_image_generation.py \
 > The first batch of images entails a performance penalty. All subsequent batches will be generated much faster.
 > You can enable this mode with `--use_hpu_graphs`.
 
-
 ### Multiple Prompts
 
 Here is how to generate images with several prompts:
+
 ```bash
 python text_to_image_generation.py \
     --model_name_or_path CompVis/stable-diffusion-v1-4 \
@@ -69,7 +69,9 @@ python text_to_image_generation.py \
 ```
 
 ### Distributed inference with multiple HPUs
+
 Here is how to generate images with two prompts on two HPUs:
+
 ```bash
 python ../gaudi_spawn.py \
     --world_size 2 text_to_image_generation.py \
@@ -109,10 +111,10 @@ python text_to_image_generation.py \
 ```
 
 > There are two different checkpoints for Stable Diffusion 2:
+>
 > - use [stabilityai/stable-diffusion-2-1](https://huggingface.co/stabilityai/stable-diffusion-2-1) for generating 768x768 images
 > - use [stabilityai/stable-diffusion-2-1-base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base) for generating 512x512 images
 
-
 ### Latent Diffusion Model for 3D (LDM3D)
 
 [LDM3D](https://arxiv.org/abs/2305.10853) generates both image and depth map data from a given text prompt, allowing users to generate RGBD images from text prompts.
@@ -135,7 +137,9 @@ python text_to_image_generation.py \
     --ldm3d \
     --bf16
 ```
+
 Here is how to generate images and depth maps with two prompts on two HPUs:
+
 ```bash
 python ../gaudi_spawn.py \
     --world_size 2 text_to_image_generation.py \
@@ -154,6 +158,7 @@ python ../gaudi_spawn.py \
 ```
 
 > There are three different checkpoints for LDM3D:
+>
 > - use [original checkpoint](https://huggingface.co/Intel/ldm3d) to generate outputs from the paper
 > - use [the latest checkpoint](https://huggingface.co/Intel/ldm3d-4c) for generating improved results
 > - use [the pano checkpoint](https://huggingface.co/Intel/ldm3d-pano) to generate panoramic view
@@ -163,6 +168,7 @@ python ../gaudi_spawn.py \
 Stable Diffusion XL was proposed in [SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis](https://arxiv.org/pdf/2307.01952.pdf) by the Stability AI team.
 
 Here is how to generate SDXL images with a single prompt:
+
 ```bash
 python text_to_image_generation.py \
     --model_name_or_path stabilityai/stable-diffusion-xl-base-1.0 \
@@ -182,6 +188,7 @@ python text_to_image_generation.py \
 > You can enable this mode with `--use_hpu_graphs`.
 
 Here is how to generate SDXL images with several prompts:
+
 ```bash
 python text_to_image_generation.py \
     --model_name_or_path stabilityai/stable-diffusion-xl-base-1.0 \
@@ -199,6 +206,7 @@ python text_to_image_generation.py \
 SDXL combines a second text encoder (OpenCLIP ViT-bigG/14) with the original text encoder to significantly
 increase the number of parameters. Here is how to generate images with several prompts for both `prompt`
 and `prompt_2` (2nd text encoder), as well as their negative prompts:
+
 ```bash
 python text_to_image_generation.py \
     --model_name_or_path stabilityai/stable-diffusion-xl-base-1.0 \
@@ -217,6 +225,7 @@ python text_to_image_generation.py \
 ```
 
 Here is how to generate SDXL images with two prompts on two HPUs:
+
 ```bash
 python ../gaudi_spawn.py \
     --world_size 2 text_to_image_generation.py \
@@ -235,14 +244,17 @@ python ../gaudi_spawn.py \
     --bf16 \
     --distributed
 ```
+
 > HPU graphs are recommended when generating images by batches to get the fastest possible generations.
 > The first batch of images entails a performance penalty. All subsequent batches will be generated much faster.
 > You can enable this mode with `--use_hpu_graphs`.
 
 ### SDXL-Turbo
+
 SDXL-Turbo is a distilled version of SDXL 1.0, trained for real-time synthesis.
 
 Here is how to generate images with multiple prompts:
+
 ```bash
 python text_to_image_generation.py \
     --model_name_or_path stabilityai/sdxl-turbo \
@@ -275,11 +287,13 @@ Before running SD3 pipeline, you need to:
 
 1. Agree to the Terms and Conditions for using SD3 model at [HuggingFace model page](https://huggingface.co/stabilityai/stable-diffusion-3-medium)
 2. Authenticate with HuggingFace using your HF Token. For authentication, run:
+
 ```bash
 huggingface-cli login
 ```
 
 Here is how to generate SD3 images with a single prompt:
+
 ```bash
 PT_HPU_MAX_COMPOUND_OP_SIZE=1 \
 python text_to_image_generation.py \
@@ -299,12 +313,32 @@ python text_to_image_generation.py \
 > For improved performance of the SD3 pipeline on Gaudi, it is recommended to configure the environment
 > by setting PT_HPU_MAX_COMPOUND_OP_SIZE to 1.
 
+### FLUX.1
+
+FLUX.1 was introduced by Black Forest Labs [here](https://blackforestlabs.ai/announcing-black-forest-labs/).
+
+Here is how to generate FLUX.1-schnell images with a single prompt:
+
+```bash
+python text_to_image_generation.py \
+    --model_name_or_path black-forest-labs/FLUX.1-schnell \
+    --prompts "A cat holding a sign that says hello world" \
+    --num_images_per_prompt 10 \
+    --batch_size 1 \
+    --num_inference_steps 28 \
+    --image_save_dir /tmp/flux_1_images \
+    --scheduler flow_match_euler_discrete \
+    --use_habana \
+    --use_hpu_graphs \
+    --gaudi_config Habana/stable-diffusion \
+    --bf16
+```
+
 ## ControlNet
 
-ControlNet was introduced in [Adding Conditional Control to Text-to-Image Diffusion Models ](https://huggingface.co/papers/2302.05543) by Lvmin Zhang and Maneesh Agrawala.
+ControlNet was introduced in [Adding Conditional Control to Text-to-Image Diffusion Models](https://huggingface.co/papers/2302.05543) by Lvmin Zhang and Maneesh Agrawala.
 It is a type of model for controlling StableDiffusion by conditioning the model with an additional input image.
 
 Here is how to generate images conditioned by canny edge model:
+
 ```bash
 python text_to_image_generation.py \
     --model_name_or_path CompVis/stable-diffusion-v1-4 \
@@ -321,6 +355,7 @@ python text_to_image_generation.py \
 ```
 
 Here is how to generate images conditioned by canny edge model and with multiple prompts:
+
 ```bash
 python text_to_image_generation.py \
     --model_name_or_path CompVis/stable-diffusion-v1-4 \
@@ -337,6 +372,7 @@ python text_to_image_generation.py \
 ```
 
 Here is how to generate images conditioned by canny edge model and with two prompts on two HPUs:
+
 ```bash
 python ../gaudi_spawn.py \
     --world_size 2 text_to_image_generation.py \
@@ -355,6 +391,7 @@ python ../gaudi_spawn.py \
 ```
 
 Here is how to generate images conditioned by open pose model:
+
 ```bash
 python text_to_image_generation.py \
     --model_name_or_path CompVis/stable-diffusion-v1-4 \
@@ -372,6 +409,7 @@ python text_to_image_generation.py \
 ```
 
 Here is how to generate images conditioned by a canny edge model using Stable Diffusion 2:
+
 ```bash
 python text_to_image_generation.py \
     --model_name_or_path stabilityai/stable-diffusion-2-1 \
@@ -395,6 +433,7 @@ Inpainting replaces or edits specific areas of an image. For more details,
 please refer to [Hugging Face Diffusers doc](https://huggingface.co/docs/diffusers/en/using-diffusers/inpaint).
 
 ### Stable Diffusion Inpainting
+
 ```bash
 python text_to_image_generation.py \
     --model_name_or_path  stabilityai/stable-diffusion-2-inpainting \
@@ -412,6 +451,7 @@ python text_to_image_generation.py \
 ```
 
 ### Stable Diffusion XL Inpainting
+
 ```bash
 python text_to_image_generation.py \
     --model_name_or_path diffusers/stable-diffusion-xl-1.0-inpainting-0.1 \
@@ -457,10 +497,10 @@ python image_to_image_generation.py \
 > The first batch of images entails a performance penalty. All subsequent batches will be generated much faster.
 > You can enable this mode with `--use_hpu_graphs`.
 
-
 ### Multiple Prompts
 
 Here is how to generate images with several prompts and one image.
+
 ```bash
 python image_to_image_generation.py \
     --model_name_or_path "timbrooks/instruct-pix2pix" \
@@ -482,10 +522,10 @@ python image_to_image_generation.py \
 > The first batch of images entails a performance penalty. All subsequent batches will be generated much faster.
 > You can enable this mode with `--use_hpu_graphs`.
 
-
 ### Stable Diffusion XL Refiner
 
 Here is how to generate SDXL images with a single prompt and one image:
+
 ```bash
 python image_to_image_generation.py \
     --model_name_or_path "stabilityai/stable-diffusion-xl-refiner-1.0" \
@@ -505,6 +545,7 @@ python image_to_image_generation.py \
 ### Stable Diffusion Image Variations
 
 Here is how to generate images from one input image (this pipeline does not accept a prompt):
+
 ```bash
 python image_to_image_generation.py \
     --model_name_or_path "lambdalabs/sd-image-variations-diffusers" \
@@ -625,6 +666,7 @@ Script `image_to_video_generation.py` showcases how to perform image-to-video ge
 ### Single Image Prompt
 
 Here is how to generate video with one image prompt:
+
 ```bash
 PT_HPU_MAX_COMPOUND_OP_SIZE=1 \
 python image_to_video_generation.py \
@@ -645,6 +687,7 @@ python image_to_video_generation.py \
 ### Multiple Image Prompts
 
 Here is how to generate videos with several image prompts:
+
 ```bash
 PT_HPU_MAX_COMPOUND_OP_SIZE=1 \
 python image_to_video_generation.py \

@@ -0,0 +1,100 @@
+A women playing tennis on a blue tennis court.
+Two surfers in wetsuits carrying surfboards along the beach.
+People are flying their kites in a large field.
+A statue that is in front of a building.
+A man attempting to do a skateboard trick on an outdoor halfpipe.
+A cream bathroom with red accents and open window.
+A woman in a white dress and a man in gray stand near a cake on a white table under a white canopy.
+A tour bus downtown with yoga ads all over it.
+Three people riding horses on a beach next to the ocean..
+A bear lying in its den on a pile of wood.
+A herd of cattle is feeding at the river's edge.
+there is a male snowboarder that is in the air
+A statue of an elephant with tattoos and a target drawing
+a woman is standing over a white cake
+A grey motorcycle parked in a tropical setting.
+Two dogs playing in the grass with a frisbee.
+A group of people standing around a table full of food.
+A person holding a piece of broccoli with an insect on it. 
+A man in a gray suit and a red tie.
+Someone with skis on his back walking up a snow covered mountain.
+a woman in a blue shirt holding a pair of large scissors
+A fluffy white cat has a frowning look on it's face.
+A bus that is sitting on the street.
+Sinks in the washroom that is public and white.
+this lady is walking along the shore on a beach
+A woman helping a man to do his tie.
+A tray with coffee and a pastry on it.
+A man with a bald head and a bear wearing a bow tie.
+A cat eating a birthday cake on top of the table. 
+A man at a party talking on a cell phone.
+Three commuter buses sitting outside of a building.
+Food truck with customers ordering them with friends.
+A couple of bikes in front of a small stone wall.
+Someone is enjoying a small slice of pie. 
+there is a small tv and coffee table in the living room
+A plate with two sandwiches, cup and knife on the table
+A group of people venturing out on a horseback ride.
+A horse walking through a grassy field while two cows eat hay. 
+A bathroom scene with focus on the toilet.
+A man on a field swinging a baseball bat.
+Some people are standing on a crowd crowded sidewalk
+A pregnant women taking a picture of herself in the mirror.
+A line of police offices riding horses down a street.
+A buffet styled restaurant without self service but a server.
+A tview of a living room with fold out bed.
+A surfboard advertising offerings as people check them out.
+a bath room with a sink a mirror and towel racks
+A desk with a computer on in and a key board
+A large clock mounted on the wall of a stone building
+Three men sitting around a table with wine on it. 
+Man with glasses and a mustache standing in front of a door.
+A woman in a black dress holding a racquet.
+This hotel room has a king size bed. 
+A severely injured man hooked up to machines  in the hospital
+A black and white picture of an old store.
+Fingers keep a meatball sub from falling apart.
+A plate of fish covered in marinara, cheese, carrots, a fork, next to bread.
+A anal filled with boats and the street above it filled with people under umbrellas. 
+A woman standing on a tennis court holding a racquet.
+An elephant is standing next to a tree and a fence.
+People in a street with birds all over.
+THERE ARE PEOPLE THAT ARE STANDING IN THE GRASS
+a man standing by a fence while throwing a frisbee
+A pot that is on the stove with some food in it.
+A stuffed animal is inside of a microwave.
+Group of parents watching small children on a baseball field. 
+The kite is flying  high in the air 
+This cat is playing on a fuzzy white blanket.
+A lone giraffe at a zoo with trees behind it.
+A train engine carrying carts down a track.
+a young man brushes his teeth in the bathroom
+Two children playing baseball in red uniforms and hats.
+Men lined up an a runway in a desert greet an arriving jet plane.
+The American flag flies next to the clock tower on a snowy day. 
+A man takes a selfie of himself in the mirror. 
+A baseball game where a player is running to 3rd base.
+A man and a woman standing in front of a bus.
+A plate of dessert sitting beside a drink in a cafe.
+A man on the couch is petting the dog
+A cat is sleeping with a remote control on a couch.
+A European fighter jet flying above the tree tops.
+Snowman's head has a carrot for a nose and lemon slices for eyes.
+A man standing on a tennis court holding a tennis racquet.
+Two large green and white jumbo jet planes on the tarmac.
+The bedroom is is decorated in various zebra prints.
+Dog displaying skills near disc in open grassy area.
+A toilet facility in a stone cell on a plank floor.
+A picture done by Independent Expression Photography of a girl posing in an empty road sitting on her suit cases.
+Pair of colorful stuffed bears hanging on line in backyard.
+a large train is on the track going by the ocean
+There is a pizza with olives, peppers, meat, and cheese on the table.
+A man standing in front of microphones. 
+A woman that is holding a book sitting on a bed.
+A man showing a ring at a formal event.
+A concrete building with towers, a steep in the middle and a clock underneath.
+A white dog standing on top of a wooden bench.
+A woman holds a plate with rainbow cake.
+A women who is taking a picture of her food.
+A man with long hair and in a towel holding a toothbrush.
+a woman holding a tennis racket in the air 
@@ -0,0 +1,5 @@
+A women playing tennis on a blue tennis court.
+Two surfers in wetsuits carrying surfboards along the beach.
+People are flying their kites in a large field.
+A statue that is in front of a building.
+A man attempting to do a skateboard trick on an outdoor halfpipe.
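
The two prompt files above hold one caption per line, and several lines carry trailing spaces. As a minimal sketch, assuming a consumer reads them line by line (the diff does not show how the scripts load them), a loader could strip whitespace and skip blank lines. The function name `load_prompts` and the demo file name are hypothetical.

```python
import tempfile
from pathlib import Path


def load_prompts(path):
    """Return the non-empty lines of a prompt file, stripped of surrounding whitespace."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


# Demo with a throwaway file. The three prompts are copied from the list above;
# a trailing space and a blank line are added to show the clean-up.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "prompts_demo.txt"  # hypothetical file name
    demo_file.write_text(
        "A statue that is in front of a building.\n"
        "People are flying their kites in a large field. \n"
        "\n"
        "A man attempting to do a skateboard trick on an outdoor halfpipe.\n",
        encoding="utf-8",
    )
    prompts = load_prompts(demo_file)

print(len(prompts))
```

Stripping each line keeps stray whitespace out of the text passed on as a prompt, and skipping blank lines avoids submitting an empty prompt.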
@@ -0,0 +1,5 @@
+{
+    "method": "HOOKS",
+    "mode": "MEASURE",
+    "dump_stats_path": "quantize/measure_all/fp8"
+}
@@ -0,0 +1,6 @@
+{
+    "method": "HOOKS",
+    "mode": "QUANTIZE",
+    "scale_method": "maxabs_hw_opt_weight",
+    "dump_stats_path": "quantize/measure_all/fp8"
+}
@@ -0,0 +1,6 @@
+{
+    "method": "HOOKS",
+    "mode": "QUANTIZE",
+    "scale_method": "maxabs_hw_opt_weight",
+    "dump_stats_path": "quantize/measure_all_500/fp8"
+}
@@ -0,0 +1,7 @@
+{
+    "method": "HOOKS",
+    "mode": "QUANTIZE",
+    "scale_method": "maxabs_hw_opt_weight",
+    "dump_stats_path": "quantize/measure_all/fp8",
+    "blocklist": {"types": ["Linear", "Conv2d", "LoRACompatibleLinear", "LoRACompatibleConv"]}
+}
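
The configs above share a `dump_stats_path` field. As a rough sketch, assuming the `QUANTIZE` step reads the statistics that a `MEASURE` run wrote to that path (an assumption about the quantization toolkit, not something this diff states), a quick check can confirm that a measure/quantize pair points at the same location. The helper `stats_paths_match` is hypothetical, and the JSON bodies are copied from the configs above.

```python
import json

# Measurement config, copied from the first JSON file above.
MEASURE_CFG = json.loads("""
{
    "method": "HOOKS",
    "mode": "MEASURE",
    "dump_stats_path": "quantize/measure_all/fp8"
}
""")

# Quantization config that reuses the same statistics path.
QUANT_CFG = json.loads("""
{
    "method": "HOOKS",
    "mode": "QUANTIZE",
    "scale_method": "maxabs_hw_opt_weight",
    "dump_stats_path": "quantize/measure_all/fp8"
}
""")

# Quantization config that points at a different statistics path.
QUANT_500_CFG = json.loads("""
{
    "method": "HOOKS",
    "mode": "QUANTIZE",
    "scale_method": "maxabs_hw_opt_weight",
    "dump_stats_path": "quantize/measure_all_500/fp8"
}
""")


def stats_paths_match(measure_cfg, quant_cfg):
    """Return True if a MEASURE/QUANTIZE config pair uses the same dump_stats_path."""
    if measure_cfg.get("mode") != "MEASURE" or quant_cfg.get("mode") != "QUANTIZE":
        raise ValueError("expected a MEASURE config followed by a QUANTIZE config")
    return measure_cfg["dump_stats_path"] == quant_cfg["dump_stats_path"]


print(stats_paths_match(MEASURE_CFG, QUANT_CFG))      # True
print(stats_paths_match(MEASURE_CFG, QUANT_500_CFG))  # False
```

Under the same assumption, the `measure_all_500` config would need a separate measurement run that writes to its own path. No such measure config appears in this diff.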