Bug in Custom Diffusion training with concept list #6709

@AIshutin

Description

Describe the bug

Context: the train_custom_diffusion.py script in the Custom Diffusion example, used with a concepts list
Description: The script fails when a concepts list is used and class images need to be synthesized: it tries to generate the synthetic images with a wrong or missing prompt. Note that the primary use of a concepts list is multi-concept training; this example uses a single concept for simplicity.

Bug: https://github.com/huggingface/diffusers/blob/main/examples/custom_diffusion/train_custom_diffusion.py#L756
Fix: 1b8972d
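
The root cause, as far as I can tell (paraphrased from the linked line; see 1b8972d for the actual change): when --concepts_list is used, args.class_prompt stays None, but the class-image sampling loop still builds its prompt dataset from it.

# buggy: reads the top-level CLI argument, which is None when --concepts_list is used
sample_dataset = PromptDataset(args.class_prompt, num_new_images)

# fixed: take the prompt from the concept currently being processed
sample_dataset = PromptDataset(concept["class_prompt"], num_new_images)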

Reproduction

# move to the example directory
cd examples/custom_diffusion/
# download the example data referenced in the README
wget https://www.cs.cmu.edu/~custom-diffusion/assets/data.zip
unzip data.zip

Create a concept_list.json file:

[
    {
        "instance_prompt": "photo of a <new1> cat",
        "class_prompt": "cat",
        "instance_data_dir": "data/cat",
        "class_data_dir": "synth-dataset/cat"
    }
]

Run the script:

export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export OUTPUT_DIR="path-to-save-model"

accelerate launch train_custom_diffusion.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --output_dir=$OUTPUT_DIR \
  --concepts_list=./concept_list.json \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --resolution=512  \
  --train_batch_size=2  \
  --learning_rate=1e-5  \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --num_class_images=200 \
  --scale_lr --hflip  \
  --modifier_token "<new1>" 

Logs

Num processes: 1
Process index: 0
Local process index: 0
Device: cuda

Mixed precision type: bf16

{'image_encoder', 'requires_safety_checker'} was not found in config. Values will be initialized to default values.
Loading pipeline components...:   0%|                                                                                                                                                                                                                        | 0/6 [00:00<?, ?it/s]{'resnet_out_scale_factor', 'encoder_hid_dim_type', 'class_embeddings_concat', 'addition_embed_type', 'mid_block_only_cross_attention', 'projection_class_embeddings_input_dim', 'time_embedding_type', 'addition_time_embed_dim', 'resnet_time_scale_shift', 'transformer_layers_per_block', 'time_embedding_dim', 'dual_cross_attention', 'timestep_post_act', 'resnet_skip_time_act', 'reverse_transformer_layers_per_block', 'time_cond_proj_dim', 'class_embed_type', 'upcast_attention', 'only_cross_attention', 'encoder_hid_dim', 'attention_type', 'conv_in_kernel', 'cross_attention_norm', 'num_attention_heads', 'use_linear_projection', 'mid_block_type', 'time_embedding_act_fn', 'num_class_embeds', 'conv_out_kernel', 'addition_embed_type_num_heads', 'dropout'} was not found in config. Values will be initialized to default values.
Loaded unet as UNet2DConditionModel from `unet` subfolder of CompVis/stable-diffusion-v1-4.
Loading pipeline components...:  17%|██████████████████████████████████▋                                                                                                                                                                             | 1/6 [00:01<00:07,  1.54s/it]{'prediction_type', 'timestep_spacing'} was not found in config. Values will be initialized to default values.
Loaded scheduler as PNDMScheduler from `scheduler` subfolder of CompVis/stable-diffusion-v1-4.
Loaded feature_extractor as CLIPImageProcessor from `feature_extractor` subfolder of CompVis/stable-diffusion-v1-4.
Loaded text_encoder as CLIPTextModel from `text_encoder` subfolder of CompVis/stable-diffusion-v1-4.
Loading pipeline components...:  67%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                                                     | 4/6 [00:01<00:00,  2.53it/s]{'force_upcast', 'norm_num_groups'} was not found in config. Values will be initialized to default values.
Loaded vae as AutoencoderKL from `vae` subfolder of CompVis/stable-diffusion-v1-4.
Loading pipeline components...:  83%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                  | 5/6 [00:02<00:00,  2.95it/s]Loaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of CompVis/stable-diffusion-v1-4.
Loading pipeline components...: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:02<00:00,  2.71it/s]
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

 and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
bin /usr/local/lib/python3.10/dist-packages/bitsandbytes-0.38.1-py3.10.egg/bitsandbytes/libbitsandbytes_cuda117.so
CUDA SETUP: CUDA runtime path found: /usr/local/cuda-11.7/lib64/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary /usr/local/lib/python3.10/dist-packages/bitsandbytes-0.38.1-py3.10.egg/bitsandbytes/libbitsandbytes_cuda117.so...
01/25/2024 15:31:14 - INFO - __main__ - Number of class images to sample: 200.
Generating class images:   0%|                                                                                                                                                                                                                              | 0/50 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/aishutin/.local/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 127, in collate
    return elem_type({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
  File "/home/aishutin/.local/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 127, in <dictcomp>
    return elem_type({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
  File "/home/aishutin/.local/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 150, in collate
    raise TypeError(default_collate_err_msg_format.format(elem_type))
TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'NoneType'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/aishutin/contribute/diffusers/examples/custom_diffusion/train_custom_diffusion.py", line 1350, in <module>
    main(args)
  File "/home/aishutin/contribute/diffusers/examples/custom_diffusion/train_custom_diffusion.py", line 762, in main
    for example in tqdm(
  File "/home/aishutin/.local/lib/python3.10/site-packages/tqdm/std.py", line 1182, in __iter__
    for obj in iterable:
  File "/home/aishutin/.local/lib/python3.10/site-packages/accelerate/data_loader.py", line 451, in __iter__
    current_batch = next(dataloader_iter)
  File "/home/aishutin/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 634, in __next__
    data = self._next_data()
  File "/home/aishutin/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 678, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/aishutin/.local/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 54, in fetch
    return self.collate_fn(data)
  File "/home/aishutin/.local/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 264, in default_collate
    return collate(batch, collate_fn_map=default_collate_fn_map)
  File "/home/aishutin/.local/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 130, in collate
    return {key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem}
  File "/home/aishutin/.local/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 130, in <dictcomp>
    return {key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem}
  File "/home/aishutin/.local/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py", line 150, in collate
    raise TypeError(default_collate_err_msg_format.format(elem_type))
TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'NoneType'>
Traceback (most recent call last):
  File "/home/aishutin/.local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/aishutin/.local/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
    args.func(args)
  File "/home/aishutin/.local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1023, in launch_command
    simple_launcher(args)
  File "/home/aishutin/.local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 643, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_custom_diffusion.py', '--pretrained_model_name_or_path=CompVis/stable-diffusion-v1-4', '--output_dir=path-to-save-model', '--concepts_list=./concept_list.json', '--with_prior_preservation', '--prior_loss_weight=1.0', '--resolution=512', '--train_batch_size=2', '--learning_rate=1e-5', '--lr_warmup_steps=0', '--max_train_steps=500', '--num_class_images=200', '--scale_lr', '--hflip', '--modifier_token', '<new1>']' returned non-zero exit status 1.
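
The TypeError itself comes from torch's default_collate, which cannot batch None values. A minimal sketch that reproduces it (this is a simplified stand-in for the PromptDataset in train_custom_diffusion.py, not the actual script):

from torch.utils.data import DataLoader, Dataset

class PromptDataset(Dataset):
    # Simplified stand-in: each example is a dict carrying the shared
    # prompt and its index, mirroring the shape used by the script.
    def __init__(self, prompt, num_samples):
        self.prompt = prompt
        self.num_samples = num_samples

    def __len__(self):
        return self.num_samples

    def __getitem__(self, index):
        return {"prompt": self.prompt, "index": index}

# With --concepts_list, args.class_prompt is None, so every example carries
# a None prompt and default_collate raises exactly the error above:
loader = DataLoader(PromptDataset(None, 4), batch_size=2)
next(iter(loader))  # TypeError: default_collate: ... found <class 'NoneType'>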

System Info

  • diffusers version: 0.26.0.dev0
  • Platform: Linux-6.5.0-14-generic-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • PyTorch version (GPU?): 2.0.0+cu117 (True)
  • Huggingface_hub version: 0.20.3
  • Transformers version: 4.37.0
  • Accelerate version: 0.26.1
  • xFormers version: not installed
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: no

Who can help?

@sayakpaul
@nupurkmr9
