
Custom Diffusion: RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::Half != float #6879

@rezkanas


Describe the bug

When running the Custom Diffusion training script on my repository of 20 photos, I run into the error below, which is related to a data-type (dtype) mismatch.
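For context, the failure mode can be reproduced in isolation: the traceback below reports the activations as c10::Half and the weight as float, i.e. the fp16 UNet hidden states hit a custom-diffusion projection layer that is still in float32. A minimal sketch in plain PyTorch (stand-in names, not the training script itself):

```python
import torch
import torch.nn as nn

# Stand-in for a freshly added custom-diffusion projection (e.g. to_q_custom_diffusion),
# created in the default float32 dtype.
proj = nn.Linear(1024, 320)

# Stand-in for the UNet hidden states when training with mixed precision fp16.
hidden_states = torch.randn(2, 64, 1024, dtype=torch.float16)

# Raises a dtype-mismatch RuntimeError (Half vs. float), analogous to the traceback below.
proj(hidden_states)
```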

Reproduction

!accelerate launch train_custom_diffusion.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --class_data_dir=$class_data_dir \
  --with_prior_preservation \
  --prior_loss_weight=1.0 \
  --class_prompt="person" \
  --num_class_images=200 \
  --instance_prompt="photo of a <new1> person" \
  --resolution=512 \
  --train_batch_size=2 \
  --learning_rate=5e-6 \
  --lr_warmup_steps=0 \
  --max_train_steps=1200 \
  --freeze_model=crossattn \
  --scale_lr \
  --hflip \
  --use_8bit_adam \
  --gradient_checkpointing \
  --enable_xformers_memory_efficient_attention \
  --modifier_token "<new1>" \
  --validation_prompt="<new1> person sitting in a bucket"

Logs

/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
/home/anasrezklinux/.local/lib/python3.10/site-packages/diffusers/utils/outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
/home/anasrezklinux/.local/lib/python3.10/site-packages/diffusers/utils/outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
02/06/2024 23:50:18 - INFO - __main__ - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda

Mixed precision type: fp16

You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
{'thresholding', 'variance_type', 'dynamic_thresholding_ratio', 'clip_sample_range', 'sample_max_value', 'timestep_spacing', 'rescale_betas_zero_snr'} was not found in config. Values will be initialized to default values.
{'scaling_factor', 'force_upcast'} was not found in config. Values will be initialized to default values.
{'mid_block_only_cross_attention', 'cross_attention_norm', 'encoder_hid_dim', 'encoder_hid_dim_type', 'reverse_transformer_layers_per_block', 'attention_type', 'time_embedding_act_fn', 'projection_class_embeddings_input_dim', 'time_embedding_dim', 'mid_block_type', 'transformer_layers_per_block', 'class_embed_type', 'conv_out_kernel', 'class_embeddings_concat', 'addition_time_embed_dim', 'addition_embed_type', 'addition_embed_type_num_heads', 'conv_in_kernel', 'time_embedding_type', 'resnet_skip_time_act', 'num_attention_heads', 'resnet_out_scale_factor', 'resnet_time_scale_shift', 'time_cond_proj_dim', 'timestep_post_act', 'dropout'} was not found in config. Values will be initialized to default values.
[42170]
02/06/2024 23:52:45 - INFO - __main__ - ***** Running training *****
02/06/2024 23:52:45 - INFO - __main__ -   Num examples = 200
02/06/2024 23:52:45 - INFO - __main__ -   Num batches each epoch = 100
02/06/2024 23:52:45 - INFO - __main__ -   Num Epochs = 12
02/06/2024 23:52:45 - INFO - __main__ -   Instantaneous batch size per device = 2
02/06/2024 23:52:45 - INFO - __main__ -   Total train batch size (w. parallel, distributed & accumulation) = 2
02/06/2024 23:52:45 - INFO - __main__ -   Gradient Accumulation steps = 1
02/06/2024 23:52:45 - INFO - __main__ -   Total optimization steps = 1200
Steps:   0%|                                           | 0/1200 [00:00<?, ?it/s]/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/utils/checkpoint.py:460: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
  warnings.warn(
Traceback (most recent call last):
  File "/home/anasrezklinux/test_pycharm_link/diffusers/examples/custom_diffusion/train_custom_diffusion.py", line 1350, in <module>
    main(args)
  File "/home/anasrezklinux/test_pycharm_link/diffusers/examples/custom_diffusion/train_custom_diffusion.py", line 1131, in main
    model_pred = unet(noisy_latents, timesteps, encoder_hidden_states).sample
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/diffusers/models/unets/unet_2d_condition.py", line 1121, in forward
    sample, res_samples = downsample_block(
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/diffusers/models/unets/unet_2d_blocks.py", line 1189, in forward
    hidden_states = attn(
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/diffusers/models/transformers/transformer_2d.py", line 379, in forward
    hidden_states = torch.utils.checkpoint.checkpoint(
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/_compile.py", line 24, in inner
    return torch._dynamo.disable(fn, recursive)(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 489, in _fn
    return fn(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/_dynamo/external_utils.py", line 17, in inner
    return fn(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 489, in checkpoint
    ret = function(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/diffusers/models/transformers/transformer_2d.py", line 374, in custom_forward
    return module(*inputs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/diffusers/models/attention.py", line 366, in forward
    attn_output = self.attn2(
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 512, in forward
    return self.processor(
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 1429, in __call__
    query = self.to_q_custom_diffusion(hidden_states).to(attn.to_q.weight.dtype)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 116, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::Half != float
Steps:   0%|                                           | 0/1200 [00:22<?, ?it/s]
Traceback (most recent call last):
  File "/home/anasrezklinux/.local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
    args.func(args)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1017, in launch_command
    simple_launcher(args)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 637, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_custom_diffusion.py', '--pretrained_model_name_or_path=stabilityai/stable-diffusion-2-1', '--instance_data_dir=/mnt/c/Users/noobw/PycharmProjects/pythonProject/Anas', '--output_dir=/mnt/c/Users/noobw/PycharmProjects/pythonProject/custom_diffusion_anas', '--class_data_dir=/mnt/c/Users/noobw/PycharmProjects/pythonProject/custom_diffusion_anas/class_prior', '--with_prior_preservation', '--prior_loss_weight=1.0', '--class_prompt=person', '--num_class_images=200', '--instance_prompt=photo of a <new1> person', '--resolution=512', '--train_batch_size=2', '--learning_rate=5e-6', '--lr_warmup_steps=0', '--max_train_steps=1200', '--freeze_model=crossattn', '--scale_lr', '--hflip', '--use_8bit_adam', '--gradient_checkpointing', '--enable_xformers_memory_efficient_attention', '--modifier_token', '<new1>', '--validation_prompt=<new1> person sitting in a bucket']' returned non-zero exit status 1.
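Note the frame at diffusers/models/attention_processor.py line 1429 in the traceback: the output of to_q_custom_diffusion(hidden_states) is cast to attn.to_q.weight.dtype only after the projection, so the matmul itself still receives half-precision activations against a float32 weight. A hypothetical workaround sketch (stand-in names, not a verified diffusers patch) would be to cast the activations to the projection's dtype before the call and cast the result back afterwards:

```python
import torch
import torch.nn as nn

# Same stand-ins as the sketch above: a float32 custom-diffusion projection and fp16
# activations; to_q_dtype stands in for attn.to_q.weight.dtype under fp16 mixed precision.
proj = nn.Linear(1024, 320)
to_q_dtype = torch.float16
hidden_states = torch.randn(2, 64, 1024, dtype=torch.float16)

# Failing pattern from the traceback (project first, cast afterwards):
#   query = proj(hidden_states).to(to_q_dtype)   # -> dtype-mismatch RuntimeError
# Hypothetical workaround: cast the activations to the projection's dtype first.
query = proj(hidden_states.to(proj.weight.dtype)).to(to_q_dtype)
print(query.dtype)  # torch.float16
```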

System Info

  • diffusers version: 0.26.1
  • Platform: Linux-5.15.133.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • PyTorch version (GPU?): 2.2.0+cu121 (True)
  • Huggingface_hub version: 0.20.3
  • Transformers version: 4.37.0
  • Accelerate version: 0.25.0
  • xFormers version: 0.0.24
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

@sayakpaul @patrickvonplaten

Metadata

    Labels

    bug (Something isn't working), training
