I would like to train a model using my custom dataset. However, I noticed that the current training process only supports using image IDs. Is there a way to provide a custom prompt for each image instead of using just the image ID?
If this feature is not currently available, is there a plan to include it in any upcoming releases?
Thank you!
I suggest following Stable Diffusion's approach: use the output embeddings of the CLIP text encoder as the condition, and inject them into the model through cross-attention.
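A minimal sketch of that suggestion (assuming PyTorch and the Hugging Face `transformers` CLIP text encoder; the block structure and dimensions here are illustrative, not the actual MDT code): encode each prompt with a frozen CLIP text encoder, then let the image tokens attend to the resulting text-token embeddings via cross-attention, in place of a class/image-ID embedding.

```python
# Illustrative sketch only -- not taken from the MDT repo.
# Module names and dimensions are hypothetical.
import torch
import torch.nn as nn
from transformers import CLIPTextModel, CLIPTokenizer

class CrossAttnBlock(nn.Module):
    """A DiT-style transformer block conditioned on text via cross-attention."""
    def __init__(self, dim, ctx_dim, n_heads=8):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.self_attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        # Cross-attention: image tokens (queries) attend to CLIP text tokens.
        self.cross_attn = nn.MultiheadAttention(
            dim, n_heads, kdim=ctx_dim, vdim=ctx_dim, batch_first=True)
        self.norm3 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x, ctx):
        h = self.norm1(x)
        x = x + self.self_attn(h, h, h)[0]
        h = self.norm2(x)
        x = x + self.cross_attn(h, ctx, ctx)[0]  # condition on the prompt
        return x + self.mlp(self.norm3(x))

# Encode one prompt per image with a frozen CLIP text encoder.
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained(
    "openai/clip-vit-large-patch14").eval()

prompts = ["a photo of a red barn", "a sketch of a cat"]
tokens = tokenizer(prompts, padding="max_length", truncation=True,
                   max_length=77, return_tensors="pt")
with torch.no_grad():
    ctx = text_encoder(**tokens).last_hidden_state  # (B, 77, 768)

block = CrossAttnBlock(dim=1152, ctx_dim=768)
x = torch.randn(2, 256, 1152)  # (B, num_image_tokens, hidden_dim)
out = block(x, ctx)            # same shape as x
```

In the training loop, the label-embedding lookup would be replaced by these per-image text embeddings, which are then passed to every block; randomly dropping prompts (e.g. swapping in an empty string) would preserve classifier-free guidance. The exact integration points depend on the repo's code.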
Thank you for your reply. I understand the first part about using CLIP embeddings, but could you clarify how to change the condition in the Masked Diffusion Transformer code? Specifically, what modifications should I make to integrate the CLIP embeddings with the cross-attention mechanism during training?