Recent fine-tuning techniques for text-to-image (T2I) personalization still struggle to distill visual concepts from reference images when each reference image contains both image-wide and spatially localized concepts.
The techniques in the MagicTailor paper are designed to improve component-controllable personalization, a novel task that lets users reconfigure specific components when personalizing visual concepts. The task is challenging because of two obstacles: semantic pollution, where unwanted visual elements corrupt spatially localized concepts, and semantic imbalance, where the custom image-wide concept and the spatially localized component concepts are learned disproportionately.
To overcome these challenges, MagicTailor leverages Dynamic Masked Degradation (DM-Deg) to dynamically perturb undesired visual semantics and Dual-Stream Balancing (DS-Bal) to establish a balanced learning paradigm for desired visual semantics.
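As a rough illustration of the masked-degradation idea only, the sketch below adds Gaussian noise outside a binary mask and lowers the noise strength as training progresses. It is not the MagicTailor implementation: the function name `masked_degradation`, the linear decay schedule, and the `max_strength` parameter are all assumptions made for this example, and the actual DM-Deg formulation should be taken from the paper and code linked in this issue.

```python
import numpy as np

def masked_degradation(image, mask, step, total_steps, max_strength=1.0, rng=None):
    """Perturb pixels outside `mask` with Gaussian noise (hypothetical sketch).

    image: float array of shape (H, W, C)
    mask:  array of shape (H, W) or (H, W, C); 1 marks the region to keep clean
    """
    rng = np.random.default_rng() if rng is None else rng

    # Assumed schedule: noise strength decays linearly from max_strength to 0.
    strength = max_strength * (1.0 - step / total_steps)

    keep = mask.astype(image.dtype)
    if keep.ndim == image.ndim - 1:
        keep = keep[..., None]  # broadcast an (H, W) mask over channels

    noise = rng.normal(0.0, 1.0, size=image.shape)

    # Only out-of-mask pixels receive noise; in-mask pixels are unchanged.
    return image + strength * (1.0 - keep) * noise
```

With `step=0` the out-of-mask region is perturbed at full strength, and with `step == total_steps` the image is returned unchanged; the in-mask region is never modified.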
Open source status
The model implementation is available.
The model weights are available (Only relevant if addition is not a scheduler).
Provide useful links for the implementation
Paper: https://arxiv.org/pdf/2410.13370
Project Website: https://correr-zhou.github.io/MagicTailor/
Code: https://github.com/correr-zhou/MagicTailor
Contact: @Correr-Zhou