enable Qwen image Layered on Gaudi. #383
device = 'hpu'
model_path = '/mnt/ceph1/libo/hf_models/Qwen-Image-Layered/'
Change the path to something generic like /data/Qwen-Image-Layered, or use the default Hugging Face model ID.
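One way to avoid a user-specific path, as a minimal sketch: resolve the model location from an explicit argument, then an environment variable, then a default model ID. The helper name `resolve_model_path`, the environment variable `QWEN_IMAGE_LAYERED_PATH`, and the ID `Qwen/Qwen-Image-Layered` are assumptions for illustration and are not taken from this PR.

```python
import os

# Assumed default Hugging Face model ID; verify it before relying on it.
DEFAULT_MODEL_ID = "Qwen/Qwen-Image-Layered"


def resolve_model_path(cli_value=None, env_var="QWEN_IMAGE_LAYERED_PATH"):
    """Return the model location without hard-coding a personal path.

    Priority: explicit argument, then environment variable, then default ID.
    """
    return cli_value or os.environ.get(env_var) or DEFAULT_MODEL_ID


# Example: a local checkout such as /data/Qwen-Image-Layered wins over the default.
model_path = resolve_model_path("/data/Qwen-Image-Layered")
```

With this shape the test script still runs on a machine that has no local copy, because it falls back to the default ID.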
from transformers import Qwen2_5_VLForConditionalGeneration, Qwen2Tokenizer, Qwen2VLProcessor
from optimum.habana.transformers.models import GaudiQwen2_5_VLForConditionalGeneration
from optimum.habana.diffusers.models.qwenimage_transformer import QwenImageTransformer2DModelGaudi,QwenImageTransformerBlockForwardGaudi
Add a space after the comma between QwenImageTransformer2DModelGaudi and QwenImageTransformerBlockForwardGaudi.
if use_hpu_graphs:
    logger.warning(
        "WARNING:!!!GaudiQwenImageLayeredPipeline HPU graph mode may have OOM problem when image size changes. Please set use_hpu_graphs=False!!!"
    )
Set use_hpu_graphs=False here to avoid the OOM issue.
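A minimal sketch of that suggestion: keep the warning, but also override the flag so the pipeline never enters HPU graph mode. The helper name `resolve_use_hpu_graphs` is hypothetical, and how the flag is actually stored on the pipeline is not shown in this hunk.

```python
import logging

logger = logging.getLogger(__name__)


def resolve_use_hpu_graphs(use_hpu_graphs: bool) -> bool:
    """Warn and force HPU graph mode off instead of only warning."""
    if use_hpu_graphs:
        logger.warning(
            "GaudiQwenImageLayeredPipeline: HPU graph mode may run out of memory "
            "when the image size changes; forcing use_hpu_graphs=False."
        )
    # Always return False: graph mode stays disabled for this pipeline.
    return False
```

In the constructor this could be used as `use_hpu_graphs = resolve_use_hpu_graphs(use_hpu_graphs)` before the value is passed on.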
self.transformer.hidden_states_buckets_step = hidden_states_buckets_step
self.transformer.encoder_hidden_states_buckets_step = encoder_hidden_states_buckets_step
#self.to(self._device)
from diffusers.utils.torch_utils import randn_tensor
from diffusers.pipelines.qwenimage import QwenImagePipelineOutput
from diffusers.models.autoencoders.autoencoder_kl_qwenimage import QwenImageAttentionBlock
from diffusers.pipelines.qwenimage.pipeline_qwenimage_layered import QwenImageLayeredPipeline,calculate_dimensions,retrieve_timesteps
    theta=10000, axes_dim=list(config["axes_dims_rope"]), scale_rope=True
)
self.vae_decode_latents_buckets = [128,160,188]
Please test different image sizes and check whether these default bucket values are suitable for this model.
    f"vae_decode_latents_buckets is {self.vae_decode_latents_buckets}."
)
self.vae_encode_buckets = [1024,1280,1504]
Please test different image sizes and check whether these default bucket values are suitable for this model.
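A quick way to sanity-check the defaults before running on hardware, as a rough sketch: map each candidate size to the smallest bucket that fits it and report the padding overhead. The round-up rule is an assumption about how the buckets are consumed, not something confirmed by this PR, and the helper names and sample sizes are hypothetical. The bucket lists are the defaults quoted above; note that 128, 160, 188 equal 1024, 1280, 1504 divided by 8.

```python
from bisect import bisect_left

VAE_ENCODE_BUCKETS = [1024, 1280, 1504]        # defaults quoted from the PR
VAE_DECODE_LATENTS_BUCKETS = [128, 160, 188]   # defaults quoted from the PR


def pick_bucket(value, buckets):
    """Return the smallest bucket >= value, or None if value exceeds all buckets."""
    ordered = sorted(buckets)
    idx = bisect_left(ordered, value)
    return ordered[idx] if idx < len(ordered) else None


def bucket_report(values, buckets):
    """Map each value to (bucket, padding), or None when no bucket fits."""
    report = {}
    for value in values:
        bucket = pick_bucket(value, buckets)
        report[value] = None if bucket is None else (bucket, bucket - value)
    return report


# Hypothetical sizes to probe; replace with the sizes the model is expected to serve.
for size, result in bucket_report([640, 1024, 1100, 1600], VAE_ENCODE_BUCKETS).items():
    print(size, result)
```

Sizes that map to `None` would fall outside every bucket, and sizes with large padding hint that an extra, smaller bucket may be worth adding.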
layer.forwward = types.MethodType(QwenImageAttentionBlockForwardGaudi, layer)
config = self.transformer.config
self.transformer.pos_embed = GaudiQwenEmbedRope(
Be careful with the new PR: not everything from the old PR can be applied. According to the Qwen-Image-Layered Diffusers PR, it uses QwenEmbedLayer3DRope, not QwenEmbedRope. Please also check the other places to make sure everything has been migrated.
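A small guard can make this kind of migration mismatch visible in a test, sketched below. The three classes are empty stand-ins named after the classes mentioned here, not the real Diffusers or optimum-habana implementations, and `uses_rope_class` is a hypothetical helper.

```python
# Stand-ins for illustration only; the real classes live in the libraries.
class QwenEmbedRope:
    pass


class QwenEmbedLayer3DRope:
    pass


class GaudiQwenEmbedRope(QwenEmbedRope):
    pass


def uses_rope_class(module, expected_name):
    """True if the module's class, or any base class, has the expected name."""
    return any(cls.__name__ == expected_name for cls in type(module).__mro__)


pos_embed = GaudiQwenEmbedRope()
# The old-style wrapper derives from QwenEmbedRope, so the layered check fails.
print(uses_rope_class(pos_embed, "QwenEmbedLayer3DRope"))
```

Running the same check against the real `self.transformer.pos_embed` after patching would flag any place where the old rope class was carried over.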
No description provided.