[Model] Adapt Wan2.2-I2V-A14B via LightX2V offline conversion path #2134
Force-pushed 23637f6 → 7c91a46
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 23637f6a10
```python
if path.is_file():
    return WeightSpec(kind="single", single_file=path)
```
Validate file inputs before treating as single checkpoint
`_resolve_weight_spec` classifies any existing file as a single weight file, so passing a sharded index file like `diffusion_pytorch_model.safetensors.index.json` (which is a plausible "checkpoint file" per the CLI help) is accepted as `kind="single"`. In that path, the assembler copies only the index JSON, skips shard validation, and `_validate_output` still reports success, producing an output directory that is missing the actual tensor shards and cannot be loaded. Please special-case `*.index.json` as sharded input (with shard checks) or reject non-weight file extensions.
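A hedged sketch of the suggested special-casing (the `WeightSpec` fields beyond `kind`/`single_file` are illustrative, not the tool's actual schema):

```python
import json
from dataclasses import dataclass, field
from pathlib import Path

WEIGHT_EXTS = {".safetensors", ".bin", ".pt", ".pth"}

@dataclass
class WeightSpec:  # simplified stand-in for the tool's real WeightSpec
    kind: str
    single_file: Path | None = None
    index_file: Path | None = None
    shard_files: list[str] = field(default_factory=list)

def resolve_file(path: Path) -> WeightSpec:
    if path.name.endswith(".index.json"):
        # Sharded checkpoint: verify every shard the index maps to exists.
        weight_map = json.loads(path.read_text())["weight_map"]
        shards = sorted(set(weight_map.values()))
        missing = [s for s in shards if not (path.parent / s).is_file()]
        if missing:
            raise FileNotFoundError(f"Index references missing shards: {missing}")
        return WeightSpec(kind="sharded", index_file=path, shard_files=shards)
    if path.suffix not in WEIGHT_EXTS:
        raise ValueError(f"Not a recognized weight file: {path.name}")
    return WeightSpec(kind="single", single_file=path)
```

The `weight_map` key is the standard Hugging Face sharded-index format, which maps each tensor name to the shard file that contains it.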
Thanks for your contribution. Could you also attach the generated video result for this PR?

Can you provide visual output and relevant metrics compared with the existing Wan2.2-I2V-A14B solution?
```diff
@@ -0,0 +1,140 @@
+# Wan2.2 I2V LightX2V Conversion
```
Can you put this file under the img2video section instead of creating a standalone page?
@lishunyang12 Oh, moving it to `image_to_video/README.md` leads to minor problems in CI. Let's find a better place for it.
Hi, the doc for your model example should be placed under `examples/offline_inference/image_to_video/`; then run `mkdocs serve` (it will auto-generate `docs/user_guide/examples/offline_inference/wan22_i2v_lightx2v_conversion.md` for you).
You can also check whether the layout of this page is acceptable on the local mkdocs server; the address is http://127.0.0.1:8000/vllm-omni/.
Thank you. I've updated this according to your suggestion, but I'm not sure I've done it right. Please check.
Force-pushed cbb4bb9 → c360806
Wan2.2-I2V-A14B-Diffusers:
- i2v_output-diffuser325.mp4

Wan2.2-I2V-A14B-LightX2V-Diffusers (Wan2.2-I2V-A14B-Lightning + Wan2.2-Distill-Loras):
- i2v_output-lightning-lora.mp4

Wan2.2-I2V-A14B-LightX2V-Diffusers (Wan2.2-I2V-A14B + Wan2.2-Distill-Loras):
- i2v_output-base-lora.mp4
Force-pushed 11d4f43 → 1b7274c
Force-pushed 1b7274c → 80c8a18
LGTM
- Base model: `Wan-AI/Wan2.2-I2V-A14B`
- Diffusers skeleton: `Wan-AI/Wan2.2-I2V-A14B-Diffusers`
- LoRA weights: `lightx2v/Wan2.2-Distill-Loras`
- LightX2V converter: `tools/convert/converter.py`
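For reference, a hedged sketch of fetching these inputs ahead of conversion with `huggingface_hub` (the local destination paths are illustrative, not mandated by the tooling):

```python
from huggingface_hub import snapshot_download

# Hypothetical local layout matching the paths used later in the Test Plan.
snapshot_download("Wan-AI/Wan2.2-I2V-A14B", local_dir="/home/xx/Wan-AI/Wan2.2-I2V-A14B")
snapshot_download("Wan-AI/Wan2.2-I2V-A14B-Diffusers", local_dir="/home/xx/Wan-AI/Wan2.2-I2V-A14B-Diffusers")
snapshot_download("lightx2v/Wan2.2-Distill-Loras", local_dir="/home/xx/Wan-AI/Wan2.2-Distill-Loras")
```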
```diff
@@ -0,0 +1,339 @@
+#!/usr/bin/env python3
```
This tool seems to only work with LoRA weights. Is it possible to decouple the LoRA part and make it work for both LoRA and non-LoRA inputs?
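One hedged sketch of such a decoupling (key names follow a common `lora_down`/`lora_up` convention; the actual distill checkpoints may use other names, and the alpha/rank scaling is omitted): treat the LoRA merge as an optional post-step, so the base conversion runs identically with or without `--lora_path`.

```python
import torch

def merge_lora(base: dict, lora: dict, scale: float = 1.0) -> dict:
    """Fold LoRA pairs into base weights: W += scale * (up @ down)."""
    merged = dict(base)
    for down_key, down in lora.items():
        if not down_key.endswith(".lora_down.weight"):
            continue
        up_key = down_key.replace(".lora_down.", ".lora_up.")
        target = down_key.replace(".lora_down.weight", ".weight")
        if up_key in lora and target in merged:
            delta = lora[up_key].float() @ down.float()
            merged[target] = merged[target] + scale * delta.to(merged[target].dtype)
    return merged

def convert(source_sd: dict, lora_sd: dict | None = None) -> dict:
    converted = source_sd  # base key conversion would happen here
    if lora_sd is not None:  # LoRA is optional: skipped entirely when absent
        converted = merge_lora(converted, lora_sd)
    return converted
```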
Use a single `assemble_wan22_i2v_diffusers.py` tool that supports both new and legacy LightX2V argument names, and update README guidance to avoid referencing converter files in this repo.
Force-pushed db6c171 → a838e33
@gcanlin PTAL
This PR modified the file

Agree. We need it.
@wtomin @SamitHuang PTAL
Force-pushed c3f5635 → 2bd60c8
Force-pushed 2bd60c8 → 4632284
Conflicts resolved: `vllm_omni/engine/async_omni_engine.py`
@gcanlin PTAL

Purpose
Adapt `Wan-AI/Wan2.2-I2V-A14B` to the existing vLLM-Omni Diffusers runtime path via an offline conversion workflow, instead of extending the runtime protocol for online dual-LoRA loading. This addresses the integration path discussed in #2093.
What this PR changes
Add offline assembly helper for Wan2.2 LightX2V outputs:

- `tools/wan22/assemble_lightx2v_wan22_i2v_diffusers.py`
- handles sharded checkpoints (`*.index.json` + shards)

Add Wan2.2 loader compatibility for converted checkpoint key variants (see the sketch after this list):

- `vllm_omni/diffusion/models/wan2_2/wan2_2_transformer.py`
- `blocks.N.modulation` -> `blocks.N.scale_shift_table`

Add Wan2.2 sampling controls needed by LightX2V-distilled setups:

- `sample_solver` switch (unipc/euler)
- `flow_shift` support
- `boundary_ratio` plumbing in the default diffusion stage path
- Euler scheduler (`vllm_omni/diffusion/models/wan2_2/scheduling_wan_euler.py`)
- `vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2.py`
- `vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_i2v.py`
- `vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_ti2v.py`
- `vllm_omni/engine/async_omni_engine.py`
- `examples/offline_inference/image_to_video/image_to_video.py`

Add user guide and nav entry:

- `docs/user_guide/examples/offline_inference/wan22_i2v_lightx2v_conversion.md`
- `docs/.nav.yml`

Include a docs build compatibility fix that is currently part of this branch:

- `mkdocs.yml` inventory URL adjustment
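A minimal sketch of the key-compat remapping named in the loader item above (a hypothetical standalone helper; in the PR the logic lives inside `wan2_2_transformer.py`'s weight loading):

```python
import re

# Converted LightX2V checkpoints name the per-block modulation table
# `blocks.N.modulation`, while the Diffusers-style transformer expects
# `blocks.N.scale_shift_table`.
_MODULATION_RE = re.compile(r"^(blocks\.\d+)\.modulation\b")

def remap_wan22_keys(state_dict):
    """Return a copy of state_dict with converted-checkpoint key variants
    renamed to the names the Wan2.2 transformer loader expects."""
    return {_MODULATION_RE.sub(r"\1.scale_shift_table", k): v
            for k, v in state_dict.items()}
```

For example, `remap_wan22_keys({"blocks.0.modulation": t})` yields a dict keyed by `blocks.0.scale_shift_table`, while unrelated keys pass through unchanged.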
Why this approach

Euler note for LightX2V Distill
For `Wan2.2-I2V-A14B` + `Wan2.2-Distill-Loras` (4-step distilled LoRAs), `sample_solver=euler` is important for quality stability. In practice on this setup:

- `num_inference_steps=4`
- `sample_solver=euler`
- `flow_shift=12` (for 480p)
- `guidance_scale=1.0`
- `guidance_scale_high=1.0`
- `boundary_ratio=0.875`

produces significantly better visual quality than the previous default sampling setup.
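For intuition, here is a hedged sketch of how these knobs interact in a flow-matching Euler loop (it mirrors the common formulation used by Diffusers-style schedulers, not the exact code added in this PR): `flow_shift` warps the sigma schedule toward the high-noise end, and `boundary_ratio` picks which expert handles each step.

```python
import torch

def shifted_sigmas(num_steps: int, flow_shift: float) -> torch.Tensor:
    # Linearly spaced sigmas in (0, 1], warped toward high noise:
    # sigma' = shift * sigma / (1 + (shift - 1) * sigma)
    sigmas = torch.linspace(1.0, 1.0 / num_steps, num_steps)
    return flow_shift * sigmas / (1 + (flow_shift - 1) * sigmas)

def euler_sample(latents, high_model, low_model, num_steps=4,
                 flow_shift=12.0, boundary_ratio=0.875, num_train_timesteps=1000):
    sigmas = shifted_sigmas(num_steps, flow_shift)
    sigmas = torch.cat([sigmas, sigmas.new_zeros(1)])
    boundary = boundary_ratio * num_train_timesteps
    for i in range(num_steps):
        t = sigmas[i] * num_train_timesteps
        # High-noise expert handles early (large-t) steps, low-noise the rest.
        model = high_model if t >= boundary else low_model
        velocity = model(latents, t)
        # Flow-matching Euler update: x <- x + (sigma_next - sigma) * v
        latents = latents + (sigmas[i + 1] - sigmas[i]) * velocity
    return latents
```

With the values above (4 steps, `flow_shift=12`, `boundary_ratio=0.875`), the shifted timesteps come out near 1000, 973, 923, 800, so the first three steps fall above the 875 boundary and use the high-noise expert, and only the last step uses the low-noise one.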
Test Plan
```bash
python converter.py --source /home/xx/Wan-AI/Wan2.2-I2V-A14B/high_noise_model \
  --output /home/xx/Wan-AI/wan22_lightx2v/high_noise_out \
  --output_ext .safetensors --output_name diffusion_pytorch_model \
  --model_type wan_dit --direction forward \
  --lora_path /home/xx/Wan-AI/Wan2.2-Distill-Loras/wan2.2_i2v_A14b_high_noise_lora_rank64_lightx2v_4step_1022.safetensors \
  --lora_key_convert auto --single_file --device cpu

python converter.py --source /home/xx/Wan-AI/Wan2.2-I2V-A14B/low_noise_model \
  --output /home/xx/Wan-AI/wan22_lightx2v/low_noise_out \
  --output_ext .safetensors --output_name diffusion_pytorch_model \
  --model_type wan_dit --direction forward \
  --lora_path /home/xx/Wan-AI/Wan2.2-Distill-Loras/wan2.2_i2v_A14b_low_noise_lora_rank64_lightx2v_4step_1022.safetensors \
  --lora_key_convert auto --single_file --device cpu

python tools/wan22/assemble_lightx2v_wan22_i2v_diffusers.py \
  --diffusers-skeleton /home/xx/Wan-AI/Wan2.2-I2V-A14B-Diffusers \
  --high-noise-weight /home/xx/Wan-AI/wan22_lightx2v/high_noise_out \
  --low-noise-weight /home/xx/Wan-AI/wan22_lightx2v/low_noise_out \
  --output-dir /home/xx/Wan-AI/Wan2.2-I2V-A14B-LightX2V-Diffusers

python examples/offline_inference/image_to_video/image_to_video.py \
  --model /home/xx/Wan-AI/Wan2.2-I2V-A14B-LightX2V-Diffusers \
  --image /home/xx/vllm_public_assets/images.jpg \
  --prompt "A cat playing with yarn" \
  --num-frames 81 --num-inference-steps 4 --tensor-parallel-size 4 \
  --height 480 --width 832 --flow-shift 12 --sample-solver euler \
  --guidance-scale 1.0 --guidance-scale-high 1.0 --boundary-ratio 0.875
```
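As an extra sanity check before inference (a hedged sketch, not part of the PR; it assumes the standard Wan2.2 Diffusers layout with `transformer`/`transformer_2` subfolders and either sharded or single-file safetensors), the assembled directory can be verified to open cleanly:

```python
import json
from pathlib import Path
from safetensors import safe_open

root = Path("/home/xx/Wan-AI/Wan2.2-I2V-A14B-LightX2V-Diffusers")
for sub in ("transformer", "transformer_2"):  # standard Wan2.2 Diffusers layout
    folder = root / sub
    index = folder / "diffusion_pytorch_model.safetensors.index.json"
    if index.is_file():
        # Sharded layout: every shard named in the index must exist and open.
        shards = sorted(set(json.loads(index.read_text())["weight_map"].values()))
    else:
        shards = ["diffusion_pytorch_model.safetensors"]  # single-file layout
    for shard in shards:
        with safe_open(folder / shard, framework="pt") as f:
            assert list(f.keys()), f"{sub}/{shard} contains no tensors"
    print(f"{sub}: OK ({len(shards)} file(s))")
```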
Test Result

- Model: `/home/xx/Wan-AI/Wan2.2-I2V-A14B-LightX2V-Diffusers`
- Output: `i2v_output.mp4`
- Total time: 38.78s (4 steps, around 4.70s/step)
- GPU memory: 38.06 GB reserved, 33.94 GB allocated

Raw execution log:
Notes