[codex][release/v0.18.0.post1] revert Wan2.2 pipeline changes from #2878 #2937

Merged
david6666666 merged 1 commit into vllm-project:release/v0.18.0.post1 from david6666666:codex/revert-pr2878-wan22-release-v0180p1
Apr 20, 2026

@david6666666
Collaborator

Summary

  • revert the Wan2.2 pipeline behavior that was backported through #2878
  • keep the later #2854 release-branch optimization in pipeline_wan2_2_i2v.py
  • drop the Wan2.2 max-sequence regression test that only covered the reverted behavior

What Changed

  • restore pipeline_wan2_2.py, pipeline_wan2_2_i2v.py, and pipeline_wan2_2_ti2v.py to the pre-#2878 prompt handling behavior
  • keep the later image preprocess / mask cleanup from 8a1ff4e9 in pipeline_wan2_2_i2v.py
  • remove the now-obsolete Wan2.2 max-sequence test added by #2878
  • keep the repo-wide formatting / unused-import cleanup required by pre-commit

Validation

  • python -m py_compile vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2.py vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_i2v.py vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_ti2v.py
  • pre-commit run --all-files
  • E2E serve + /v1/videos re-validation on 4x H20 with the same local environment style used in earlier #2878 comments
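The byte-compile step in the validation list can also be driven from Python. The following is a minimal sketch using the standard-library `py_compile` module; the helper name `compile_check` is hypothetical and not part of the repository, and the paths are the three files named in the validation command.

```python
import py_compile
from pathlib import Path

# The three pipeline files named in the validation step.
WAN22_PIPELINES = [
    "vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2.py",
    "vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_i2v.py",
    "vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_ti2v.py",
]


def compile_check(paths):
    """Byte-compile each path and return {path: error message} for failures.

    Hypothetical helper for illustration; not part of the repository.
    """
    failures = {}
    for path in paths:
        if not Path(path).is_file():
            failures[path] = "file not found"
            continue
        try:
            # doraise=True raises PyCompileError instead of printing to stderr.
            py_compile.compile(path, doraise=True)
        except py_compile.PyCompileError as exc:
            failures[path] = exc.msg
    return failures


if __name__ == "__main__":
    # Run from the repository root, like the original command.
    for path, message in compile_check(WAN22_PIPELINES).items():
        print(f"FAILED {path}: {message}")
```

An empty result means every file compiled. Like the original command, this only catches syntax errors, not import or runtime errors.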

E2E Result

Fixed setup for both runs:

  • model snapshot: /mnt/data1/huggingface/hub/models--Wan-AI--Wan2.2-I2V-A14B-Diffusers/snapshots/596658fd9ca6b7b71d5057529bbf319ecbc61d74
  • CUDA_VISIBLE_DEVICES=4,5,6,7
  • --omni --enable-diffusion-pipeline-profiler --ulysses-degree 4
  • prompt: A white rabbit standing on a wooden table, then slowly turning its head and hopping forward with smooth motion.
  • size=1280x720, seconds=5, fps=16, num_frames=81, num_inference_steps=4, guidance_scale=3.5, guidance_scale_2=3.5, boundary_ratio=0.875, flow_shift=5.0, seed=42, frame interpolation disabled
  • fixed input image: /mnt/data4/cwq/tmp/rabbit_real.png

Measured comparison against current release/v0.18.0.post1 head:

  • release head: server_inference_time_s=113.73233714140952, artifact_ready_wall_s=114.78
  • this revert branch: server_inference_time_s=113.81139127537608, artifact_ready_wall_s=114.856
  • text_encoder.forward total from profiler log: 0.047962s -> 0.120787s
  • DiT step wall time from tqdm: 23.42s/it -> 23.50s/it

Interpretation:

  • reverting the Wan2.2 prompt-length backport brings text_encoder.forward back to the earlier higher-cost path
  • overall E2E latency stays nearly flat for this 4-step request, but the revert is slightly slower rather than faster (server_inference_time_s rises by about 0.08 s, roughly 0.07%)
  • both outputs keep the same video metadata: 1280x720, 81 frames, 16 fps, 5.0625s
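The size of these changes can be checked with a few lines of arithmetic. The sketch below uses only the figures reported in the measured comparison; the dictionary keys and the `delta` helper are illustrative names, not identifiers from the profiler or the repository.

```python
# Values copied from the measured comparison in this description.
release_head = {
    "server_inference_time_s": 113.73233714140952,
    "artifact_ready_wall_s": 114.78,
    "text_encoder_forward_s": 0.047962,
}
revert_branch = {
    "server_inference_time_s": 113.81139127537608,
    "artifact_ready_wall_s": 114.856,
    "text_encoder_forward_s": 0.120787,
}


def delta(before, after):
    """Return (absolute change, relative change in percent)."""
    diff = after - before
    return diff, 100.0 * diff / before


for key in release_head:
    diff, pct = delta(release_head[key], revert_branch[key])
    print(f"{key}: {diff:+.6f} s ({pct:+.2f}%)")

# 81 frames at 16 fps gives the reported clip duration of 5.0625 s.
duration_s = 81 / 16
print(f"duration: {duration_s} s")
```

The text-encoder increase (about 0.07 s) is of the same order as the change in server inference time (about 0.08 s). A single run per branch cannot separate that from run-to-run noise.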

Signed-off-by: david6666666 <530634352@qq.com>
Collaborator Author

david6666666 commented Apr 20, 2026

Supplementary E2E re-validation for this revert.

Setup stayed fixed across both runs:

  • model snapshot: /mnt/data1/huggingface/hub/models--Wan-AI--Wan2.2-I2V-A14B-Diffusers/snapshots/596658fd9ca6b7b71d5057529bbf319ecbc61d74
  • CUDA_VISIBLE_DEVICES=4,5,6,7
  • --omni --enable-diffusion-pipeline-profiler --ulysses-degree 4
  • prompt: A white rabbit standing on a wooden table, then slowly turning its head and hopping forward with smooth motion.
  • size=1280x720, seconds=5, fps=16, num_frames=81, num_inference_steps=4, guidance_scale=3.5, guidance_scale_2=3.5, boundary_ratio=0.875, flow_shift=5.0, seed=42, frame interpolation disabled
  • fixed input image: /mnt/data4/cwq/tmp/rabbit_real.png

Measured comparison vs current release/v0.18.0.post1 head:

  • release head: server_inference_time_s=113.73233714140952, artifact_ready_wall_s=114.78
  • revert branch: server_inference_time_s=113.81139127537608, artifact_ready_wall_s=114.856
  • text_encoder.forward total: 0.047962s -> 0.120787s
  • pipeline.forward: 111.590084s -> 111.948706s
  • DiT step tqdm wall time: 23.42s/it -> 23.50s/it
  • peak reserved GPU memory: 88.29GB -> 88.30GB

Conclusion:

  • this revert restores the earlier Wan2.2 prompt path, so text_encoder.forward regresses as expected
  • end-to-end latency for this short 4-step request stays nearly flat, but the revert is slightly slower rather than faster (pipeline.forward rises by about 0.36 s, roughly 0.3%)
  • both outputs still decode as 1280x720 / 81 frames / 16 fps
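One way to see why the text-encoder regression barely moves end-to-end latency is to express text_encoder.forward as a share of pipeline.forward. The sketch below uses only the profiler totals reported in this comment; the variable and function names are illustrative.

```python
# Profiler totals copied from the measured comparison in this comment.
PIPELINE_FORWARD_S = {"release_head": 111.590084, "revert_branch": 111.948706}
TEXT_ENCODER_FORWARD_S = {"release_head": 0.047962, "revert_branch": 0.120787}


def share_percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


shares = {
    name: share_percent(TEXT_ENCODER_FORWARD_S[name], PIPELINE_FORWARD_S[name])
    for name in PIPELINE_FORWARD_S
}

for name, pct in shares.items():
    print(f"{name}: text_encoder.forward is {pct:.3f}% of pipeline.forward")
```

In both runs the text encoder accounts for well under 1% of pipeline.forward, so even a 2.5x change in its cost is small next to the DiT steps for this request.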

Additional local accuracy validation on worktree/issue2874-wan22-max-seq/tests/e2e/accuracy/wan22_i2v:

  • command:
    • cd /mnt/data4/cwq/worktree/issue2874-wan22-max-seq
    • PYTHONPATH=/mnt/data4/cwq/worktree/issue2874-wan22-max-seq /mnt/data4/cwq/.venv/bin/python -m pytest -q tests/e2e/accuracy/wan22_i2v -s
  • result: 16 passed in 3047.54s (0:50:47)
  • similarity metrics:
    • SSIM=0.967964 (threshold >= 0.94)
    • PSNR=37.894881 dB (threshold >= 28.0 dB)
  • online serving case:
    • usp=2, hsdp-shard-size=2
    • online_video_e2e_latency_s=833.868
  • artifact paths:
    • tests/e2e/accuracy/wan22_i2v/result/rabbit-cf925a4c/online.mp4
    • tests/e2e/accuracy/wan22_i2v/result/rabbit-cf925a4c/offline.mp4
    • tests/e2e/accuracy/wan22_i2v/result/rabbit-cf925a4c/offline_metadata.json
  • notes:
    • pytest completed successfully despite a non-fatal resource_tracker warning during server shutdown
    • the run also emitted a vLLM-Omni / vLLM major-minor mismatch warning, but the full accuracy suite still passed
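For readers unfamiliar with the similarity metrics, the sketch below shows the standard PSNR definition and a threshold check against the values reported above. It is an assumption-laden illustration: the accuracy suite's own metric code, frame sampling, and pixel range are not shown in this thread, and `psnr_db` and `passes_thresholds` are hypothetical names.

```python
import math

# Thresholds as reported in the accuracy results.
SSIM_THRESHOLD = 0.94
PSNR_THRESHOLD_DB = 28.0


def psnr_db(reference, candidate, max_value=255.0):
    """Standard PSNR in dB between two equal-length sequences of pixel values.

    Assumes 8-bit pixel values by default; the real suite may differ.
    """
    if not reference or len(reference) != len(candidate):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


def passes_thresholds(ssim, psnr):
    """Return True when both metrics meet the reported thresholds."""
    return ssim >= SSIM_THRESHOLD and psnr >= PSNR_THRESHOLD_DB


# Reported values from this run.
print(passes_thresholds(0.967964, 37.894881))
```

Both reported metrics clear their thresholds, which is consistent with the suite passing.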

@david6666666 added the diffusion-x2v-test label (triggers the Buildkite x2video series of diffusion model tests in nightly CI) Apr 20, 2026
@david6666666 removed the diffusion-x2v-test label (triggers the Buildkite x2video series of diffusion model tests in nightly CI) Apr 20, 2026
@hsliuustc0106
Collaborator

Ready for full review once draft status is removed. A preliminary scan is available on request.

@david6666666 marked this pull request as ready for review April 20, 2026 11:08
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@david6666666
Collaborator Author

image

@david6666666 merged commit 89f733d into vllm-project:release/v0.18.0.post1 Apr 20, 2026
3 checks passed
Collaborator

@gcanlin gcanlin left a comment

LGTM

3 participants