
[Revert] drop Wan2.2 prompt-length enforcement from #2847 #2877

Merged
gcanlin merged 2 commits into vllm-project:main from david6666666:codex/issue-2874-wan22-max-seq on Apr 20, 2026

Conversation

@david6666666 (Collaborator) commented Apr 17, 2026

Summary

  • raise the default Wan2.2 max_sequence_length from 512 to 2048
  • apply the same default across Wan2.2 T2V, I2V, TI2V, and VACE pipelines
  • avoid padding short prompts and negative prompts to the configured ceiling during text encoding
  • add regression coverage for both prompt-length validation and actual-length text-encoder padding

Root cause

Issue #2874 is triggered because Wan2.2 pipelines defaulted max_sequence_length to 512, while the CI I2V prompt is 654 tokens long.

After raising the limit, encode_prompt() still used padding="max_length" with max_length=max_sequence_length, which padded short prompts to 2048 and introduced an unnecessary text-encoding slowdown for short-prompt requests.
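
For illustration, a minimal sketch of the difference, using a generic Hugging Face tokenizer (here `google/umt5-xxl` stands in for Wan2.2's UMT5 tokenizer); this mirrors the behavior described above, not the actual `encode_prompt()` code:

```python
# Illustrative only: padding="max_length" vs. padding="longest".
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/umt5-xxl")
prompt = "a cat playing piano in the rain"
max_sequence_length = 2048

# Before the fix: every prompt is padded to the configured ceiling, so the
# text encoder always runs over 2048 positions even for a handful of tokens.
before = tokenizer(prompt, padding="max_length", max_length=max_sequence_length,
                   truncation=True, return_tensors="pt")
print(before.input_ids.shape)  # torch.Size([1, 2048])

# After the fix: pad only to the longest prompt in the batch, while still
# truncating at (and validating against) max_sequence_length.
after = tokenizer(prompt, padding="longest", max_length=max_sequence_length,
                  truncation=True, return_tensors="pt")
print(after.input_ids.shape)  # e.g. torch.Size([1, 9])
```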

Validation

  • python -m py_compile vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2.py vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_i2v.py vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_ti2v.py tests/diffusion/models/wan2_2/test_wan22_max_sequence_length.py
  • python -m pytest --noconftest -q tests/diffusion/models/wan2_2/test_wan22_max_sequence_length.py
  • pre-commit run --all-files
  • Wan2.2 I2V E2E with the local snapshot at /mnt/data1/huggingface/hub/models--Wan-AI--Wan2.2-I2V-A14B-Diffusers/snapshots/596658fd9ca6b7b71d5057529bbf319ecbc61d74
    • short prompt request completed successfully
    • inference_time_s=114.31679305434227
    • profiler showed Wan22I2VPipeline.text_encoder.forward back at roughly 0.015s per short-prompt call

@chatgpt-codex-connector commented

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository-wide code reviews.

@david6666666 changed the title from "[Fix] align Wan2.2 max_sequence_length with model config" to "[Fix] raise Wan2.2 max_sequence_length to 2048" on Apr 17, 2026
@david6666666 added the ready label on Apr 17, 2026
@hsliuustc0106 (Collaborator) commented

BLOCKING ISSUE: This PR changes the default max_sequence_length from 512 to 2048, which is a breaking change that affects all Wan2.2 users.

Issue #2874 is a CI test failure. The appropriate fix is to either:

  1. Shorten the CI test prompt to fit within 512 tokens, OR
  2. Add explicit configuration (max_sequence_length=2048) to the test setup (sketched after this comment)

Changing the default value increases memory usage for ALL users just to fix a CI test. This is not the right approach.

Please reconsider the fix for #2874 without breaking existing user behavior.
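
For concreteness, option 2 might look like the sketch below. This is hypothetical: it uses the upstream diffusers Wan pipeline API, which exposes max_sequence_length per call; the actual vllm-omni test hook may wire this differently, and `long_ci_prompt` / `fixture.png` are placeholders.

```python
import torch
from diffusers import WanImageToVideoPipeline
from diffusers.utils import load_image

# Placeholders for the test fixtures; the real CI test supplies these.
long_ci_prompt = "..."  # stands in for the 654-token prompt from #2874
conditioning_image = load_image("fixture.png")

pipe = WanImageToVideoPipeline.from_pretrained(
    "Wan-AI/Wan2.2-I2V-A14B-Diffusers", torch_dtype=torch.bfloat16
)
video = pipe(
    prompt=long_ci_prompt,
    image=conditioning_image,
    max_sequence_length=2048,  # explicit per-call opt-in; the default stays 512
).frames[0]
```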

@hsliuustc0106 (Collaborator) commented

Raising max_sequence_length to 2048 is a 4x increase. This fixes the immediate CI failure (#2874), but consider:

  • Does this match Wan2.2's actual trained prompt length limit? If the model was trained on 512-token prompts, going to 2048 could produce degraded quality or unexpected behavior beyond the training distribution.

  • Memory impact: longer prompts mean more VRAM for text-encoder outputs. For I2V workflows where VRAM is already tight, this could cause OOMs on smaller GPUs (rough numbers sketched after this list).

  • For future PRs: consider documenting the model's native token limit in model metadata rather than hardcoding in the pipeline. This makes it easier to adjust per-deployment without code changes.
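
To put rough numbers on the memory point, a back-of-the-envelope estimate, assuming a UMT5-XXL-style text encoder (hidden size 4096) and fp16 activations; these are illustrative estimates, not measurements:

```python
# Embedding-buffer cost alone; the larger runtime cost is DiT cross-attention
# over 4x longer key/value sequences.
hidden_size = 4096
bytes_per_element = 2  # fp16

for seq_len in (512, 2048):
    # prompt + negative prompt embeddings for classifier-free guidance
    mib = 2 * seq_len * hidden_size * bytes_per_element / 2**20
    print(f"seq_len={seq_len}: ~{mib:.0f} MiB of prompt embeddings")
# seq_len=512:  ~8 MiB
# seq_len=2048: ~32 MiB
```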

@Gaohan123 (Collaborator) commented Apr 17, 2026

Plan to temporarily skip it with #2883 after this one is updated

@david6666666 changed the title from "[Fix] raise Wan2.2 max_sequence_length to 2048" to "[Fix] raise Wan2.2 max_sequence_length to 2048 without padding short prompts" on Apr 20, 2026
@david6666666 (Collaborator Author) commented

Follow-up update for the short-prompt regression introduced after raising Wan2.2 max_sequence_length.

  • Added commit b665b917 ([Fix] avoid padding short Wan2.2 prompts to max_sequence_length)
  • Updated Wan2.2 T2V, I2V, and TI2V so they still validate against the configured max_sequence_length but only pad text-encoder inputs to the longest prompt actually in the current batch (see the sketch after this list)
  • Added regression coverage in tests/diffusion/models/wan2_2/test_wan22_max_sequence_length.py
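
A minimal sketch of that padding strategy, with illustrative names (this is not the actual pipeline code): validate against the configured ceiling, then pad only to the batch's true maximum.

```python
from transformers import PreTrainedTokenizerBase

def encode_prompt_ids(tokenizer: PreTrainedTokenizerBase,
                      prompts: list[str],
                      max_sequence_length: int):
    """Validate against the configured ceiling, pad only to the batch max."""
    # Tokenize without padding first so the true lengths are visible.
    raw = tokenizer(prompts, padding=False, truncation=False)
    longest = max(len(ids) for ids in raw.input_ids)
    if longest > max_sequence_length:
        raise ValueError(
            f"longest prompt has {longest} tokens, exceeding "
            f"max_sequence_length={max_sequence_length}"
        )
    # Pad to the longest prompt actually present, not to the ceiling.
    batch = tokenizer(prompts, padding="longest",
                      max_length=max_sequence_length, truncation=True,
                      return_tensors="pt")
    return batch.input_ids, batch.attention_mask
```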

Validation run on this branch:

  • python -m py_compile ... passed
  • python -m pytest --noconftest -q tests/diffusion/models/wan2_2/test_wan22_max_sequence_length.py passed (16 passed)
  • pre-commit run --all-files passed
  • Wan2.2 I2V E2E with the short English prompt completed successfully
    • inference_time_s=114.31679305434227
    • profiler showed Wan22I2VPipeline.text_encoder.forward back at roughly 0.015s per short-prompt call

This keeps the larger prompt-length support for #2874 while removing the short-prompt text-encoding regression.

@david6666666 force-pushed the codex/issue-2874-wan22-max-seq branch from 6006c9b to 9abff4f on April 20, 2026 at 02:40
david6666666 added a commit that referenced this pull request Apr 20, 2026
 #2877 (#2878)

Signed-off-by: david6666666 <530634352@qq.com>
Signed-off-by: David Chen <530634352@qq.com>
Signed-off-by: WeiQing Chen <40507679+david6666666@users.noreply.github.com>
@david6666666 added the diffusion-x2v-test label on Apr 20, 2026
@david6666666 changed the title from "[Fix] raise Wan2.2 max_sequence_length to 2048 without padding short prompts" to "[Bugfix] raise Wan2.2 max_sequence_length to 2048 without padding short prompts" on Apr 20, 2026
@david6666666 (Collaborator Author) commented

@gcanlin @fan2956 the Wan2.2 I2V accuracy test failed

@Gaohan123 Gaohan123 added this to the v0.20.0 milestone Apr 20, 2026
Signed-off-by: david6666666 <530634352@qq.com>
@david6666666 force-pushed the codex/issue-2874-wan22-max-seq branch from b0972ef to 9528def on April 20, 2026 at 08:17
@david6666666 changed the title from "[Bugfix] raise Wan2.2 max_sequence_length to 2048 without padding short prompts" to "[Revert] drop Wan2.2 prompt-length enforcement from #2847" on Apr 20, 2026
@david6666666 (Collaborator Author) commented

Branch update:

  • Reset this PR branch onto the latest upstream/main
  • Reverted only the Wan2.2-related changes that came from #2847
  • Kept the shared vllm_omni/diffusion/utils/prompt_utils.py because it is now used by Qwen-Image paths as well

Current branch head:

  • 9528def2 [Revert] drop Wan2.2 prompt-length enforcement from #2847

Validation on this branch:

  • python -m py_compile for the reverted Wan2.2 pipeline files passed
  • python -m pytest --noconftest -q tests/diffusion/models/wan2_2/test_wan22_i2v_pipeline.py tests/diffusion/models/wan2_2/test_wan22_pipeline_diffuse.py tests/diffusion/models/wan2_2/test_wan22_pipeline_helpers.py tests/diffusion/models/wan2_2/test_wan22_ti2v_pipeline.py tests/diffusion/models/wan2_2/test_wan22_vace_pipeline.py
    • 16 passed
  • pre-commit run --all-files
    • Passed

So the branch now contains only the targeted Wan2.2 revert on top of the current main.

Signed-off-by: david6666666 <530634352@qq.com>
@gcanlin (Collaborator) left a comment


LGTM. Please also revert this change in v0.18.0.post1: 5be6ff5#diff-f53575bf88041d823a9739163c467042dace2b532fc93a825f9f3e89fc169315

@gcanlin gcanlin merged commit c2859e9 into vllm-project:main Apr 20, 2026
8 checks passed
nainiu258 pushed a commit to nainiu258/vllm-omni that referenced this pull request Apr 21, 2026
…vllm-project#2877)

Signed-off-by: david6666666 <530634352@qq.com>
Signed-off-by: nainiu258 <cperfect02@163.com>
qinganrice pushed a commit to qinganrice/vllm-omni that referenced this pull request Apr 23, 2026
lengrongfu pushed a commit to lengrongfu/vllm-omni that referenced this pull request May 1, 2026
clodaghwalsh17 pushed a commit to clodaghwalsh17/nm-vllm-omni-ent that referenced this pull request May 12, 2026

Labels

  • diffusion-x2v-test — label to trigger buildkite x2video series of diffusion models test in nightly CI
  • ready — label to trigger buildkite CI

Projects

None yet

4 participants