
[cherry-pick][release/v0.18.0.post1] cherry-pick #2847 #2780 #2840 #2876 #2877 #2878

Merged

david6666666 merged 35 commits into vllm-project:release/v0.18.0.post1 from david6666666:codex/release-v0.18.0.post1-pr2847-2780-2840-2876 on Apr 20, 2026

Conversation

@david6666666
Collaborator

Summary

Validation

  • python -m py_compile vllm_omni/engine/async_omni_engine.py tests/entrypoints/test_async_omni_diffusion_config.py tests/entrypoints/openai_api/test_image_server.py
  • python -m pytest -q tests/diffusion/models/qwen_image/test_qwen_image_max_sequence_length.py tests/diffusion/models/wan2_2/test_wan22_max_sequence_length.py
  • python -m pytest -q tests/diffusion/models/qwen_image/test_qwen_image_edit_plus.py
  • python -m pytest -q tests/entrypoints/openai_api/test_video_api_utils.py
  • python -m pytest -q tests/entrypoints/test_async_omni_diffusion_config.py tests/entrypoints/openai_api/test_image_server.py
  • pre-commit run --all-files
  • E2E validation for the cherry-picked image/video paths; detailed request/response evidence is posted in the PR comments

Notes

The pushed commits carry Signed-off-by: david6666666 <530634352@qq.com> trailers (one entry is signed off as David Chen <530634352@qq.com>) and were cherry-picked from the following upstream commits:

adda9a6, 281e14a, 66151f0, 1e8fa70, 0a6d618, bd9bfaf, 21851d6, 896b0b8, 0e2f009, eec0785, 72af603, f1900fe, c95d20c, 731c536, 8c857c3, 0c25a06, 826c74a, 297d06b, 4ea2271, f414061, 05a7a5d, f3e7ce9, 3015646, ecbb6d4
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@david6666666
Collaborator Author

Supplemental validation for the ordered backport onto release/v0.18.0.post1.

Cherry-pick order used:

  1. #2847
  2. #2780
  3. #2840
  4. #2876

Local test environment note:

  • PYTHONPATH=/mnt/data4/yipeng/vllm was used so the local vllm API matches this release branch during pytest and serving.

Static / unit validation:

  • python -m py_compile vllm_omni/engine/async_omni_engine.py tests/entrypoints/test_async_omni_diffusion_config.py tests/entrypoints/openai_api/test_image_server.py
    • Passed.
  • python -m pytest -q tests/diffusion/models/qwen_image/test_qwen_image_max_sequence_length.py tests/diffusion/models/wan2_2/test_wan22_max_sequence_length.py
    • 28 passed.
  • python -m pytest -q tests/diffusion/models/qwen_image/test_qwen_image_edit_plus.py
    • 1 passed.
  • python -m pytest -q tests/entrypoints/openai_api/test_video_api_utils.py
    • 6 passed.
  • python -m pytest -q tests/entrypoints/test_async_omni_diffusion_config.py tests/entrypoints/openai_api/test_image_server.py
    • 52 passed.
  • pre-commit run --all-files
    • Passed after hook-applied formatting cleanup.
    • Formatting-only follow-up commit on this branch: beeb333a ([Chore] run pre-commit formatting).

E2E for #2847:

  • Qwen-Image serve on local snapshot /mnt/data1/huggingface/hub/models--Qwen--Qwen-Image/snapshots/75e0b4be04f60ec59a75f475837eced720f823b6
    • short prompt request to /v1/images/generations: 200, decoded output size 512x512, decoded PNG bytes 842.
    • long prompt request (5000 x rabbit, i.e. the word "rabbit" repeated 5000 times): 500 with message:
      • `prompt` is too long after applying the Qwen prompt template: got 5000 tokens, but `max_sequence_length` is 1024
  • Qwen-Image-Edit serve on local snapshot /mnt/data1/huggingface/hub/models--Qwen--Qwen-Image-Edit/snapshots/ac7f9318f633fc4b5778c59367c8128225f1e3de
    • short prompt request to /v1/images/edits: 200, decoded output size 512x512, decoded PNG bytes 787271.
    • long prompt request (5000 x rabbit): 500 with message:
      • `prompt` is too long after applying the Qwen prompt template: got 5000 tokens, but `max_sequence_length` is 1024
  • Wan2.2-T2V-A14B-Diffusers serve on local snapshot /mnt/data1/huggingface/hub/models--Wan-AI--Wan2.2-T2V-A14B-Diffusers/snapshots/5be7df9619b54f4e2667b2755bc6a756675b5cd7
    • short /v1/videos request (num_inference_steps=1, num_frames=5): create 200, final status completed, inference_time_s=0.43019302003085613, output bytes 36362.
    • long prompt request (5000 x rabbit): create 200, final status failed, error includes:
      • `prompt` is too long for Wan2.2 text encoding: got 10001 tokens, but `max_sequence_length` is 512
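For reference, the image requests above were exercised with a small client along these lines. This is a minimal sketch, not the exact validation script: the base URL and the OpenAI-style request/response field names (`prompt`, `size`, `n`, `data[0].b64_json`) are assumptions.

```python
# Hypothetical client sketch for the /v1/images/generations checks above.
# Assumptions: server reachable at BASE, OpenAI-style request/response fields
# ("prompt", "size", "n", data[0].b64_json); adjust to the real server.
import base64
import io

import requests
from PIL import Image

BASE = "http://localhost:8000"


def generate(prompt: str, size: str = "512x512") -> requests.Response:
    return requests.post(
        f"{BASE}/v1/images/generations",
        json={"prompt": prompt, "size": size, "n": 1},
        timeout=600,
    )


# Short prompt: expect 200 and a decodable image of the requested size.
ok = generate("a white rabbit on a wooden table")
assert ok.status_code == 200
raw = base64.b64decode(ok.json()["data"][0]["b64_json"])  # assumed field name
print(ok.status_code, Image.open(io.BytesIO(raw)).size, len(raw))

# Long prompt (5000 x rabbit): expect 500 with the max_sequence_length error
# introduced by #2847.
too_long = generate("rabbit " * 5000)
print(too_long.status_code, too_long.text[:200])
```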

E2E for #2840:

  • Qwen-Image-Edit-2511 serve on local snapshot /mnt/data1/huggingface/hub/models--Qwen--Qwen-Image-Edit-2511/snapshots/6f3ccc0b56e431dc6a0c2b2039706d7d26f22cb9
  • /v1/images/edits with 5 input images:
    • 400 with message:
      • Received 5 input images. At most 4 images are supported by this model.
  • /v1/images/edits with 4 input images:
    • 200, decoded output size 512x512, mode RGB.
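The image-count check can be reproduced with a multipart request roughly like the following. This is a sketch only; the field name used for the multiple input images and the base URL are assumptions, since the exact edits-endpoint contract is not spelled out here.

```python
# Hypothetical sketch of the /v1/images/edits image-count check above.
# Assumption: multiple input images are sent as repeated "image" form parts;
# the real multipart field name may differ.
import requests

BASE = "http://localhost:8000"


def edit_with_images(paths, prompt):
    files = [("image", open(p, "rb")) for p in paths]
    try:
        return requests.post(
            f"{BASE}/v1/images/edits",
            files=files,
            data={"prompt": prompt, "size": "512x512"},
            timeout=600,
        )
    finally:
        for _, fh in files:
            fh.close()


# 5 input images: expect 400 ("At most 4 images are supported by this model.").
print(edit_with_images([f"input_{i}.png" for i in range(5)], "combine subjects").status_code)

# 4 input images: expect 200 with a decodable 512x512 RGB output.
print(edit_with_images([f"input_{i}.png" for i in range(4)], "combine subjects").status_code)
```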

E2E for #2876:

  • Serve command used for the main validation path kept the requested runtime settings:
    • CUDA_VISIBLE_DEVICES=4,5,6,7
    • model /mnt/data1/huggingface/hub/models--Wan-AI--Wan2.2-I2V-A14B-Diffusers/snapshots/596658fd9ca6b7b71d5057529bbf319ecbc61d74
    • --omni --port 8099 --enable-diffusion-pipeline-profiler --ulysses-degree 4
  • First run with the exact provided long Chinese prompt:
    • create 200, final status failed
    • failure is expected on the combined backport branch because #2847 now enforces Wan2.2 prompt length before encoding
    • final error includes:
      • `prompt` is too long for Wan2.2 text encoding: got 654 tokens, but `max_sequence_length` is 512
  • Second run to isolate and verify the #2876 RIFE device-selection behavior used the same serve command, same image, same interpolation settings, and a shortened prompt that stays within the new Wan2.2 limit:
    • final status completed
    • artifact_ready_wall_s=210.576
    • server_inference_time_s=209.5431856457144
    • output file bytes 1092405
  • Relevant server log evidence from the successful rerun:
    • Loaded RIFE weights from /mnt/data1/huggingface/hub/models--elfgum--RIFE-4.22.lite/snapshots/99d6892a9f4c039cb37ff21c9530e79b13f0b30b/flownet.pkl
    • RIFE model loaded on device: cuda
    • GET /v1/videos/video_gen_6650cb9180b14ff68ac85d4d79e87039/content HTTP/1.1 200 OK
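For completeness, the create / poll / download flow behind these video runs looks roughly like the following. This is a minimal sketch: the create-request field names and the `status`/`error` fields in the poll response are assumptions, while the `/v1/videos/{id}/content` path and port 8099 match the access log and serve command above.

```python
# Hypothetical sketch of the /v1/videos flow used for the #2876 runs.
# Assumptions: the create request takes the generation parameters as top-level
# JSON fields and returns an "id"; GET /v1/videos/{id} exposes "status"/"error".
import time

import requests

BASE = "http://localhost:8099"

create = requests.post(
    f"{BASE}/v1/videos",
    json={
        "prompt": "A white rabbit hopping forward with smooth motion.",
        "size": "1280x720",
        # ...plus the generation / interpolation parameters listed above
    },
    timeout=600,
)
assert create.status_code == 200
video_id = create.json()["id"]  # assumed response field

# Poll until the job reaches a terminal state.
while True:
    status = requests.get(f"{BASE}/v1/videos/{video_id}", timeout=60).json()
    if status.get("status") in ("completed", "failed"):
        break
    time.sleep(5)

if status["status"] == "completed":
    content = requests.get(f"{BASE}/v1/videos/{video_id}/content", timeout=600)
    with open("output.mp4", "wb") as f:
        f.write(content.content)
    print("output bytes:", len(content.content))
else:
    print("failed:", status.get("error"))
```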

Backport note for #2847 on this release branch:

  • release/v0.18.0.post1 does not contain the Wan2.2 VACE implementation from main, so the backport keeps the prompt-length validation for the Wan2.2 pipelines that exist on this branch (T2V, I2V, TI2V) and drops the VACE-only touchpoints.

david6666666 changed the title from "[Backport][release/v0.18.0.post1] cherry-pick #2847 #2780 #2840 #2876" to "[cherry-pick][release/v0.18.0.post1] cherry-pick #2847 #2780 #2840 #2876" on Apr 17, 2026
david6666666 changed the title from "[cherry-pick][release/v0.18.0.post1] cherry-pick #2847 #2780 #2840 #2876" to "[cherry-pick][release/v0.18.0.post1] cherry-pick #2847 #2780 #2840 #2876 #2877" on Apr 17, 2026
Two further commits were pushed with Signed-off-by: david6666666 <530634352@qq.com> trailers, cherry-picked from upstream commits 072bfa2 and ea6ce23.
@david6666666
Collaborator Author

Update: I cherry-picked #2877 onto this backport branch as well and re-validated the previously blocked #2876 case.

Additional commits on this branch:

  • d7233cbd [Fix] align Wan2.2 max_sequence_length with model config
  • 25ac7cd8 [Fix] raise Wan2.2 max_sequence_length to 2048
  • 67e52e86 [Chore] run pre-commit after PR2877 backport

Additional validation after adding #2877:

  • python -m compileall vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2.py vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_i2v.py vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_ti2v.py tests/diffusion/models/wan2_2/test_wan22_max_sequence_length.py
    • Passed.
  • python -m pytest -q tests/diffusion/models/wan2_2/test_wan22_max_sequence_length.py tests/entrypoints/openai_api/test_video_api_utils.py
    • 15 passed.
  • pre-commit run --all-files
    • Passed.

Re-validation of the original #2876 I2V + RIFE case using the exact long Chinese prompt from the earlier run:

  • Serve command remained the same:
    • CUDA_VISIBLE_DEVICES=4,5,6,7
    • model /mnt/data1/huggingface/hub/models--Wan-AI--Wan2.2-I2V-A14B-Diffusers/snapshots/596658fd9ca6b7b71d5057529bbf319ecbc61d74
    • --omni --port 8099 --enable-diffusion-pipeline-profiler --ulysses-degree 4
  • Request parameters remained the same as the previously supplied #2876 validation script, including:
    • size=1280x720
    • seconds=5
    • fps=16
    • num_inference_steps=8
    • guidance_scale=3.5
    • guidance_scale_2=3.5
    • boundary_ratio=0.875
    • num_frames=81
    • flow_shift=5.0
    • seed=42
    • enable_frame_interpolation=true
    • frame_interpolation_exp=1
    • frame_interpolation_scale=1.0
    • frame_interpolation_model_path=/mnt/data1/huggingface/hub/models--elfgum--RIFE-4.22.lite/snapshots/99d6892a9f4c039cb37ff21c9530e79b13f0b30b
  • Result with the original long prompt after adding #2877:
    • final_status=completed
    • artifact_ready_wall_s=215.108
    • server_inference_time_s=214.06226211227477
    • output file bytes 970842
    • video id video_gen_daf89c9c7953414387cfb486bacc5122
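Assembled into a request body, the parameters listed above correspond roughly to the following. This is a sketch only, assuming they are accepted as top-level keys by the /v1/videos create endpoint as in the earlier flow sketch; the original long Chinese prompt is elided.

```python
# Hypothetical /v1/videos request body for the re-validation run above
# (the original long Chinese prompt is elided, not reproduced here).
payload = {
    "prompt": "<original long Chinese prompt>",
    "size": "1280x720",
    "seconds": 5,
    "fps": 16,
    "num_inference_steps": 8,
    "guidance_scale": 3.5,
    "guidance_scale_2": 3.5,
    "boundary_ratio": 0.875,
    "num_frames": 81,
    "flow_shift": 5.0,
    "seed": 42,
    "enable_frame_interpolation": True,
    "frame_interpolation_exp": 1,
    "frame_interpolation_scale": 1.0,
    "frame_interpolation_model_path": "/mnt/data1/huggingface/hub/models--elfgum--RIFE-4.22.lite/snapshots/99d6892a9f4c039cb37ff21c9530e79b13f0b30b",
}
```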

Relevant server log evidence from this exact rerun:

  • Loaded RIFE weights from /mnt/data1/huggingface/hub/models--elfgum--RIFE-4.22.lite/snapshots/99d6892a9f4c039cb37ff21c9530e79b13f0b30b/flownet.pkl
  • RIFE model loaded on device: cuda
  • GET /v1/videos/video_gen_daf89c9c7953414387cfb486bacc5122/content HTTP/1.1 200 OK

So after adding #2877, the exact long-prompt #2876 validation path that previously failed at Wan2.2 prompt-length validation now completes successfully, while still keeping the RIFE device-selection fix validated on CUDA.

gcanlin added the "ready" label ("label to trigger buildkite CI") on Apr 17, 2026
@hsliuustc0106
Collaborator

BLOCKING ISSUE: This PR cherry-picks unmerged PRs (#2840, #2876) from main to the release branch.

Release branches should only receive changes that have been proven on main. Cherry-picking open PRs bypasses the normal review process and can introduce unverified code.

Please wait until #2840 and #2876 are reviewed and merged to main, then cherry-pick from there.

@hsliuustc0106
Collaborator

Cherry-pick validation looks comprehensive.

One concern: cherry-picking multiple PRs together can make conflict resolution fragile. When this lands, verify the backport doesn't create divergence from main-branch behavior, especially the Wan2.2 max_sequence_length changes (the #2847 + #2877 interaction) called out in the notes.

Suggestion for future release branch work: Consider landing PRs individually when possible to reduce merge conflict surface area.

@gcanlin
Collaborator

gcanlin commented Apr 17, 2026

Unit tests are broken in v0.18.0.post1. For the quality of the release, it would be better to fix them.

@FrosterHan
Contributor

#2847 and #2840 passed verification.

@david6666666
Collaborator Author

Follow-up for the Wan2.2 short-prompt performance regression observed on this backport branch.

Root cause

  • After the Wan2.2 max_sequence_length backport, the runtime correctly allowed prompts up to 2048, but encode_prompt() still used padding="max_length" for the text encoder path.
  • That meant short prompts were still encoded at the full configured max_sequence_length, so the extra latency showed up in text_encoder.forward, not in DiT denoising.
  • The regression was specific to Wan2.2 because Qwen-Image uses actual-length padding / processor inputs rather than padding short prompts to the configured ceiling.

Fix

  • Commit: 5be6ff56 ([Fix] avoid padding short Wan2.2 prompts to max_sequence_length)
  • Updated Wan2.2 T2V, I2V, and TI2V so they:
    1. keep validating prompt / negative prompt length against max_sequence_length
    2. compute the actual max prompt length needed by the current batch
    3. only pad the text-encoder inputs to that actual batch max instead of the configured ceiling
  • This keeps the 2048-token support from the earlier backport while removing the short-prompt text-encoding slowdown.
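For readers following the fix, the padding change is roughly the following. This is an illustrative sketch assuming an HF-style tokenizer call; the real encode_prompt() code in the Wan2.2 pipelines differs in detail.

```python
# Illustration only: not the actual vllm_omni pipeline code. Shows the
# before/after tokenization behavior, assuming an HF-style tokenizer.


def encode_padded_to_ceiling(tokenizer, prompts, max_sequence_length):
    # Old behavior: every batch is padded to the configured ceiling, so the
    # text encoder always runs at max_sequence_length even for short prompts.
    return tokenizer(
        prompts,
        padding="max_length",
        max_length=max_sequence_length,
        truncation=True,
        return_tensors="pt",
    )


def encode_padded_to_batch_max(tokenizer, prompts, max_sequence_length):
    # New behavior: still validate against max_sequence_length, then pad only
    # to the longest prompt actually present in the batch.
    lengths = [len(tokenizer(p).input_ids) for p in prompts]
    if max(lengths) > max_sequence_length:
        raise ValueError(
            f"`prompt` is too long for Wan2.2 text encoding: got "
            f"{max(lengths)} tokens, but `max_sequence_length` is "
            f"{max_sequence_length}"
        )
    return tokenizer(
        prompts,
        padding="longest",
        truncation=True,
        max_length=max_sequence_length,
        return_tensors="pt",
    )
```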

Added regression test

  • tests/diffusion/models/wan2_2/test_wan22_max_sequence_length.py
  • New coverage asserts that short prompts are encoded at their actual length rather than being padded to the supported max length.
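The shape of the new assertion is roughly this; it is a hypothetical sketch reusing encode_padded_to_batch_max() from the snippet above and assuming a tokenizer fixture is available, not the literal content of the test file.

```python
# Hypothetical shape of the assertion (not the literal test file content).
def test_short_prompt_not_padded_to_ceiling(tokenizer):
    max_sequence_length = 2048
    batch = encode_padded_to_batch_max(tokenizer, ["a rabbit"], max_sequence_length)
    actual_len = len(tokenizer("a rabbit").input_ids)
    # Short prompts are encoded at their own length, not at the ceiling.
    assert batch.input_ids.shape[-1] == actual_len
    assert batch.input_ids.shape[-1] < max_sequence_length
```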

Validation

  • python -m pytest -q tests/diffusion/models/wan2_2/test_wan22_max_sequence_length.py
    • 12 passed
  • python -m py_compile vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2.py vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_i2v.py vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_ti2v.py tests/diffusion/models/wan2_2/test_wan22_max_sequence_length.py
    • Passed.
  • pre-commit run --all-files
    • Passed.

E2E re-validation (same local environment style as the earlier PR comments)

  • Serve:
    • local snapshot /mnt/data1/huggingface/hub/models--Wan-AI--Wan2.2-I2V-A14B-Diffusers/snapshots/596658fd9ca6b7b71d5057529bbf319ecbc61d74
    • PYTHONPATH=/mnt/data4/yipeng/vllm:<worktree>
    • CUDA_VISIBLE_DEVICES=4,5,6,7
    • --omni --enable-diffusion-pipeline-profiler --ulysses-degree 4
  • Request:
    • /v1/videos
    • prompt: short English prompt (A white rabbit standing on a wooden table, then slowly turning its head and hopping forward with smooth motion.)
    • size=1280x720
    • seconds=5
    • fps=16
    • num_frames=81
    • num_inference_steps=4
    • guidance_scale=3.5
    • guidance_scale_2=3.5
    • boundary_ratio=0.875
    • flow_shift=5.0
    • seed=42
    • frame interpolation disabled

Measured result for the fixed branch

  • final_status=completed
  • inference_time_s=113.2784127406776
  • output file bytes 1300357

Comparison against the earlier measurements collected on April 19, 2026

  • baseline release/v0.18.0.post1: 113.77747260034084 s
  • this PR before the fix: 116.16866869293153 s
  • this PR after the fix: 113.2784127406776 s

Profiler evidence from the fixed run

  • Wan22I2VPipeline.text_encoder.forward returned to roughly 0.014s - 0.018s per call for the measured request
  • Wan22I2VPipeline.forward on the measured request: 111.188085s
  • DiffusionEngine.step breakdown: preprocess=16.90 ms, add_req_and_wait=111750.35 ms, postprocess=236.55 ms, total=112004.36 ms

Conclusion

  • The observed regression on short prompts was in Wan2.2 text encoding, not in DiT denoising.
  • After this fix, the measured Wan2.2 I2V runtime is back in line with the release/v0.18.0.post1 baseline while preserving the larger prompt-length support from the backported validation work.

…-pr2847-2780-2840-2876

Signed-off-by: WeiQing Chen <40507679+david6666666@users.noreply.github.com>
david6666666 merged commit 2116e88 into vllm-project:release/v0.18.0.post1 on Apr 20, 2026 (2 of 5 checks passed).
david6666666 added a commit that referenced this pull request on Apr 20, 2026: (#2937)
Signed-off-by: david6666666 <530634352@qq.com>
