[Bugfix] Sync main into dev/migrate-MR-v2 with semantic-safe conflict resolution#2954

Merged
tzhouam merged 28 commits into vllm-project:dev/migrate-MR-v2 from Sy0307:fix/dev-sync-main-semantic-safe
Apr 20, 2026

Conversation

Sy0307 (Contributor) commented Apr 20, 2026

Background

dev/migrate-MR-v2 had drifted significantly behind main, and a direct merge now carries conflicts across three layers at once: test infrastructure, deploy yaml schema, and scheduler behavior.

This PR performs the sync with a conservative rule: preserve main's structure and schema wherever possible, and only carry over the minimum dev semantics required for MR-V2 / Qwen3-TTS / Qwen3-Omni behavior.

Changes

1. Merge origin/main into dev/migrate-MR-v2

Resolved the merge with the following policy:

  • tests / helpers / conftest / docs / dfx: follow main
  • deploy yaml: follow main's new deploy schema, keep only required dev runtime semantics
  • scheduler: keep dev's MR-V2-critical behavior without regressing main's structure
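The per-path policy above can be sketched with git's conflict-side selection. This is a toy illustration in a throwaway repo, not the real vllm-omni history; when merging main into the dev branch, "ours" is dev and "theirs" is main:

```shell
# Toy illustration of the per-path resolution policy in a scratch repo.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q && git checkout -qb main
git config user.email you@example.com && git config user.name you
mkdir -p tests vllm_omni/core/sched
echo "old test"  > tests/conftest.py
echo "old sched" > vllm_omni/core/sched/omni_generation_scheduler.py
git add -A && git commit -qm base

git checkout -qb dev/migrate-MR-v2          # long-lived migration branch
echo "dev test"  > tests/conftest.py
echo "dev sched" > vllm_omni/core/sched/omni_generation_scheduler.py
git commit -qam "dev changes"

git checkout -q main                        # main moves on independently
echo "main test"  > tests/conftest.py
echo "main sched" > vllm_omni/core/sched/omni_generation_scheduler.py
git commit -qam "main changes"

git checkout -q dev/migrate-MR-v2
git merge main >/dev/null 2>&1 || true      # both files conflict
git checkout --theirs -- tests/             # tests: follow main
git checkout --ours   -- vllm_omni/core/sched/   # scheduler: keep dev
git add -A && git commit -qm "sync main (semantic-safe)"
cat tests/conftest.py                                  # -> main test
cat vllm_omni/core/sched/omni_generation_scheduler.py  # -> dev sched
```

The scheduler side still needs manual review afterwards, since `--ours` keeps dev's file wholesale rather than folding in main's structural changes.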

2. Keep main test infrastructure, adapt dev-only MR-V2 test paths

  • Accepted main's thin tests/conftest.py
  • Accepted deletion of tests/utils.py and moved usage to tests/helpers/*
  • Fixed tests/examples/offline_inference/test_qwen3_tts_mr_v2.py imports:
    • tests.examples.conftest → tests.examples.helpers
    • tests.utils → tests.helpers.mark
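The import rewiring above is mechanical. A minimal sketch of applying it (the imported helper names below are illustrative, not confirmed against the repo):

```python
# Sketch: mechanically rewiring the MR-V2 test imports from the old
# test-infrastructure layout to main's tests/helpers layout.
import re

OLD_TO_NEW = {
    r"\btests\.examples\.conftest\b": "tests.examples.helpers",
    r"\btests\.utils\b": "tests.helpers.mark",
}

# Illustrative file content; the real file is
# tests/examples/offline_inference/test_qwen3_tts_mr_v2.py
src = (
    "from tests.examples.conftest import run_example\n"
    "from tests.utils import multi_gpu_test\n"
)
for old, new in OLD_TO_NEW.items():
    src = re.sub(old, new, src)
print(src)
```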

3. Merge deploy yaml semantically, not textually

For:

  • vllm_omni/deploy/qwen3_tts.yaml
  • vllm_omni/deploy/qwen3_omni_moe.yaml

we kept main's deploy/schema layout and only preserved the dev settings that actually affect runtime behavior.

After checking _build_extras() merge order in vllm_omni/config/stage_config.py, we removed the deploy-side sampling params that were actually no-ops because pipeline constraints overwrite them:

  • removed noop detokenize
  • removed noop talker-side stop_token_ids: [2150]
  • kept only the effective code2wav-side stop_token_ids: [0]
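A minimal sketch of why those deploy-side params were no-ops, assuming _build_extras() layers its sources with later layers overwriting earlier ones (the helper below and the pipeline-side values are illustrative, not the real stage_config.py implementation):

```python
# Illustrative merge-order model: pipeline constraints are applied after
# the deploy yaml, so deploy-side values for the same keys never survive.

def build_extras(*layers: dict) -> dict:
    merged: dict = {}
    for layer in layers:
        merged.update(layer)  # later layers win
    return merged

deploy_yaml = {"detokenize": False, "stop_token_ids": [2150]}
# Hypothetical pipeline-enforced values standing in for the real constraints:
pipeline_constraints = {"detokenize": True, "stop_token_ids": [999]}

extras = build_extras(deploy_yaml, pipeline_constraints)
print(extras)  # every deploy-side entry was overwritten -> a no-op
```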

4. Fix scheduler cleanup path on the merged branch

Adjusted vllm_omni/core/sched/omni_generation_scheduler.py so that an already-finished request is no longer routed through finish_requests(), which becomes a no-op upstream.

Current behavior:

  • request_id not in self.requests:
    • remove from running
    • propagate worker-side finished signal
  • RequestStatus.FINISHED_STOPPED without chunk adapter:
    • enqueue into the same already_finished_reqs path
    • remove from running
    • call _free_request() only when scheduler-side state still exists

This avoids both:

  • worker-side No free indices
  • scheduler-side dead-loop / stale-state risk in the non-async-chunk path
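The cleanup decision above can be sketched as follows. This is a heavily simplified stand-in (class, field, and method names are illustrative), not the real scheduler:

```python
# Simplified model of the already-finished cleanup path.
from dataclasses import dataclass, field
from enum import Enum, auto

class RequestStatus(Enum):
    RUNNING = auto()
    FINISHED_STOPPED = auto()

@dataclass
class Scheduler:
    requests: dict = field(default_factory=dict)   # request_id -> status
    running: set = field(default_factory=set)
    already_finished_reqs: list = field(default_factory=list)
    freed: list = field(default_factory=list)

    def _free_request(self, request_id: str) -> None:
        self.requests.pop(request_id, None)
        self.freed.append(request_id)

    def handle_worker_finished(self, request_id: str, has_chunk_adapter: bool) -> None:
        if request_id not in self.requests:
            # Scheduler no longer tracks it: just drop it from running and
            # let the worker-side finished signal propagate.
            self.running.discard(request_id)
            return
        status = self.requests[request_id]
        if status is RequestStatus.FINISHED_STOPPED and not has_chunk_adapter:
            # Do NOT route through finish_requests() (a no-op upstream);
            # enqueue on the already-finished path instead.
            self.already_finished_reqs.append(request_id)
            self.running.discard(request_id)
            if request_id in self.requests:  # free only if state still exists
                self._free_request(request_id)

sched = Scheduler()
sched.requests["r1"] = RequestStatus.FINISHED_STOPPED
sched.running.add("r1")
sched.handle_worker_finished("r1", has_chunk_adapter=False)
sched.handle_worker_finished("r2", has_chunk_adapter=False)  # unknown id: safe no-op
print(sched.freed, sched.already_finished_reqs)
```

The "free only if state still exists" guard is what avoids the double-free that previously surfaced as the worker-side "No free indices" error.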

5. Clean minor correctness / merge fallout

  • fixed misleading env comment in omni_ar_scheduler.py
  • fixed a leftover deserialize_additional_information reference so it points at _resolve_additional_information
  • kept top-level spacing / lint clean in touched scheduler files

Risks & Considerations

This is still a large integration PR because it syncs main into a long-lived migration branch.

Main risks:

  • hidden behavioral differences in test helper restructuring (tests/helpers/*)
  • deploy/schema drift between old MR-V2 assumptions and the new deploy/*.yaml system
  • scheduler cleanup / connector interaction under long-running or rare paths

The mitigations in this PR are:

  • keep main's structure wherever possible
  • minimize dev carry-over to the smallest required behavior delta
  • validate both async-chunk and no-async-chunk TTS service paths remotely

Test Plan

Static / local

  • resolved all git merge conflicts on fix/dev-sync-main-semantic-safe
  • ruff check passed for touched scheduler / MR-V2 example files

Remote service validation (/chrome-remote-gpu)

  • merged-branch Qwen3-TTS 0.6B online serving startup
  • merged-branch Qwen3-TTS 0.6B /v1/audio/speech — 5 concurrent requests, all succeeded
  • merged-branch Qwen3-TTS 0.6B stress — 3 × 10 concurrent requests, all succeeded
  • merged-branch Qwen3-TTS 0.6B --no-async-chunk path — 5 concurrent requests, all succeeded
  • merged-branch MR-V2 example path: examples/offline_inference/qwen3_tts/end2end.py --query-type Base --mode-tag icl returned RC=0
  • merged-branch Qwen3-Omni 30B startup smoke reached stage init / graph capture without immediate schema/config crash
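The concurrent /v1/audio/speech check can be reproduced in miniature with the sketch below. A local stub server stands in for the real Qwen3-TTS deployment (the endpoint path comes from the test plan; everything else is an assumption):

```python
# Sketch: 5 parallel POSTs to /v1/audio/speech, asserting all succeed.
import json
import threading
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import Request, urlopen

class StubTTS(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and discard the request body, answer with fake audio bytes.
        self.rfile.read(int(self.headers["Content-Length"]))
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"fake-audio-bytes")

    def log_message(self, *args):  # silence per-request logging
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), StubTTS)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/v1/audio/speech"

def speak(i: int) -> int:
    payload = json.dumps({"input": f"request {i}"}).encode()
    req = Request(url, data=payload,
                  headers={"Content-Type": "application/json"})
    with urlopen(req) as resp:
        return resp.status

with ThreadPoolExecutor(max_workers=5) as pool:
    statuses = list(pool.map(speak, range(5)))
server.shutdown()
print(statuses)  # [200, 200, 200, 200, 200]
```

Against the real service the interesting part is not the 200s themselves but that the async-chunk and --no-async-chunk paths both survive the overlap.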

Known limitation

  • full pytest validation on the remote machine is currently blocked by the remote environment's torch / vllm / tests.helpers.fixtures.runtime incompatibility (mm_configs import failure in torch inductor plugin loading)
  • final confidence still requires CI on this branch

Notes

This PR intentionally treats the merge as a semantic integration rather than a mechanical text merge. The goal is to keep main's latest structure intact while preserving the MR-V2 runtime semantics that the migration branch still needs.

cc @tzhouam @Fattysand

yenuo26 and others added 28 commits April 17, 2026 23:10
@Sy0307 Sy0307 marked this pull request as ready for review April 20, 2026 13:28
@Sy0307 Sy0307 requested a review from hsliuustc0106 as a code owner April 20, 2026 13:28

@tzhouam tzhouam merged commit aeb22fe into vllm-project:dev/migrate-MR-v2 Apr 20, 2026
1 of 2 checks passed