[Bugfix] Sync main into dev/migrate-MR-v2 with semantic-safe conflict resolution#2954

Merged
tzhouam merged 28 commits into vllm-project:dev/migrate-MR-v2 from Sy0307:fix/dev-sync-main-semantic-safe
Apr 20, 2026

Conversation

Sy0307 (Contributor) commented Apr 20, 2026

Background

dev/migrate-MR-v2 had drifted significantly behind main, and a direct merge now carries conflicts across three layers at once: test infrastructure, deploy yaml schema, and scheduler behavior.

This PR performs the sync with a conservative rule: preserve main's structure and schema wherever possible, and only carry over the minimum dev semantics required for MR-V2 / Qwen3-TTS / Qwen3-Omni behavior.

Changes

1. Merge origin/main into dev/migrate-MR-v2

Resolved the merge with the following policy:

  • tests / helpers / conftest / docs / dfx: follow main
  • deploy yaml: follow main's new deploy schema, keep only required dev runtime semantics
  • scheduler: keep dev's MR-V2-critical behavior without regressing main's structure
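The per-path policy above can be sketched with git's conflict-side selection. This is a toy illustration in a throwaway repo, not the real vllm-omni history; when merging main into the dev branch, "ours" is dev and "theirs" is main:

```shell
# Toy illustration of the per-path resolution policy in a scratch repo.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q && git checkout -qb main
git config user.email you@example.com && git config user.name you
mkdir -p tests vllm_omni/core/sched
echo "old test"  > tests/conftest.py
echo "old sched" > vllm_omni/core/sched/omni_generation_scheduler.py
git add -A && git commit -qm base

git checkout -qb dev/migrate-MR-v2          # long-lived migration branch
echo "dev test"  > tests/conftest.py
echo "dev sched" > vllm_omni/core/sched/omni_generation_scheduler.py
git commit -qam "dev changes"

git checkout -q main                        # main moves on independently
echo "main test"  > tests/conftest.py
echo "main sched" > vllm_omni/core/sched/omni_generation_scheduler.py
git commit -qam "main changes"

git checkout -q dev/migrate-MR-v2
git merge main >/dev/null 2>&1 || true      # both files conflict
git checkout --theirs -- tests/             # tests: follow main
git checkout --ours   -- vllm_omni/core/sched/   # scheduler: keep dev
git add -A && git commit -qm "sync main (semantic-safe)"
cat tests/conftest.py                                  # -> main test
cat vllm_omni/core/sched/omni_generation_scheduler.py  # -> dev sched
```

The scheduler side still needs manual review afterwards, since `--ours` keeps dev's file wholesale rather than folding in main's structural changes.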

2. Keep main test infrastructure, adapt dev-only MR-V2 test paths

  • Accepted main's thin tests/conftest.py
  • Accepted deletion of tests/utils.py and moved usage to tests/helpers/*
  • Fixed tests/examples/offline_inference/test_qwen3_tts_mr_v2.py imports:
    • tests.examples.conftest → tests.examples.helpers
    • tests.utils → tests.helpers.mark
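The import rewiring above is mechanical. A minimal sketch of applying it (the imported helper names below are illustrative, not confirmed against the repo):

```python
# Sketch: mechanically rewiring the MR-V2 test imports from the old
# test-infrastructure layout to main's tests/helpers layout.
import re

OLD_TO_NEW = {
    r"\btests\.examples\.conftest\b": "tests.examples.helpers",
    r"\btests\.utils\b": "tests.helpers.mark",
}

# Illustrative file content; the real file is
# tests/examples/offline_inference/test_qwen3_tts_mr_v2.py
src = (
    "from tests.examples.conftest import run_example\n"
    "from tests.utils import multi_gpu_test\n"
)
for old, new in OLD_TO_NEW.items():
    src = re.sub(old, new, src)
print(src)
```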

3. Merge deploy yaml semantically, not textually

For:

  • vllm_omni/deploy/qwen3_tts.yaml
  • vllm_omni/deploy/qwen3_omni_moe.yaml

we kept main's deploy/schema layout and only preserved the dev settings that actually affect runtime behavior.

After checking _build_extras() merge order in vllm_omni/config/stage_config.py, we removed the deploy-side sampling params that were actually no-ops because pipeline constraints overwrite them:

  • removed noop detokenize
  • removed noop talker-side stop_token_ids: [2150]
  • kept only the effective code2wav-side stop_token_ids: [0]
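A minimal sketch of why those deploy-side params were no-ops, assuming _build_extras() layers its sources with later layers overwriting earlier ones (the helper below and the pipeline-side values are illustrative, not the real stage_config.py implementation):

```python
# Illustrative merge-order model: pipeline constraints are applied after
# the deploy yaml, so deploy-side values for the same keys never survive.

def build_extras(*layers: dict) -> dict:
    merged: dict = {}
    for layer in layers:
        merged.update(layer)  # later layers win
    return merged

deploy_yaml = {"detokenize": False, "stop_token_ids": [2150]}
# Hypothetical pipeline-enforced values standing in for the real constraints:
pipeline_constraints = {"detokenize": True, "stop_token_ids": [999]}

extras = build_extras(deploy_yaml, pipeline_constraints)
print(extras)  # every deploy-side entry was overwritten -> a no-op
```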

4. Fix scheduler cleanup path on the merged branch

Adjusted vllm_omni/core/sched/omni_generation_scheduler.py so that an already-finished request is no longer routed through finish_requests(), which becomes a no-op upstream.

Current behavior:

  • request_id not in self.requests:
    • remove from running
    • propagate worker-side finished signal
  • RequestStatus.FINISHED_STOPPED without chunk adapter:
    • enqueue into the same already_finished_reqs path
    • remove from running
    • call _free_request() only when scheduler-side state still exists

This avoids both:

  • worker-side No free indices
  • scheduler-side dead-loop / stale-state risk in the non-async-chunk path
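The cleanup decision above can be sketched as follows. This is a heavily simplified stand-in (class, field, and method names are illustrative), not the real scheduler:

```python
# Simplified model of the already-finished cleanup path.
from dataclasses import dataclass, field
from enum import Enum, auto

class RequestStatus(Enum):
    RUNNING = auto()
    FINISHED_STOPPED = auto()

@dataclass
class Scheduler:
    requests: dict = field(default_factory=dict)   # request_id -> status
    running: set = field(default_factory=set)
    already_finished_reqs: list = field(default_factory=list)
    freed: list = field(default_factory=list)

    def _free_request(self, request_id: str) -> None:
        self.requests.pop(request_id, None)
        self.freed.append(request_id)

    def handle_worker_finished(self, request_id: str, has_chunk_adapter: bool) -> None:
        if request_id not in self.requests:
            # Scheduler no longer tracks it: just drop it from running and
            # let the worker-side finished signal propagate.
            self.running.discard(request_id)
            return
        status = self.requests[request_id]
        if status is RequestStatus.FINISHED_STOPPED and not has_chunk_adapter:
            # Do NOT route through finish_requests() (a no-op upstream);
            # enqueue on the already-finished path instead.
            self.already_finished_reqs.append(request_id)
            self.running.discard(request_id)
            if request_id in self.requests:  # free only if state still exists
                self._free_request(request_id)

sched = Scheduler()
sched.requests["r1"] = RequestStatus.FINISHED_STOPPED
sched.running.add("r1")
sched.handle_worker_finished("r1", has_chunk_adapter=False)
sched.handle_worker_finished("r2", has_chunk_adapter=False)  # unknown id: safe no-op
print(sched.freed, sched.already_finished_reqs)
```

The "free only if state still exists" guard is what avoids the double-free that previously surfaced as the worker-side "No free indices" error.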

5. Clean minor correctness / merge fallout

  • fixed misleading env comment in omni_ar_scheduler.py
  • fixed a leftover deserialize_additional_information reference so it points at _resolve_additional_information
  • kept top-level spacing / lint clean in touched scheduler files

Risks & Considerations

This is still a large integration PR because it syncs main into a long-lived migration branch.

Main risks:

  • hidden behavioral differences in test helper restructuring (tests/helpers/*)
  • deploy/schema drift between old MR-V2 assumptions and the new deploy/*.yaml system
  • scheduler cleanup / connector interaction under long-running or rare paths

The mitigations in this PR are:

  • keep main's structure wherever possible
  • minimize dev carry-over to the smallest required behavior delta
  • validate both async-chunk and no-async-chunk TTS service paths remotely

Test Plan

Static / local

  • resolved all git merge conflicts on fix/dev-sync-main-semantic-safe
  • ruff check passed for touched scheduler / MR-V2 example files

Remote service validation (/chrome-remote-gpu)

  • merged-branch Qwen3-TTS 0.6B online serving startup
  • merged-branch Qwen3-TTS 0.6B /v1/audio/speech — 5 concurrent requests, all succeeded
  • merged-branch Qwen3-TTS 0.6B stress — 3 × 10 concurrent requests, all succeeded
  • merged-branch Qwen3-TTS 0.6B --no-async-chunk path — 5 concurrent requests, all succeeded
  • merged-branch MR-V2 example path: examples/offline_inference/qwen3_tts/end2end.py --query-type Base --mode-tag icl returned RC=0
  • merged-branch Qwen3-Omni 30B startup smoke reached stage init / graph capture without immediate schema/config crash
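The concurrent /v1/audio/speech check can be reproduced in miniature with the sketch below. A local stub server stands in for the real Qwen3-TTS deployment (the endpoint path comes from the test plan; everything else is an assumption):

```python
# Sketch: 5 parallel POSTs to /v1/audio/speech, asserting all succeed.
import json
import threading
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import Request, urlopen

class StubTTS(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and discard the request body, answer with fake audio bytes.
        self.rfile.read(int(self.headers["Content-Length"]))
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"fake-audio-bytes")

    def log_message(self, *args):  # silence per-request logging
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), StubTTS)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/v1/audio/speech"

def speak(i: int) -> int:
    payload = json.dumps({"input": f"request {i}"}).encode()
    req = Request(url, data=payload,
                  headers={"Content-Type": "application/json"})
    with urlopen(req) as resp:
        return resp.status

with ThreadPoolExecutor(max_workers=5) as pool:
    statuses = list(pool.map(speak, range(5)))
server.shutdown()
print(statuses)  # [200, 200, 200, 200, 200]
```

Against the real service the interesting part is not the 200s themselves but that the async-chunk and --no-async-chunk paths both survive the overlap.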

Known limitation

  • full pytest validation on the remote machine is currently blocked by the remote environment's torch / vllm / tests.helpers.fixtures.runtime incompatibility (mm_configs import failure in torch inductor plugin loading)
  • final confidence still requires CI on this branch

Notes

This PR intentionally treats the merge as a semantic integration rather than a mechanical text merge. The goal is to keep main's latest structure intact while preserving the MR-V2 runtime semantics that the migration branch still needs.

cc @tzhouam @Fattysand

yenuo26 and others added 28 commits April 17, 2026 23:10
@Sy0307 Sy0307 marked this pull request as ready for review April 20, 2026 13:28
@Sy0307 Sy0307 requested a review from hsliuustc0106 as a code owner April 20, 2026 13:28

@tzhouam tzhouam merged commit aeb22fe into vllm-project:dev/migrate-MR-v2 Apr 20, 2026
1 of 2 checks passed