
[Bugfix] Sync main into dev/migrate-MR-v2 and fix build errors#2923

Merged
tzhouam merged 26 commits into vllm-project:dev/migrate-MR-v2 from Sy0307:fix/v2-sync-main-and-build-7230-bugs
Apr 20, 2026

Conversation

Contributor

@Sy0307 Sy0307 commented Apr 19, 2026

Background

Build #7230 on dev/migrate-MR-v2 surfaced three regressions, and the branch had also drifted 17 commits behind main. This PR does both: it syncs main into the branch and fixes the three failures.

Changes

1. Merge origin/main into dev/migrate-MR-v2

Resolves:

  • Import conflict in vllm_omni/core/sched/omni_generation_scheduler.py (HEAD added import os, main added from __future__ import annotations — keep both).
  • Modify/delete conflicts on five vllm_omni/model_executor/stage_configs/*.yaml files. Main migrated them to vllm_omni/deploy/*.yaml as part of the schema refactor in [Config Refactor][2/N] Pipeline + Deploy Config Schema #2383. Dev's only change to those files was adding stop_token_ids / detokenize to default_sampling_params; those values are carried over into the new deploy/qwen3_tts.yaml and deploy/qwen3_omni_moe.yaml.

This merge also resolves CI failure #5 (simple-unit-test: the stage_configs property has no setter) — PR #2884 on main already fixed the FakeAsyncOmniClass fixture.

2. [BugFix] Add Qwen2_5Omni to test_init_model_state expected set

Fixes CI failure #3 (modelrunner-v2-unit-test: test_omni_architectures_set_contains_expected). The expected set hardcoded in the test was out of sync with _OMNI_ARCHITECTURES.

3. [BugFix] Fix MTP buffer size mismatch for Omni Talker models

Fixes CI failure #2 (full-moon-omni-star-doc-test-with-h100: size of tensor a (2048) must match size of tensor b (1024)).

OmniModelState allocated its MTP and static inputs_embeds buffers with self.inputs_embeds_size (= hf_text_config.hidden_size = 2048 for Qwen3-Omni Thinker). But Talker stages replace embed_tokens with codec_embedding whose output dim is 1024. Probe the real dim once at init via model.embed_input_ids(dummy) and use that for buffer allocation.
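The probing idea can be sketched without torch (class and function names here are illustrative, not the repo's exact code):

```python
class CodecEmbeddingStub:
    """Stand-in for a Talker stage: embed_tokens has been replaced by a
    codec_embedding whose output width (1024) differs from
    hf_text_config.hidden_size (2048)."""
    EMBED_DIM = 1024

    def embed_input_ids(self, token_ids):
        # One EMBED_DIM-wide vector per input token id.
        return [[0.0] * self.EMBED_DIM for _ in token_ids]


def allocate_inputs_embeds(model, hidden_size, max_tokens):
    # Probe the real embedding width once at init instead of trusting
    # hidden_size -- the essence of the fix.
    probed_dim = len(model.embed_input_ids([0])[0])
    return [[0.0] * probed_dim for _ in range(max_tokens)]


# Buffers come out 1024 wide even though hidden_size claims 2048,
# so a later copy into them cannot hit a shape mismatch.
buf = allocate_inputs_embeds(CodecEmbeddingStub(), hidden_size=2048, max_tokens=4)
```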

4. [BugFix] Propagate finished_req_ids for already_finished_reqs

Fixes CI failure #4 (qwen3-tts-base-e2e-test-modelrunner-v2: Orchestrator thread crashed / No free indices).

The already_finished_reqs branch in OmniGenerationScheduler.schedule() only removed requests from the running queue but never added them to self.finished_req_ids. So the worker never got the finished signal, never released the corresponding req_state slots, and the next new request hit AssertionError: No free indices in req_states.add_request. Propagate to both self.finished_req_ids and self.finished_req_ids_dict to match upstream _free_request behavior.
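In toy form (attribute names mirror the description above, but the class is heavily simplified):

```python
class SchedulerSketch:
    """Toy OmniGenerationScheduler showing the propagation fix."""

    def __init__(self):
        self.running = []                  # list of (req_id, client_index)
        self.finished_req_ids = set()      # single-client path
        self.finished_req_ids_dict = {}    # multi-client path

    def schedule(self, already_finished_reqs):
        for req_id, client_index in already_finished_reqs:
            if (req_id, client_index) in self.running:
                self.running.remove((req_id, client_index))
            # The bug was stopping here: the worker never saw the finish
            # signal, req_state slots leaked, and the next request hit
            # "No free indices". The fix propagates to both structures,
            # matching upstream _free_request:
            self.finished_req_ids.add(req_id)
            self.finished_req_ids_dict.setdefault(client_index, set()).add(req_id)


sched = SchedulerSketch()
sched.running = [("req-1", 0)]
sched.schedule([("req-1", 0)])
```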

Test Plan

  • tests/worker_v2/test_init_model_state.py — 5/5 passed
  • tests/core/sched/test_generation_scheduler_{finish_condition,restore}.py — 11/11 passed
  • tests/worker_v2/ tests/core/sched/ full — 101/101 passed
  • Qwen3-TTS 0.6B server + 5 concurrent /v1/audio/speech requests — all OK
  • Qwen3-TTS 0.6B server + 3×10 concurrent stress — 30/30 OK, no regressions
  • Full CI (wait for build)

CI Failure Mapping

| Build #7230 job | Root cause | Fix in this PR |
| --- | --- | --- |
| full-moon-diffusion-x2v-star-accuracy-test | pre-existing flaky test | Already skipped by PR #2883 on main (merged in) |
| full-moon-omni-star-doc-test-with-h100 | MTP buffer 2048 vs 1024 | Commit 3 (MTP) |
| modelrunner-v2-unit-test | test expected set stale | Commit 2 (test) |
| qwen3-tts-base-e2e-test-modelrunner-v2 | No free indices race | Commit 4 (scheduler) |
| simple-unit-test | stage_configs @property setter | Fixed on main by #2884 (merged in) |

yenuo26 and others added 26 commits April 17, 2026 23:10
…_generates_video[wan22_i2v_usp2_hsdp2] (vllm-project#2883)

Signed-off-by: wangyu <410167048@qq.com>
Signed-off-by: Lancer <maruixiang6688@gmail.com>
…t#2343)

Signed-off-by: Nick Cao <ncao@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
…ures (vllm-project#1837)

Signed-off-by: CHEN <116010019@link.cuhk.edu.cn>
Signed-off-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Signed-off-by: linyueqian <linyueqian@outlook.com>
Co-authored-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Co-authored-by: linyueqian <linyueqian@outlook.com>
Signed-off-by: Joshna Medisetty <joshna.medisetty@intel.com>
Signed-off-by: Joshna-Medisetty <joshna.medisetty@intel.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Signed-off-by: Alex Brooks <albrooks@redhat.com>
Signed-off-by: hsliuustc0106 <liuhongsheng4@huawei.com>
Signed-off-by: hsliu <liuhongsheng4@huawei.com>
Signed-off-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: david6666666 <david6666666@users.noreply.github.com>
Co-authored-by: david6666666 <david6666666@users.noreply.github.com>
Signed-off-by: Nick Cao <ncao@redhat.com>
Signed-off-by: princepride <wangzhipeng628@gmail.com>
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Signed-off-by: CHEN <116010019@link.cuhk.edu.cn>
Signed-off-by: Lancer <maruixiang6688@gmail.com>
Co-authored-by: Samit <285365963@qq.com>
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
…2383)

Signed-off-by: lishunyang <lishunyang12@163.com>
Signed-off-by: reidliu41 <reid201711@gmail.com>
Signed-off-by: Alex Brooks <albrooks@redhat.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: xiaohajiayou <75477391+xiaohajiayou@users.noreply.github.com>
Co-authored-by: Alex Brooks <albrooks@redhat.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
…+decode batches (vllm-project#2903)

Signed-off-by: Sy03 <1370724210@qq.com>
…memory (vllm-project#2474)

Signed-off-by: willamhou <willamhou@ceresman.com>
Co-authored-by: willamhou <willamhou@ceresman.com>
Signed-off-by: xiaohajiayou <923390377@qq.com>
Signed-off-by: Samit <285365963@qq.com>
Co-authored-by: Samit <285365963@qq.com>
Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com>
…m-project#2018)

Signed-off-by: Yuanheng Zhao <jonathan.zhaoyh@gmail.com>
Signed-off-by: yuanheng <jonathan.zhaoyh@gmail.com>
Co-authored-by: Didan Deng <33117903+wtomin@users.noreply.github.com>
Signed-off-by: lishunyang <lishunyang12@163.com>
Resolves:
- omni_generation_scheduler.py import conflict
- stage_configs/*.yaml migrated to vllm_omni/deploy/ (stop_token_ids
  and detokenize carried over from dev)

Signed-off-by: Sy03 <1370724210@qq.com>
PR vllm-project#2819 added Qwen2_5OmniForConditionalGeneration to
_OMNI_ARCHITECTURES but did not update the corresponding unit
test, causing test_omni_architectures_set_contains_expected to
fail on both simple-unit-test and modelrunner-v2-unit-test CI jobs.

Signed-off-by: Sy03 <1370724210@qq.com>
Talker models replace embed_tokens with codec_embedding whose dim
may differ from hf_text_config.hidden_size. The MTP static buffers
were allocated using self.inputs_embeds_size (= hf_text_config.hidden_size),
causing RuntimeError when .copy_() encounters a shape mismatch
(e.g. buffer=2048 vs actual embed dim=1024).

Probe the model's actual embedding dim via embed_input_ids() at
init time instead of relying on hf_text_config.hidden_size.

Signed-off-by: Sy03 <1370724210@qq.com>
The already_finished_reqs branch in OmniGenerationScheduler.schedule()
only removed requests from the running queue but never added them to
self.finished_req_ids. This meant the worker never received the
finished signal and never released the corresponding req_state slots,
triggering AssertionError: No free indices in req_states.add_request
when a subsequent new request tried to claim a slot.

Propagate finished ids to both self.finished_req_ids (single-client
path) and self.finished_req_ids_dict (multi-client path) to match the
upstream _free_request behavior.

Signed-off-by: Sy03 <1370724210@qq.com>
Signed-off-by: Sy03 <1370724210@qq.com>
@Sy0307 Sy0307 requested a review from hsliuustc0106 as a code owner April 19, 2026 18:29
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

Contributor Author

Sy0307 commented Apr 19, 2026

@tzhouam PTAL. The DCO error is caused by the merge, so please dismiss it.

@tzhouam tzhouam merged commit 80441ca into vllm-project:dev/migrate-MR-v2 Apr 20, 2026
1 of 2 checks passed