Adapts Qwen3-Omni PD disaggregation to current config-refactor #2947

Open
spencerr221 wants to merge 16 commits into vllm-project:main from spencerr221:adapt_config

Conversation

@spencerr221
Contributor

Purpose

This PR adapts Qwen3-Omni PD (prefill-decode) separation to the new config-refactor flow introduced by #2383, without adding a new deploy YAML.

It adds pd_separation support to the deploy-based config pipeline so the existing vllm_omni/deploy/qwen3_omni_moe.yaml can dynamically expand the original 3-stage Qwen3-Omni pipeline into a 4-stage PD layout at merge time. When enabled, the thinker stage is split into prefill and decode stages, downstream stage IDs and connectors are remapped, and KV transfer settings are injected from deploy config.
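
The expansion described above can be pictured with a minimal sketch. This is not the PR's implementation: the stage names other than the thinker, the record fields (`stage_id`, `inputs`, `kv_transfer`), the `role` values, and the `expand_pd` helper are all hypothetical and chosen only to illustrate the split-and-remap step.

```python
from copy import deepcopy

# Hypothetical, simplified 3-stage layout; the real deploy schema may differ.
BASE_STAGES = [
    {"stage_id": 0, "name": "thinker", "inputs": []},
    {"stage_id": 1, "name": "talker", "inputs": [0]},
    {"stage_id": 2, "name": "code2wav", "inputs": [1]},
]


def expand_pd(stages, pd_separation):
    """Split the thinker into prefill + decode and renumber later stages."""
    if not pd_separation.get("enabled", False):
        return deepcopy(stages)

    kv = pd_separation.get("kv_transfer", {})
    split_id = next(s["stage_id"] for s in stages if s["name"] == "thinker")

    def shift(old_id):
        # References to the thinker now point at the decode stage (+1),
        # and every later stage moves down by one slot.
        return old_id + 1 if old_id >= split_id else old_id

    expanded = []
    for stage in stages:
        new = deepcopy(stage)
        new["inputs"] = [shift(i) for i in stage["inputs"]]
        if stage["stage_id"] == split_id:
            # Prefill keeps the original ID; decode takes the next one.
            prefill = {**new, "name": "thinker_prefill",
                       "kv_transfer": {**kv, "role": "producer"}}
            decode = {**new, "stage_id": split_id + 1,
                      "name": "thinker_decode", "inputs": [split_id],
                      "kv_transfer": {**kv, "role": "consumer"}}
            expanded += [prefill, decode]
        else:
            new["stage_id"] = shift(stage["stage_id"])
            expanded.append(new)
    return expanded


if __name__ == "__main__":
    deploy = {"enabled": True, "kv_transfer": {"connector": "example_connector"}}
    for stage in expand_pd(BASE_STAGES, deploy):
        print(stage["stage_id"], stage["name"], stage["inputs"])
```

Doing the remap at merge time, as in this sketch, means the base pipeline definition stays a plain 3-stage list and the PD layout exists only in the merged result.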

This keeps the new pipeline+deploy model intact, preserves the existing PD detection/runtime logic, and avoids introducing a separate PD-specific config file after the config refactor.

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan. Please provide the test scripts & test commands. Please state the reasons if your codes don't require additional test scripts. For test file guidelines, please check the test style doc
  • The test results. Please paste the results comparison before and after, or the e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs.
  • (Optional) Release notes update. If your change is user-facing, please update the release notes draft.

BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)

@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@amy-why-3459
Contributor

Can you add PD separation performance test cases?

Comment thread vllm_omni/deploy/qwen3_omni_moe.yaml Outdated
Comment thread vllm_omni/deploy/qwen3_omni_moe.yaml Outdated
Collaborator

@hsliuustc0106 left a comment

Please update the docs as well.

Collaborator

@hsliuustc0106 left a comment

BLOCKING:

  • Gate Check — pre-commit is failing. Please fix pre-commit issues before proceeding with review.

Comment thread vllm_omni/deploy/qwen3_omni_moe.yaml Outdated
Comment thread tests/e2e/online_serving/test_qwen3_omni.py Outdated
Collaborator

@hsliuustc0106 left a comment

BLOCKING:

  • Gate Check — pre-commit still failing (end-of-file-fixer on test_qwen3_omni_expansion.py, E501 line too long in stage_config.py:757, ruff format). Please run pre-commit run --all-files locally and push the fix.

Non-blocking:

  • PR description Test Plan / Test Result sections are empty. Please add CI results or local run output showing the PD expansion tests pass (e.g. test_pd_disaggregation.py, test_config_factory.py::test_merge_pipeline_deploy_with_pd_disaggregation).

@spencerr221 force-pushed the adapt_config branch 3 times, most recently from fd9136c to cd5e8d0, on Apr 23, 2026 10:03
@Gaohan123 added the ready (label to trigger buildkite CI) and omni-test (label to trigger buildkite omni model test in nightly CI) labels on Apr 23, 2026
Comment thread tests/e2e/online_serving/test_qwen3_omni_expansion.py
Comment thread tests/entrypoints/test_pd_disaggregation.py
@spencerr221 changed the title from "adapts Qwen3-Omni PD (prefill-decode) separation to current config-refactor" to "Adapts Qwen3-Omni PD disaggregation to current config-refactor" on Apr 24, 2026
@Gaohan123 added this to the v0.20.0 milestone on Apr 24, 2026
Signed-off-by: LiuBingyu <liubingyu62@gmail.com>
Signed-off-by: Bingyu (Spencer) Liu <liubingyu62@gmail.com>

Labels

omni-test: label to trigger buildkite omni model test in nightly CI
ready: label to trigger buildkite CI

7 participants