[Docs] CLI Docs updates #2978
LGTM. @princepride PTAL, as this touches BAGEL's code.
Pull request overview
Updates vLLM-Omni documentation and test helpers to better document and exercise the “stage-based CLI” (one stage per process) and to clarify when to use --deploy-config vs the legacy --stage-configs-path.
Changes:
- Added stage-based CLI quickstart/quick-reference docs across CLI docs, stage config docs, and model example pages (Qwen3-Omni, BAGEL).
- Updated BAGEL examples to use `--omni-master-address`/`--omni-master-port` (instead of `-oma`/`-omp`) and documented legacy vs migrated config flags.
- Updated test runtime helpers to choose `--deploy-config` vs `--stage-configs-path` based on YAML schema, with new unit tests covering the selection.
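To illustrate the flag selection described in the last bullet, here is a minimal Python sketch. It is not the code from tests/helpers/runtime.py: the function name `select_config_cli_args`, its signature, and the rule that a top-level `stage_args` key marks a legacy config are all assumptions made for illustration. Only the two flag names come from this PR.

```python
from typing import Any, List, Mapping

# Assumption: a legacy stage-config YAML is recognizable by a top-level
# "stage_args" key. The real detection rule is not shown in this thread.
LEGACY_MARKER_KEY = "stage_args"


def select_config_cli_args(config: Mapping[str, Any], config_path: str) -> List[str]:
    """Hypothetical helper: pick the CLI flag that matches the YAML schema.

    `config` is the already-parsed YAML document; `config_path` is the file
    it was loaded from.
    """
    if LEGACY_MARKER_KEY in config:
        # Legacy schema: keep using the older flag.
        return ["--stage-configs-path", config_path]
    # Anything else is treated as the new deploy schema.
    return ["--deploy-config", config_path]


if __name__ == "__main__":
    print(select_config_cli_args({"stage_args": []}, "/path/to/legacy.yaml"))
    print(select_config_cli_args({"stages": []}, "/path/to/deploy.yaml"))
```

The sketch takes a parsed mapping rather than a file path so that it needs no third-party YAML loader; a caller would load the file first and pass the result in.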
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/helpers/test_runtime.py | Adds unit tests for the new CLI-flag selection helper and stage CLI command building. |
| tests/helpers/runtime.py | Introduces get_server_config_cli_args() and uses it in OmniServerStageCli command construction. |
| tests/helpers/fixtures/runtime.py | Uses get_server_config_cli_args() so fixtures exercise the correct user-facing CLI flag. |
| examples/online_serving/qwen3_omni/README.md | Adds stage-based CLI instructions and guidance on when to use --deploy-config vs --stage-overrides. |
| examples/online_serving/bagel/run_server_stage_cli.sh | Switches to long-form --omni-master-* flags for stage-based launch script. |
| examples/online_serving/bagel/README.md | Clarifies BAGEL remains legacy stage_args and updates stage-based CLI instructions accordingly. |
| docs/user_guide/examples/online_serving/qwen3_omni.md | Mirrors Qwen3-Omni stage-based CLI docs in rendered documentation. |
| docs/user_guide/examples/online_serving/bagel.md | Mirrors BAGEL stage-based CLI + legacy config flag guidance in rendered documentation. |
| docs/configuration/stage_configs.md | Adds a stage-based CLI quick reference and updates examples to prefer --deploy-config for new schema. |
| docs/cli/serve.md | Adds a stage-based CLI quickstart section to the serve CLI docs. |
I have a PR that will remove BAGEL's stage YAML file: #2936
@wuhang2014 Please coordinate with @princepride |
Pull request overview
Updates documentation and example scripts to describe and demonstrate the stage-based CLI workflow for multi-stage Omni serving, and standardizes the master address/port flags across docs/examples.
Changes:
- Add stage-based CLI quickstart and guidance (including `--stage-id`, master address/port, and when to use `--stage-overrides`).
- Update BAGEL stage-based launcher script/docs to use `--omni-master-address`/`--omni-master-port` instead of `-oma`/`-omp`.
- Refresh Qwen3-Omni and stage-config documentation around default deploy YAML resolution and override patterns.
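As a rough sketch of how one command per stage could be assembled from these flags, the Python below builds an argv list for each stage. The helper name `build_stage_command`, the integer stage ids, the example address and port, and the choice to pass `--omni` to every stage are assumptions; the thread does not show the exact invocation, so check docs/cli/serve.md before relying on it.

```python
import shlex
from typing import List, Sequence


def build_stage_command(
    model: str,
    stage_id: int,
    master_address: str,
    master_port: int,
    extra_args: Sequence[str] = (),
) -> List[str]:
    """Hypothetical helper: argv for launching a single stage in its own process."""
    cmd = [
        "vllm", "serve", model,
        "--omni",  # assumption: every stage process is started with --omni
        "--stage-id", str(stage_id),
        "--omni-master-address", master_address,
        "--omni-master-port", str(master_port),
    ]
    # Optional extras, e.g. a config flag pair chosen elsewhere.
    cmd.extend(extra_args)
    return cmd


if __name__ == "__main__":
    # Illustrative values only; the stage count and port are made up.
    for stage_id in range(2):
        argv = build_stage_command(
            "Qwen/Qwen3-Omni-30B-A3B-Instruct", stage_id, "127.0.0.1", 29500
        )
        # Print a shell-quoted command line instead of executing it.
        print(shlex.join(argv))
```

Building a list rather than a single string avoids quoting mistakes when the command is later passed to a process launcher.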
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| examples/online_serving/qwen3_omni/README.md | Adds stage-based CLI launch instructions and deploy-config guidance for Qwen3-Omni. |
| examples/online_serving/bagel/run_server_stage_cli.sh | Switches master flags to long-form --omni-master-* in the stage CLI script. |
| examples/online_serving/bagel/README.md | Updates BAGEL multi-node instructions and flag names for stage-based runs. |
| docs/user_guide/examples/online_serving/qwen3_omni.md | Mirrors Qwen3-Omni stage-based CLI instructions into the user guide. |
| docs/user_guide/examples/online_serving/bagel.md | Mirrors BAGEL stage-based launch flag updates into the user guide. |
| docs/configuration/stage_configs.md | Adds stage-based CLI section and clarifies deploy-config vs stage-configs-path usage. |
| docs/cli/serve.md | Adds a stage-based CLI quickstart section to the serve CLI docs. |
## Stage-based CLI quickstart

The stage-based CLI is designed for deployments that require launching each pipeline stage in an isolated process
(e.g., across separate operating system processes, distinct GPUs, or distributed hosts).

- For **migrated models** that utilize the bundled deployment YAML configurations located in
Force-pushed from 8ba003b to fd948e6.
If you have a custom stage configs file, launch the server with the command below:

```bash
vllm serve ByteDance-Seed/BAGEL-7B-MoT --omni --port 8091 --stage-configs-path /path/to/stage_configs_file
```
Do we still need to keep this?
#2936 by @princepride will handle the BAGEL-related docs.
@@ -12,21 +12,80 @@ Please refer to [README.md](../../../README.md)
vllm serve Qwen/Qwen3-Omni-30B-A3B-Instruct --omni --port 8091
Can you update the qwen3_omni recipe? We will remove the model-specific examples here and keep only x2y.py for the different modalities.
Sure, I've updated the qwen3_omni recipes.
Signed-off-by: wuhang <wuhang6@huawei.com>
LGTM
Resolve conflicts in docs/user_guide/examples/online_serving/bagel.md and examples/online_serving/bagel/README.md by keeping the restructured --deploy-config docs from this PR and dropping the stage-configs-path references reintroduced by upstream vllm-project#2978.
Signed-off-by: princepride <wangzhipeng628@gmail.com>
Purpose
#1462
Test Plan
Test Result