[sglang, doc] feat: add NPU GRPO training scripts for Qwen3-30B (Megaton/SGLang backends) and update doc by hustmf · Pull Request #5060 · verl-project/verl

hustmf · 2026-01-27T08:46:38Z

What does this PR do?

Adds NPU GRPO training scripts for Qwen3-30B (Megatron/SGLang backends) and updates the documentation.

Checklist Before Starting

Search for similar PRs. Paste at least one query link here: ...
Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
- {modules} include fsdp, megatron, veomni, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data, cfg, reward
- If this PR involves multiple modules, separate them with , like [megatron, fsdp, doc]
- {type} is in feat, fix, refactor, chore, test
- If this PR breaks any API (CLI arguments, config, function signature, etc.), add [BREAKING] to the beginning of the title.
- Example: [BREAKING][fsdp, megatron] feat: dynamic batching
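
The title rules above can be approximated with a single extended regular expression. This is only a sketch: the pattern and the `title_ok` variable are assumptions made for illustration, not the check the CI actually runs, and it does not verify that each module name is on the allowed list.

```shell
# Title of this PR, used as the test input.
title='[sglang, doc] feat: add NPU GRPO training scripts for Qwen3-30B (Megaton/SGLang backends) and update doc'

# Hypothetical pattern: optional [BREAKING], a bracketed comma-separated
# module list, one of the allowed types, then ": " and a description.
pattern='^(\[BREAKING\])?\[[a-z_]+(, [a-z_]+)*\] (feat|fix|refactor|chore|test): .+'

if printf '%s\n' "$title" | grep -Eq "$pattern"; then
    title_ok=yes
else
    title_ok=no
fi
echo "title_ok=$title_ok"
```

The same pattern also accepts the template's example, `[BREAKING][fsdp, megatron] feat: dynamic batching`.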

Test

For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

API and Usage Example

Demonstrate how the API changes if any, and provide usage example(s) if possible.

# Add code snippet or script demonstrating how to use this

Design & Code Changes

Demonstrate the high-level design if this PR is complex, and list the specific changes.

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

Read the Contribute Guide.
Apply pre-commit checks: pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always
Add / Update the documentation.
Add unit or end-to-end test(s) to the CI workflow to cover all the code. If not feasible, explain why: ...
Once your PR is ready for CI, send a message in the ci-request channel in the verl Slack workspace. (If not accessible, please try the Feishu group (飞书群).)
If your PR is related to the recipe submodule, please also update the reference to the submodule commit via git submodule update --remote or cd recipe && git pull origin main.

hustmf · 2026-01-28T07:11:18Z

@wuxibin89 @FightingZhen This PR is ready.

ji-huazhong · 2026-01-28T08:43:26Z

examples/grpo_trainer/run_qwen3moe-30b_sglang_megatron_npu.sh

+    # Hardware Configuration
+    trainer.nnodes="${NNODES}"
+    trainer.n_gpus_per_node="${NPUS_PER_NODE}"
+    trainer.device='npu'


Suggested change

trainer.device='npu'

verl now supports automatic device configuration, so there is no longer any need to set trainer.device='npu' explicitly.

We will remove it.
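
For illustration, a minimal sketch of the hardware overrides once the explicit device setting is dropped. The two override keys and the `NNODES` / `NPUS_PER_NODE` variables come from the diff above; the default values and the `HARDWARE_ARGS` variable are assumptions, and the real script may assemble its arguments differently.

```shell
# Hypothetical defaults; the actual script may define these elsewhere.
NNODES="${NNODES:-1}"
NPUS_PER_NODE="${NPUS_PER_NODE:-8}"

# Hardware Configuration, without trainer.device: per the review comment,
# verl is expected to detect the device automatically.
HARDWARE_ARGS="trainer.nnodes=${NNODES} trainer.n_gpus_per_node=${NPUS_PER_NODE}"

echo "${HARDWARE_ARGS}"
```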

scripts/install_sglang_mcore_npu.sh

tardis-key · 2026-01-28T10:14:15Z

docs/ascend_tutorial/ascend_sglang_quick_start.rst

+===========+=================+
+| Python    | == 3.11         |
+-----------+-----------------+
+| HDK       | >= 25.3.RC1     |


Is 25.3.RC1 necessary? Updating the HDK is really challenging in a real production environment.

Resharding in SGLang is implemented on top of IPC, which is supported in the new HDK.

tardis-key · 2026-01-28T10:20:43Z

docs/ascend_tutorial/ascend_sglang_quick_start.rst


-基础环境准备
+-----------+-----------------+
+| software  | version         |


Does it match the daily image?

It is the same as the Dockerfile we provide.

examples/grpo_trainer/run_qwen3moe-30b_sglang_megatron_npu.sh

hustmf requested review from FightingZhen, PeterSH6, ji-huazhong, tardis-key and vermouth1992 as code owners January 27, 2026 08:46

hustmf changed the title ~~[sglang, doc] feat: add NPU GRPO training scripts for Qwen2.5-32B/Qwen3-30B (Megaton/SGLang backends) and update doc~~ [sglang, doc] feat: add NPU GRPO training scripts for Qwen3-30B (Megaton/SGLang backends) and update doc Jan 27, 2026

hustmf force-pushed the sglang-pr branch from 1c9c092 to cc69655 Compare January 27, 2026 08:59

hustmf requested review from eric-haibin-lin and zhaochenyang20 as code owners January 27, 2026 12:42

add sglang GRPO training script for Qwen3-30B sglang backend and update doc (d1b80a4)

hustmf force-pushed the sglang-pr branch from e0a1946 to d1b80a4 Compare January 28, 2026 05:03

ji-huazhong reviewed Jan 28, 2026

tardis-key reviewed Jan 28, 2026

examples/grpo_trainer/run_qwen3moe-30b_sglang_megatron_npu.sh (conversation resolved)

wucong25 approved these changes Jan 29, 2026

wucong25 merged commit c98cb8c into verl-project:main Jan 29, 2026
8 checks passed
