[env, sglang] feat: Bump new sglang version to fix vlm OOM by PopSoda2002 · Pull Request #3183 · verl-project/verl

PopSoda2002 · 2025-08-22T17:46:23Z

What does this PR do?

Bump sglang to a new version.
The new sglang version fixes the VLM OOM issue; details are in: [Bug] [Tracking] VLM/LLM OOM related issues sgl-project/sglang#9365

Test

Tested by following the instructions at https://github.com/zhaochenyang20/Awesome-ML-SYS-Tutorial/blob/main/rlhf/verl/multi-turn/release_log/latest_sglang.md

With the new version of sglang:

gsm8k:
using verl/examples/sglang_multiturn/run_qwen2.5-3b_gsm8k_multiturn.sh
Wandb

It works well.

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

Read the Contribute Guide.
Apply pre-commit checks: pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always
Add / Update the documentation.
Add unit or end-to-end test(s) to the CI workflow to cover all the code. If not feasible, explain why: ...
Once your PR is ready for CI, send a message in the ci-request channel in the verl Slack workspace. (If not accessible, please try the Feishu group (飞书群).)

hebiao064 · 2025-08-24T01:41:36Z

docker/Dockerfile.sglang


-# Install sglang-0.4.6.post5 and torch-memory-saver
-RUN pip uninstall -y cuda-python && pip install "sglang[all]==0.4.6.post5" --no-cache-dir --find-links https://flashinfer.ai/whl/cu124/torch2.6/flashinfer-python && pip install torch-memory-saver --no-cache-dir
+# Install sglang-0.4.10.post2 and torch-memory-saver


I think we can deprecate this Dockerfile; please create a new Dockerfile under this folder:
https://github.com/volcengine/verl/tree/main/docker/verl0.5-cu126-torch2.7-fa2.7.4

cc @ocss884 @ETOgaosion for awareness

I think we can keep the sglang community Dockerfile here, since your updates can land faster than the official image. Or should we rename it to docker/Dockerfile.community.sglang or something similar?

@PopSoda2002 For faster merging, this PR can modify only this file.

zhaochenyang20

Great job

ETOgaosion · 2025-08-25T02:47:54Z

.github/workflows/.deprecate/e2e_ppo_trainer.yml

      HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
    container:
-      image: verlai/verl:app-verl0.5-sglang0.4.9.post6-mcore0.12.2-te2.2
+      image: popsodazhp/verl:app-verl0.5-sglang0.4.10.post2-mcore0.12.2-te2.2


@PopSoda2002 Could you skip changing these CI files for now? There might currently be some conflicts, and for security reasons we should use our own image for CI.

If this PR only concerns SGLang's community image, it can be merged faster, and official support will follow afterwards.

This can become the norm for future sglang upgrades: community image first, with official support following soon after.

ETOgaosion · 2025-08-25T02:49:28Z

docker/Dockerfile.sglang


-# Install sglang-0.4.6.post5 and torch-memory-saver
-RUN pip uninstall -y cuda-python && pip install "sglang[all]==0.4.6.post5" --no-cache-dir --find-links https://flashinfer.ai/whl/cu124/torch2.6/flashinfer-python && pip install torch-memory-saver --no-cache-dir
+# Install sglang-0.4.10.post2 and torch-memory-saver


I think we can keep the sglang community Dockerfile here, since your updates can land faster than the official image. Or should we rename it to docker/Dockerfile.community.sglang or something similar?

@PopSoda2002 For faster merging, this PR can modify only this file.

ETOgaosion · 2025-08-25T02:50:21Z

docker/verl0.5-cu126-torch2.7-fa2.7.4/Dockerfile.app.sglang0.4.10.post2.mcore0.12

@@ -0,0 +1,37 @@
+# Start from the verl base image


We may have a different plan for updating the official image, so to avoid conflicts, could you skip this?

ETOgaosion · 2025-08-25T02:50:48Z

docs/start/install.rst

 For latest vLLM with FSDP, please refer to `hiyouga/verl <https://hub.docker.com/r/hiyouga/verl>`_ repository and the latest version is ``hiyouga/verl:ngc-th2.6.0-cu126-vllm0.8.4-flashinfer0.2.2-cxx11abi0``.

-For latest SGLang with FSDP, please refer to `hebiaobuaa/verl <https://hub.docker.com/r/hebiaobuaa/verl>`_ repository and the latest version is ``hebiaobuaa/verl:app-verl0.5-sglang0.4.9.post6-mcore0.12.2-te2.2`` which is provided by SGLang RL Group.
+For latest SGLang with FSDP, please refer to `hebiaobuaa/verl <https://hub.docker.com/r/hebiaobuaa/verl>`_ repository and the latest version is ``hebiaobuaa/verl:app-verl0.5-sglang0.4.10.post2-mcore0.12.2-te2.2`` which is provided by SGLang RL Group.


Updating this is OK.

ETOgaosion · 2025-08-25T02:51:48Z

docs/start/install.rst


 - **vLLM with FSDP and Megatron**: ``verlai/verl:app-verl0.5-vllm0.9.1-mcore0.12.2-te2.2``
- **SGLang with FSDP and Megatron**: ``verlai/verl:app-verl0.5-sglang0.4.9.post6-mcore0.12.2-te2.2``
+- **SGLang with FSDP and Megatron**: ``popsodazhp/verl:app-verl0.5-sglang0.4.10.post2-mcore0.12.2-te2.2``


Please skip updating this and add a comment like:

- **SGLang with FSDP and Megatron**: ``verlai/verl:app-verl0.5-sglang0.4.9.post6-mcore0.12.2-te2.2`` - for the latest sglang image, which fixes the VLM memory leak issue, please refer to [community image]

ETOgaosion · 2025-08-25T02:52:30Z

requirements_sglang.txt

 transformers
 wandb
-sglang[all]==0.4.9.post6
+sglang[all]==0.4.10.post2


This can be skipped for now.

ETOgaosion · 2025-08-26T02:47:49Z

The new image name is verlai/verl:app-verl0.5-transformers4.55.4-sglang0.4.10.post2-mcore0.13.0-te2.2; to see the Dockerfile change: main...ETOgaosion:verl:update_sglang_0.4.10

)

### What does this PR do?

- As title
- We use set_expandable_segments to resolve memory fragmentation

…ject#3175)

### What does this PR do?

This PR adds the DAPO baseline in SGLang multi-turn rollout. The previous DAPO multi-turn baseline with retool did not actually converge: we found that the previous retool reward simply encouraged the model to generate more turns and call more tools, while the answers were not actually correct.

In this fix, we (SGLang RL Group) did a manual SFT and produced a new model, `font-info/qwen3-4b-sft-SGLang-RL`, to use instead of `Qwen/Qwen3-4B-Instruct-2507`. Without fine-tuning, the model cannot converge.

At the same time, we lower the default bound on the reward in retool from 0 to -0.6: `result["score"] = min(-0.6, result["score"] + tool_call_reward)`. Thus, if a model cannot generate the correct answer, its score is capped at -0.6 rather than 0.

So in our demonstration, we do converge!

Co-authored-by: zhaochenyang20 <zhaochenyang20@gmail.com>
Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>
Co-authored-by: Zhuorany <yzr1914001753@gmail.com>
Co-authored-by: mao cheng <maocheng@berkeley.edu>
Co-authored-by: Hecate0821 <hec4te0821@gmail.com>
Co-authored-by: maocheng23 <maocheng@berkeley.edu>
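To make the effect of the `min(-0.6, ...)` expression concrete, here is a minimal sketch. The helper name `capped_score`, the starting score of -1.0 for an incorrect answer, and the sample bonus values are assumptions for illustration only; the expression itself is the one quoted in the commit message.

```python
def capped_score(score: float, tool_call_reward: float, cap: float = -0.6) -> float:
    """Add the tool-call bonus, but never let the result rise above `cap`.

    Hypothetical helper: only the min(cap, score + bonus) expression comes
    from the commit message; the name and defaults are illustrative.
    """
    return min(cap, score + tool_call_reward)


# Assumed scenario: an incorrect answer starts at -1.0 and earns a bonus
# for calling tools.
small_bonus = capped_score(-1.0, 0.25)          # bonus is too small to reach the cap
large_bonus = capped_score(-1.0, 0.5)           # bonus is clipped at the cap of -0.6
old_bound = capped_score(-1.0, 0.5, cap=0.0)    # with the previous bound of 0
```

With a bound of 0, a wrong answer could climb toward 0 purely by making more tool calls; with -0.6 the gap to a correct answer stays large no matter how many tools are called.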

verl-project#3196)

Fix verl-project#3195

Changes:
1. 🔒 Replace all direct dict[key] access with .get(key, {}) pattern for tool kwargs
2. ✅ Add validation in _preprocess_prompt_to_async_rollout_requests
3. 🧪 New test cases covering:
   • Missing tool configs
   • Partial execute_kwargs
   • Empty tool schemas

Impact:
• Prevents KeyError crashes when tools/kwargs are missing
• Maintains existing flexible tool parameter system
• Zero breaking changes to valid configurations
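The `.get(key, {})` pattern described above can be sketched as follows. This is an illustration under assumptions, not the code from the PR: the helper name `get_execute_kwargs` and the shape of the `tools_kwargs` mapping are hypothetical; only the `execute_kwargs` key and the `.get(key, {})` idea come from the commit message.

```python
def get_execute_kwargs(tools_kwargs: dict, tool_name: str) -> dict:
    """Return the execute kwargs for a tool, or {} if anything is missing.

    Hypothetical helper: direct indexing such as
    tools_kwargs[tool_name]["execute_kwargs"] raises KeyError when the tool
    or the key is absent; chained .get(..., {}) falls back to an empty dict.
    """
    tool_cfg = tools_kwargs.get(tool_name, {})
    return tool_cfg.get("execute_kwargs", {})


# Assumed example configuration with one fully configured tool and one
# tool that has no execute_kwargs entry.
tools_kwargs = {
    "calculator": {"execute_kwargs": {"precision": 4}},
    "search": {},
}

configured = get_execute_kwargs(tools_kwargs, "calculator")
partial = get_execute_kwargs(tools_kwargs, "search")
missing = get_execute_kwargs(tools_kwargs, "not_configured")
```

Valid configurations behave exactly as before, which matches the "zero breaking changes" claim: the fallback only matters when a key is absent.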

### What does this PR do?

**Adding RL-PLUS to the README as a list of work that used veRL, with only a 1-line change to the README.md.**

…ject#3184)

### What does this PR do?

- Loading weights in AsyncServer is duplicated and is time-consuming for large models
- Use dummy weights instead as the actual weights will be transferred by the trainer

### What does this PR do?

- Use `sync` mode for `dapo`, `gsm8k` and `geo`

…project#3194)

### What does this PR do?

Fix verl-project#3055: add the missing `extra_reward_info` to AgentLoopOutput, which is needed by the metrics calculation.

…moe-30b script (verl-project#3198)

### What does this PR do?

Set use_dist_checkpointing to False for the ref model in the qwen3moe-30b script, because there is no dist_megatron_ckpt model path for the ref model.

### What does this PR do?

Solve verl-project#3201

#### Problem

The existing license check hook scans all directories recursively from a single root directory, which causes issues in local development environments:

* Virtual environments (`.venv`, `venv/`) get scanned and fail license checks
* No easy way to exclude common build/cache directories without hardcoding exclusions
* Different behavior between local development (with venvs) and CI/CD (clean environment)

#### Solution

Modified the `check_license.py` script to accept multiple target directories instead of a single root directory with exclusions.

### Design & Code Changes

Changed argument from `--directory` to `--directories`

* Now accepts multiple `Path` arguments using `nargs="+"`
* Allows specifying exactly which directories to scan
* in local mode: `--directories examples recipe scripts tests verl setup.py`
* in github workflow: `--directories .`
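The argument change described above can be sketched with `argparse` as follows. This is a minimal sketch, not the actual `check_license.py`: the function names `build_parser` and `iter_python_files` are hypothetical, and the handling of a plain file such as `setup.py` is an assumption based on the local-mode example.

```python
import argparse
from pathlib import Path
from typing import Iterable, Iterator


def build_parser() -> argparse.ArgumentParser:
    """Parser with a --directories option that takes one or more paths."""
    parser = argparse.ArgumentParser(description="License header check (sketch)")
    parser.add_argument(
        "--directories",
        type=Path,
        nargs="+",  # one or more values, collected into a list of Path objects
        required=True,
        help="Directories (or single files) to scan",
    )
    return parser


def iter_python_files(targets: Iterable[Path]) -> Iterator[Path]:
    """Yield the Python files under each target; a plain file is yielded as-is."""
    for target in targets:
        if target.is_file():
            yield target
        else:
            yield from sorted(target.rglob("*.py"))


# Local mode from the description: only the listed targets are scanned,
# so .venv and similar directories are never visited.
local_args = build_parser().parse_args(
    ["--directories", "examples", "recipe", "scripts", "tests", "verl", "setup.py"]
)
# CI mode from the description: scan everything from the repository root.
ci_args = build_parser().parse_args(["--directories", "."])
```

Listing targets explicitly trades a little verbosity in the hook configuration for not having to maintain an exclusion list.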

…erl-project#3207) Reverts verl-project#3184

### What does this PR do?

Update recipe/dapo/run_dapo_qwen3_30b_npu.sh.

### Checklist Before Starting

- [x] Search for similar PRs. Paste at least one query link here: https://github.com/volcengine/verl/pulls?q=fsdp+npu+30b+recipe

### Test

Critic/rewards/mean comparison chart, where the orange line represents Ascend NPU and the pink line represents GPU.

<img width="3182" height="1272" alt="image" src="https://github.com/user-attachments/assets/5c275127-6cb3-4bf9-ac89-0fa6abb668c0" />

### API and Usage Example

```shell
cd /path/to/verl
bash recipe/dapo/run_dapo_qwen3_30b_base_npu_fsdp.sh
```

Co-authored-by: Shangwei-Li <lishangwei2@huawei.com>

)

### What does this PR do?

- Fix the issue where profiling cannot be collected in discrete mode, for both NPU and nsys.
- Adjust the corresponding unit tests accordingly.
- Adjust the NPU profiler script due to changes in ref.yaml

In discrete mode, distribution is handled through the `annotate` class method of the `DistProfiler` class in `verl/utils/profiler/profile.py`. Adjust the `annotate` methods of NPUProfiler and NsightSystemsProfiler to be instance methods.

…verl-project#3192)

…ut (verl-project#3211)

…t#3204)

Change-Id: Ic0ddfdfa13a38a56571b9c59125e9ebeea5c7802

### What does this PR do?

- Fixed a bug where the original HDFS path was passed due to not using `copy_to_local` when initializing the hf config.

Co-authored-by: wangzhunheng <wangzhunheng@bytedance.com>

### What does this PR do?

[doc] fix: fix a documentation typo for nsys

### What does this PR do?

Make the main PPO script validate the config as soon as all needed information is available. This lets the script fail as fast as possible when the config contains a bug. The new changes avoid downloading and loading the tokenizer and loading data before the config is validated.

Solves verl-project#3182

### Design & Code Changes

Isolated config validation in utils (out of PpoRayTrainer) and call it from main_ppo as soon as possible.
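The fail-fast ordering described above can be sketched as follows. Everything here is hypothetical and for illustration only: `validate_config`, `run`, the config keys and the specific checks are assumptions, not the validation logic that was moved out of the trainer.

```python
from typing import Any, Callable


def validate_config(config: dict) -> None:
    """Cheap consistency checks that need no tokenizer or data (illustrative only)."""
    batch_size = config.get("train_batch_size", 0)
    n_gpus = config.get("n_gpus", 0)
    if batch_size <= 0:
        raise ValueError("train_batch_size must be positive")
    if n_gpus <= 0:
        raise ValueError("n_gpus must be positive")
    if batch_size % n_gpus != 0:
        raise ValueError("train_batch_size must be divisible by n_gpus")


def run(config: dict, load_tokenizer: Callable[[], Any], load_data: Callable[[], Any]):
    """Validate first, so a bad config fails before any expensive loading starts."""
    validate_config(config)
    tokenizer = load_tokenizer()  # stands in for an expensive download/load
    data = load_data()            # stands in for expensive dataset loading
    return tokenizer, data


# Demonstration: with an invalid config, neither loader is ever called.
calls = []
try:
    run(
        {"train_batch_size": 10, "n_gpus": 4},
        load_tokenizer=lambda: calls.append("tokenizer"),
        load_data=lambda: calls.append("data"),
    )
    failed_fast = False
except ValueError:
    failed_fast = True
```

Keeping the validation in a standalone function, rather than inside the trainer, is what allows the entry point to call it before any heavy objects are constructed.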

### What does this PR do?

- Make Megatron-related prints print only on rank zero
- Remove unused code in the Megatron actor
- Modularize the Megatron loss computation so that it can be used for SFT as well
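The rank-zero printing mentioned in the commit message above can be sketched as follows. This is a minimal illustration, not the code from that commit: `get_rank` and `print_rank_0` are hypothetical helper names, and the sketch assumes the launcher exports a `RANK` environment variable (as `torchrun` does), falling back to 0 when it is unset.

```python
import os


def get_rank() -> int:
    # Assumption: the launcher exports RANK; treat a missing value as rank 0.
    return int(os.environ.get("RANK", "0"))


def print_rank_0(*args, **kwargs) -> None:
    """Print only on the process whose global rank is 0."""
    if get_rank() == 0:
        print(*args, **kwargs)
```

Calling `print_rank_0("loading weights")` on every worker then produces a single log line instead of one per process.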

CLAassistant · 2025-08-26T03:32:57Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
10 out of 14 committers have signed the CLA.

✅ PopSoda2002
✅ zhaochenyang20
✅ vermouth1992
✅ none0663
✅ slimfrkha
✅ Shangwei-Li
✅ wuxibin89
✅ davidmlw
✅ ETOgaosion
✅ tongtong0613
❌ Zzhiter
❌ ZornWang
❌ looput
❌ YihongDong
You have signed the CLA already but the status is still pending? Let us recheck it.

PopSoda2002 added 2 commits August 22, 2025 17:45

bump sglang version

fe734cc

bump sglang version

15aa1fb

PopSoda2002 mentioned this pull request Aug 22, 2025

sglang is 2.x slower than vllm in param sync #3173

Open

PopSoda2002 added 4 commits August 23, 2025 03:03

Add CI new version

8ebbea9

Modify dockerfile

924045a

Use personal version

0134c01

rollback version

f9e6973

PopSoda2002 marked this pull request as ready for review August 23, 2025 16:16

PopSoda2002 requested review from eric-haibin-lin and zhaochenyang20 as code owners August 23, 2025 16:16

PopSoda2002 changed the title ~~[env, sglang] Bump new sglang version to fix vlm OOM~~ [env, sglang] feat: Bump new sglang version to fix vlm OOM Aug 23, 2025

hebiao064 self-assigned this Aug 24, 2025

hebiao064 reviewed Aug 24, 2025

add dockerfile

692158b

zhaochenyang20 approved these changes Aug 25, 2025

ETOgaosion reviewed Aug 25, 2025

vermouth1992 and others added 13 commits August 26, 2025 03:27

[rollout] fix: add missing extra_reward_info to AgentLoopOuput (verl-project#3194)

cc045c7

### What does this PR do?

Fix verl-project#3055: add the missing `extra_reward_info` to AgentLoopOuput, which is needed for metrics calculation.

[doc] fix: set use_dist_checkpointing to False for ref model in qwen3moe-30b script (verl-project#3198)

e256309

### What does this PR do?

Set use_dist_checkpointing to False for the ref model in the qwen3moe-30b script, because there is no dist_megatron_ckpt model path for the ref model.

Revert "[rollout] feat: use dummy load_format when init AsyncServer" (verl-project#3207)

0333425

Reverts verl-project#3184

[docker] feat: update to vllm 0.10.0, mcore 0.13, transformers 4.55.4 (verl-project#3192)

3744a35

looput and others added 7 commits August 26, 2025 03:29

[data] fix: update parquet_files type check to support multi-file input (verl-project#3211)

52eaf98

Add CI new version

ba4c4ce

Use personal version

67f01d4

PopSoda2002 requested review from PeterSH6, SwordFaith, chenhaiq, tongyx361, vermouth1992 and wuxibin89 as code owners August 26, 2025 03:32

PopSoda2002 closed this Aug 26, 2025

hebiao064 mentioned this pull request Aug 26, 2025

[env, sglang] feat: Bump new sglang version to fix vlm OOM #3216

Merged

5 tasks

PopSoda2002 wants to merge 27 commits into verl-project:main from PopSoda2002:feat/bump_sglang_version


16 participants
