[BREAKING][worker, rollout, vllm] feat: implement vLLM colocated training-inference rollout with process separation#4280

Merged
wuxibin89 merged 64 commits into verl-project:main from jianjunzhong:refactor/vllm_sep_proc
Jan 23, 2026
Conversation

@jianjunzhong (Contributor) commented Nov 25, 2025

What does this PR do?

Refactor vLLM co-located training-inference rollout from single-process to multi-process architecture. This refactoring separates training and inference into different processes, enabling better resource isolation and paving the way for future checkpoint-engine integration (in roadmap #3624).

Key Changes:

  • Transform `vLLMAsyncRollout` into `ServerAdapter`, a client-side adapter that communicates with the inference executor
  • Remove `ExternalZeroMQDistributedExecutor` and use `MultiprocExecutor` as the inference backend
  • Implement CUDA IPC-based weight updates via ZeroMQ for efficient parameter synchronization between training and inference processes
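The CUDA IPC update path can be pictured as sending per-tensor metadata plus an opaque device-memory handle over the ZeroMQ channel. The sketch below is a hypothetical illustration of such a message format; the field names, pickle framing, and placeholder handle bytes are assumptions, not verl's actual wire protocol (a real sender would obtain the handle from CUDA IPC, e.g. via torch's IPC machinery).

```python
# Hypothetical sketch of a weight-update message: per-tensor metadata plus an
# opaque IPC handle. Field names and pickle framing are illustrative only.
import pickle


def pack_update(name: str, dtype: str, shape: tuple, ipc_handle: bytes) -> bytes:
    """Serialize one tensor's metadata and its (opaque) IPC handle."""
    return pickle.dumps({"name": name, "dtype": dtype, "shape": shape, "handle": ipc_handle})


def unpack_update(msg: bytes) -> dict:
    """Inverse of pack_update, as the inference process would apply it."""
    return pickle.loads(msg)


# Placeholder handle bytes stand in for what cudaIpcGetMemHandle would return.
msg = pack_update("model.embed_tokens.weight", "bfloat16", (151936, 4096), b"\x00" * 64)
meta = unpack_update(msg)
print(meta["name"], meta["shape"])
```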

Checklist Before Starting

  • Search for similar PRs. Paste at least one query link here: ...
  • Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
    • {modules} include fsdp, megatron, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data
    • If this PR involves multiple modules, separate them with , like [megatron, fsdp, doc]
    • {type} is in feat, fix, refactor, chore, test
    • If this PR breaks any API (CLI arguments, config, function signature, etc.), add [BREAKING] to the beginning of the title.
    • Example: [BREAKING][fsdp, megatron] feat: dynamic batching

Test

For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

API and Usage Example

This refactoring maintains full backward compatibility with existing vLLM rollout APIs. No changes are required to user code.

Key API Components:

  • ServerAdapter (replaces `vLLMAsyncRollout`):
    • Acts as client-side adapter for communicating with inference executor
    • Manages CUDA IPC-based weight updates
    • Provides same interface as previous `vLLMAsyncRollout` class
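As a rough illustration of the adapter pattern described above, here is a minimal sketch: a client-side class that forwards rollout operations over a channel to a separate inference process. All names here (`InferenceChannel`, the handler signature) are hypothetical stand-ins, not verl's actual `ServerAdapter` API.

```python
# Minimal sketch of a client-side adapter, in the spirit of ServerAdapter.
# InferenceChannel stands in for the ZeroMQ/RPC link to the inference executor.
from typing import Any, Callable


class InferenceChannel:
    """Stand-in for the communication channel to the inference process."""

    def __init__(self, handler: Callable[[str, dict], Any]):
        self._handler = handler

    def call(self, method: str, **kwargs) -> Any:
        return self._handler(method, kwargs)


class ServerAdapter:
    """Forwards rollout operations to the inference process via the channel."""

    def __init__(self, channel: InferenceChannel):
        self._channel = channel

    def wake_up(self):
        return self._channel.call("wake_up")

    def sleep(self):
        return self._channel.call("sleep")

    def update_weights(self, named_tensors):
        # In the real design the tensors travel as CUDA IPC handles over ZeroMQ;
        # here we only forward the names to keep the sketch self-contained.
        return self._channel.call("update_weights", names=[n for n, _ in named_tensors])


# Usage with a fake in-process handler that just records calls:
log = []
adapter = ServerAdapter(InferenceChannel(lambda m, kw: log.append((m, kw)) or m))
adapter.wake_up()
adapter.update_weights([("lm_head.weight", None)])
print(log)
```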

Design

Architecture Overview

  1. Before (Single-Process Architecture)
  • Single-Process Design

In the original `AsyncActorRolloutRefWorker`, the training engine and inference engine shared the same process. The vLLM inference engine directly received weight updates through parameter passing.

![single](https://github.com/user-attachments/assets/c3ff858f-f33e-4eb7-98c5-083c5b679d62)

  • Communication Architecture

`ExternalZeroMQDistributedExecutor` acts as a client, sending instructions to all `AsyncActorRolloutRefWorker` inference engines via ZMQ to execute operations like `init_worker`, `load_model`, `init_device`, and `generate`. Operations like `wake_up`, `sleep`, and weight updates were executed directly in `vLLMAsyncRollout` without going through `ExternalZeroMQDistributedExecutor`.

![single_comm](https://github.com/user-attachments/assets/2be913c0-9b87-4281-bac2-1460e946b702)

  2. After (Multi-Process Architecture)
  • Multi-Process Design

Transform `vLLMAsyncRollout` into `ServerAdapter`, which serves as a client for communicating with the inference engine (`AsyncLLM`). Weight updates are based on CUDA IPC, passing through ZeroMQ to the inference engine.

![multi](https://github.com/user-attachments/assets/51102b97-f74b-4cda-8a56-5effd2c64539)

  • Communication Architecture

Deprecate the original `ExternalZeroMQDistributedExecutor` class and directly use vLLM's `MultiprocExecutor` by passing `distributed_executor_backend = "mp"`. All inference engine operations are uniformly broadcast to all inference workers through `MultiprocExecutor`'s RPC broadcast MQ.

![multi_comm](https://github.com/user-attachments/assets/4a98cba4-89d0-432e-94dd-040a20877363)
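For reference, selecting the built-in multiprocess executor is just an engine argument. The snippet below shows the shape of such a configuration; `distributed_executor_backend="mp"` is the real vLLM setting named above, while the model path and parallelism values are placeholders.

```python
# Illustrative engine configuration selecting vLLM's MultiprocExecutor.
# "distributed_executor_backend" and the value "mp" are real vLLM settings;
# the model path and parallel sizes below are placeholder values.
engine_args = {
    "model": "Qwen/Qwen3-8B",              # placeholder model path
    "tensor_parallel_size": 2,             # placeholder parallelism
    "distributed_executor_backend": "mp",  # use MultiprocExecutor, not a custom executor
}
print(engine_args["distributed_executor_backend"])
```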

Convergence test

  1. GPU
  • model: Qwen3-VL-30B-A3B-Instruct
  • dataset: geo3k
  • GPU: 4*8 H100
![image](https://github.com/user-attachments/assets/6e3e7dbd-03f9-471a-b8d5-bc0344dba299)
  2. NPU
  • model: Qwen3-8B
  • dataset: gsm8k
  • NPU: 1*8 910C
  • Ascend HDK: 25.3.RC1.2
  • CANN: 8.3.RC2
  • vLLM-Ascend: 0.13.0rc1
  • training backend: FSDP
[figure: NPU convergence curve]

Performance test: update weights

  • CUDA IPC bucket_size: 2GB
  • GPU: H100, ConnectX-7 400 Gbps (InfiniBand)
| Model | Parallelism | #GPU | Time |
|---|---|---|---|
| Qwen3-VL-30B-A3B-Instruct | TP2, EP8 | 4*8 | 5s |
| DeepSeek-V3.1-Terminus | TP8, PP16, EP8 | 16*8 | 120s |
| DeepSeek-V3.1-Terminus | TP16, PP16 | 32*8 | 80s |
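The 2 GB `bucket_size` above suggests parameters are packed into fixed-size buckets before each IPC transfer. A minimal greedy-packing sketch (an assumption about the bucketing strategy, not verl's implementation):

```python
# Greedy packing of named parameters into ~bucket_bytes groups, as one
# plausible way a 2 GB CUDA-IPC bucket_size could be applied.
def bucket_params(sizes, bucket_bytes=2 * 1024**3):
    """sizes: list of (name, nbytes). Returns lists of names per bucket."""
    buckets, cur, cur_bytes = [], [], 0
    for name, nbytes in sizes:
        # Start a new bucket when adding this tensor would overflow the limit.
        if cur and cur_bytes + nbytes > bucket_bytes:
            buckets.append(cur)
            cur, cur_bytes = [], 0
        cur.append(name)
        cur_bytes += nbytes
    if cur:
        buckets.append(cur)
    return buckets


gb = 1024**3
params = [("a", 1 * gb), ("b", 1 * gb), ("c", 2 * gb), ("d", gb // 2)]
print(bucket_params(params))  # → [['a', 'b'], ['c'], ['d']]
```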

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

  • Read the Contribute Guide.
  • Apply pre-commit checks: `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always`
  • Add / Update the documentation.
  • Add unit or end-to-end test(s) to the CI workflow to cover all the code. If not feasible, explain why: ...
  • Once your PR is ready for CI, send a message in the `ci-request` channel in the `verl` Slack workspace.

@jianjunzhong jianjunzhong force-pushed the refactor/vllm_sep_proc branch from 51c8ad9 to 714a32f Compare November 27, 2025 14:59
@jianjunzhong jianjunzhong changed the title [BREAKING][worker, rollout, vllm] feat: implement vLLM co-located training-inference rollout with process separation [WIP][BREAKING][worker, rollout, vllm] feat: implement vLLM co-located training-inference rollout with process separation Nov 28, 2025
@jianjunzhong jianjunzhong force-pushed the refactor/vllm_sep_proc branch from ba4512b to ca088a2 Compare December 7, 2025 14:44
```python
lora_path=VLLM_LORA_PATH,
peft_config=asdict(peft_config),
lora_tensors=weights,
# build cuda ipc buffer
```
Collaborator review comment: Hi, thank you for the contribution! Just one comment: could you abstract the ZMQ + IPC communication channel out, and add some corresponding unit tests, please?

@wuxibin89 wuxibin89 merged commit e9405d7 into verl-project:main Jan 23, 2026
1 check passed
sophiayyya pushed a commit to sophiayyya/verl that referenced this pull request Jan 25, 2026
meichangsu1 pushed a commit to meichangsu1/verl that referenced this pull request Jan 27, 2026
vermouth1992 pushed a commit that referenced this pull request Jan 27, 2026
### What does this PR do?

The #4280 vLLM refactor broke `one-step-off-policy` and `fully-async`. This PR introduces CheckpointEngineManager to coordinate weight synchronization between trainer and rollout replicas.

A follow-up PR will refactor `one-step-off-policy` and `fully-async` with CheckpointEngineManager.

Design doc: https://github.com/volcengine/verl/tree/main/verl/checkpoint_engine
meichangsu1 pushed a commit to meichangsu1/verl that referenced this pull request Jan 27, 2026
wuxibin89 pushed a commit that referenced this pull request Jan 30, 2026
…pport checks (#5089)

### What does this PR do?

To address older NPU drivers not supporting weight updates via IPC in #4280, this PR adds shared-memory support for weight updates.

RobotGF pushed a commit to RobotGF/verl that referenced this pull request Jan 30, 2026
vermouth1992 pushed a commit that referenced this pull request Jan 31, 2026
### What does this PR do?

Revert the default value of vLLM `max_num_seqs` changed in #4280, which may affect throughput.
wuxibin89 pushed a commit that referenced this pull request Feb 2, 2026
…caused by multiple PRs (#5100)

### What does this PR do?

> Add **concise** overview of what this PR aims to achieve or
accomplish. Reference related GitHub issues and PRs that help with the
review.

**Problem 1: `ConfigAttributeError('Missing key config\n full_key: config\n object_type=dict')`**
This error was introduced by this
[PR](#5034): the `dataset_config`
type was changed from `DictConfig` to `DictConfigWrap` in
`AgentLoopBase` initialization (which now reads `dataset_config.config`),
but the fully async agent loop was not updated to pass a
`DictConfigWrap`, causing the error.
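The failure above reduces to a wrapper-type mismatch between a base class and one of its call sites. A minimal illustrative reproduction (class names mirror the report; the bodies are invented, not verl's implementation):

```python
class DictConfigWrap:
    """Hypothetical stand-in for verl's config wrapper."""

    def __init__(self, config):
        self.config = config


class AgentLoopBase:
    def __init__(self, dataset_config):
        # After the wrapper change this reads .config, so a raw dict-like
        # config (what the fully async agent loop still passed) fails here.
        self.dataset_config = dataset_config.config


# Fixed call site: wrap the config before passing it.
loop = AgentLoopBase(dataset_config=DictConfigWrap({"path": "data.parquet"}))
```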

The following two problems were introduced by this
[PR](#4280):

**Problem 2: `TypeError: got an unexpected keyword argument 'cuda_visible_devices'`**
The PR added a `cuda_visible_devices` argument to `vLLMHttpServer`, but its
subclass `vLLMHttpServerForPartial` in the fully async path was not updated
accordingly, causing the conflict.
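One way to keep such a subclass compatible when the base class grows new keyword arguments is to accept and forward `**kwargs`; a minimal sketch (class names mirror the report, bodies are illustrative):

```python
class vLLMHttpServer:
    def __init__(self, config, cuda_visible_devices=None):  # kwarg added by the PR
        self.config = config
        self.cuda_visible_devices = cuda_visible_devices


class vLLMHttpServerForPartial(vLLMHttpServer):
    def __init__(self, config, **kwargs):
        # Forwarding **kwargs means new base-class keyword arguments are
        # accepted without editing every subclass signature.
        super().__init__(config, **kwargs)
```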

**Problem 3: `KeyError: 'ASCEND_RT_VISIBLE_DEVICES'`**
The PR reads the environment variable `ASCEND_RT_VISIBLE_DEVICES`
in `get_device_uuid` without handling its absence or providing a default
value, which can raise a `KeyError`.
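A defensive lookup of the sort Problem 3 calls for might look like this (a sketch; the function name and fallback are illustrative, not verl's actual `get_device_uuid`):

```python
import os


def get_ascend_visible_devices(num_devices=8):
    """Return ASCEND_RT_VISIBLE_DEVICES, defaulting to all device indices."""
    # os.environ[...] raises KeyError when the variable is absent;
    # .get() with an explicit default avoids that failure mode.
    default = ",".join(str(i) for i in range(num_devices))
    return os.environ.get("ASCEND_RT_VISIBLE_DEVICES", default)
```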

amzfang pushed a commit to amzfang/verl that referenced this pull request Feb 3, 2026
…caused by multiple PRs (verl-project#5100)

ArronHZG added a commit that referenced this pull request Feb 6, 2026
…ly async / one step off) (#5184)

### What does this PR do?

* Add a new Ray Trainer class to facilitate reusing the core logic.
* Fix the fully async / one-step-off CI.
* Currently, our parameter synchronization logic is still in a broken
state.

The CI was broken by #4280.


Tjh-UKN pushed a commit to Tjh-UKN/verl that referenced this pull request Feb 13, 2026
…caused by multiple PRs (verl-project#5100)

Tjh-UKN pushed a commit to Tjh-UKN/verl that referenced this pull request Feb 13, 2026
…ly async / one step off) (verl-project#5184)
