[rollout] fix: delete problematic assert for max_tokens <= response_length in multi-turn scenario by PeterSH6 · Pull Request #4668 · verl-project/verl

PeterSH6 · 2025-12-25T08:29:55Z

What does this PR do?

In multi-turn conversations, prompt tokens accumulate across turns, so the length of prompts_id can exceed the originally configured max_prompt_length. When that happens, the current assertion fails.

This PR addresses the immediate assertion failure. Note that the max_tokens and max_model_len assignments should be revisited in a future PR to properly handle dynamic prompt length requirements in multi-turn scenarios.
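The failure mode described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the remaining generation budget is a fixed context limit minus the current prompt length, and that the server checks `max_tokens <= budget` before each request. The function name `first_failing_turn` and all numbers are hypothetical and are not taken from the verl code.

```python
from typing import Optional


def first_failing_turn(
    initial_prompt_len: int,
    tokens_added_per_turn: int,
    max_model_len: int,
    max_tokens: int,
    num_turns: int,
) -> Optional[int]:
    """Return the first turn whose request would violate `max_tokens <= budget`.

    Hypothetical model: budget = max_model_len - current prompt length.
    Returns None if every turn fits.
    """
    prompt_len = initial_prompt_len
    for turn in range(num_turns):
        budget = max_model_len - prompt_len
        if max_tokens > budget:
            # An `assert max_tokens <= budget` would raise at this point.
            return turn
        # The next turn's prompt also carries this turn's response and tool output.
        prompt_len += tokens_added_per_turn
    return None


# Illustrative limits: 1024 prompt tokens plus 512 response tokens.
failing = first_failing_turn(
    initial_prompt_len=600,
    tokens_added_per_turn=400,
    max_model_len=1024 + 512,
    max_tokens=512,
    num_turns=5,
)
print(failing)
```

With these numbers the prompt reaches 1,400 tokens at turn index 2, which is past the 1,024-token prompt limit and leaves a budget of only 136 tokens, so a fixed request of 512 tokens no longer passes the check.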

Checklist Before Starting

Search for similar PRs. Paste at least one query link here: ...
Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
- {modules} include fsdp, megatron, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data, cfg, reward
- If this PR involves multiple modules, separate them with , like [megatron, fsdp, doc]
- {type} is in feat, fix, refactor, chore, test
- If this PR breaks any API (CLI arguments, config, function signature, etc.), add [BREAKING] to the beginning of the title.
- Example: [BREAKING][fsdp, megatron] feat: dynamic batching

Test

For changes that cannot be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

API and Usage Example

Demonstrate how the API changes if any, and provide usage example(s) if possible.

# Add code snippet or script demonstrating how to use this

Design & Code Changes

Demonstrate the high-level design if this PR is complex, and list the specific changes.

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

Read the Contribute Guide.
Apply pre-commit checks: pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always
Add / Update the documentation.
Add unit or end-to-end test(s) to the CI workflow to cover all the code. If not feasible, explain why: ...
Once your PR is ready for CI, send a message in the ci-request channel in the verl Slack workspace. (If not accessible, please try the Feishu group (飞书群).)


gemini-code-assist

Code Review

This pull request fixes an assertion failure in multi-turn scenarios by removing the assertion that checks if max_tokens is within the calculated response_length. While this solves the immediate crash, it introduces a risk of passing invalid max_tokens values to the backend generation engines, which could lead to other failures. My review suggests a more robust fix by replacing the removed assertions with logic that caps the max_tokens value to a safe, non-negative range. This ensures stability while still addressing the original problem.

I am having trouble creating individual review comments, so my feedback is collected below.

verl/workers/rollout/sglang_rollout/async_sglang_server.py (274-276)

Simply removing the assertion is risky. If a user provides max_new_tokens larger than the available response_length, it could cause the backend generation engine to fail, especially since response_length can be negative if the prompt is too long.

A safer approach is to cap max_new_tokens to the valid range (i.e., between 0 and response_length). This handles both cases: when max_new_tokens is user-provided and might be too large, and when response_length itself is negative due to a long prompt.

        max_new_tokens = max(0, min(max_new_tokens, response_length))

verl/workers/rollout/vllm_rollout/vllm_async_server.py (473-475)

Simply removing the assertion is risky. If a user provides max_tokens larger than the available response_length, it could cause the backend vLLM engine to fail, especially since response_length can be negative if the prompt is too long.

A safer approach is to cap max_tokens to the valid range (i.e., between 0 and response_length). This handles both cases: when max_tokens is user-provided and might be too large, and when response_length itself is negative due to a long prompt.

        max_tokens = max(0, min(max_tokens, response_length))
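The capping expression suggested in the two review comments can be tried in isolation. The sketch below wraps it in a hypothetical helper, `clamp_max_tokens`, which is not part of verl, and runs it on three illustrative cases: a budget larger than the request, a budget smaller than the request, and a negative budget caused by an over-long prompt.

```python
def clamp_max_tokens(requested: int, response_length: int) -> int:
    """Cap the requested token count to the range [0, response_length].

    A negative `response_length` (prompt longer than the limit) yields 0.
    """
    return max(0, min(requested, response_length))


# (requested tokens, available response_length); values are illustrative only.
cases = [(512, 936), (512, 136), (512, -64)]
results = [clamp_max_tokens(req, avail) for req, avail in cases]
print(results)
```

The three cases come out as 512, 136, and 0. Whether a given backend accepts a token limit of 0 is not established in this thread, so a caller might prefer to skip the generation call entirely when the cap reaches 0.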


be6cdbc: [rollout] fix: delete problematic assert for max_tokens <= response_length in multi-turn scenario

PeterSH6 requested review from tongyx361 and vermouth1992 December 25, 2025 08:29

PeterSH6 requested review from SwordFaith, chenhaiq, wuxibin89 and zhaochenyang20 as code owners December 25, 2025 08:29

gemini-code-assist bot reviewed Dec 25, 2025

View reviewed changes

tongyx361 approved these changes Dec 25, 2025

View reviewed changes

PeterSH6 merged commit 73be449 into verl-project:main Dec 25, 2025
85 of 93 checks passed

PeterSH6 deleted the gm/fix_assert branch December 25, 2025 14:22

vermouth1992 mentioned this pull request Dec 27, 2025

Revert "[rollout] fix: delete problematic assert for max_tokens <= response_length in multi-turn scenario" #4687

Merged

PeterSH6 pushed a commit that referenced this pull request Dec 27, 2025

d1c2d3f: Revert "[rollout] fix: delete problematic assert for max_tokens <= response_length in multi-turn scenario" (#4687). Reverts #4668

boren-ms pushed a commit to boren-ms/verl that referenced this pull request Dec 30, 2025

6713d93: Revert "[rollout] fix: delete problematic assert for max_tokens <= response_length in multi-turn scenario" (verl-project#4687). Reverts verl-project#4668

jsfanfanfan pushed a commit to meituan-search/verl that referenced this pull request Jan 9, 2026

eda4b86: Revert "[rollout] fix: delete problematic assert for max_tokens <= response_length in multi-turn scenario" (verl-project#4687). Reverts verl-project#4668

vyomakesh0728 added a commit to vyomakesh0728/verl that referenced this pull request Jan 22, 2026

889474e: Revert "[rollout] fix: delete problematic assert for max_tokens <= response_length in multi-turn scenario" (verl-project#4687). Reverts verl-project#4668

sophiayyya pushed a commit to sophiayyya/verl that referenced this pull request Jan 25, 2026

176fcb3: Revert "[rollout] fix: delete problematic assert for max_tokens <= response_length in multi-turn scenario" (verl-project#4687). Reverts verl-project#4668

y-a23 pushed a commit to y-a23/query that referenced this pull request Feb 5, 2026

6d8c843: Revert "[rollout] fix: delete problematic assert for max_tokens <= response_length in multi-turn scenario" (#4687). Reverts verl-project/verl#4668

KimperYang pushed a commit to KimperYang/TauVerl that referenced this pull request Mar 3, 2026

c18a4e8: Revert "[rollout] fix: delete problematic assert for max_tokens <= response_length in multi-turn scenario" (#4687). Reverts verl-project/verl#4668
