[rollout] feat: partial rollout part1 - rollout batch dynamic adjustment #2101
Closed
chenhaiq wants to merge 4 commits into verl-project:main
Fixed regression from: - verl-project#1668 - verl-project#1933 Added e2e test for both sglang and vllm async mode test
Moved partial rollout with streaming to #2200.
Partial rollout needs the ability to postpone unfinished samples to later batches and to prefetch samples from later batches.
This PR makes DataProto adjustable between batches and provides an extension point for anyone who wants a more advanced policy for adjusting the batch. The main idea is to turn DataProto into a linked list, so that the chat scheduler can access data from later batches and move samples between them.
For example:
Postponing unfinished samples to the next batch is the most straightforward way to achieve partial rollout.
However, some may prefer to move those unfinished samples to the last batch in the epoch instead.
The new LinkedDataProto class provides methods that can be overridden in a subclass for customization.
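The actual method names are not listed here, so the following is only a minimal sketch of how such an extension point could look. `LinkedBatch`, `select_target`, `postpone`, `PostponeToLastBatch`, and `link` are hypothetical names chosen for illustration; they are not the LinkedDataProto API, and plain string lists stand in for real sample data.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LinkedBatch:
    """Hypothetical stand-in for a batch node; not verl's LinkedDataProto."""

    samples: list[str] = field(default_factory=list)
    next: Optional["LinkedBatch"] = None

    def select_target(self) -> Optional["LinkedBatch"]:
        """Policy hook: choose the batch that receives unfinished samples."""
        return self.next  # default policy: the immediately following batch

    def postpone(self, unfinished: list[str]) -> None:
        """Move unfinished samples out of this batch into the target batch."""
        target = self.select_target()
        if target is None:
            return  # last batch in the epoch: nowhere to postpone to
        pending = set(unfinished)
        moved = [s for s in self.samples if s in pending]
        self.samples = [s for s in self.samples if s not in pending]
        target.samples.extend(moved)


class PostponeToLastBatch(LinkedBatch):
    """Alternative policy: send unfinished samples to the final batch."""

    def select_target(self) -> Optional[LinkedBatch]:
        node = self.next
        # Walk forward until the node with no successor is reached.
        while node is not None and node.next is not None:
            node = node.next
        return node


def link(batches: list[LinkedBatch]) -> LinkedBatch:
    """Chain batches in order and return the head of the list."""
    for current, following in zip(batches, batches[1:]):
        current.next = following
    return batches[0]
```

In this sketch only `select_target` differs between the two policies, so a scheduler could call `postpone` on any batch without knowing which policy is in use.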
Note: this PR does not change any behavior. It will work together with the part2 PR in the chat scheduler to enable partial rollout.
This PR is submitted now to gather feedback before the part2 PR is completed.
Test
tests/test_protocol_on_cpu.py