[fully_async] fix: replace routed_experts on partial rollout resume i…#6029
Merged
wuxibin89 merged 1 commit into verl-project:main on Apr 17, 2026
Code Review
This pull request updates the generate function in agent_loop.py to simplify the handling of routed_experts. The logic was changed from concatenating expert routing data to directly assigning the latest output, as the underlying engine now provides the routing information for the full sequence. I have no feedback to provide.
Collaborator
|
Please use |
…nstead of concatenating

sglang returns routed_experts for the full sequence, including both input and output tokens (see the io_struct.py comment on the routed_experts field). When partial rollout resumes, the new call's input is prompt + already generated tokens, so sglang returns routed_experts covering all positions.

The old code concatenated the old and new routed_experts via torch.cat, which duplicated routing for the prompt and the previously generated tokens. This caused incorrect MoE expert replay in the actor and ppo_kl spikes.

Fix: simply replace routed_experts, since the resumed call's output already covers the entire sequence.
788affd to 18d467f
wuxibin89 reviewed Apr 16, 2026
wuxibin89 requested changes Apr 16, 2026
wuxibin89 approved these changes Apr 17, 2026
What does this PR do?
Fixes a bug in `FullyAsyncLLMServerManager.generate()` where `routed_experts` was incorrectly concatenated via `torch.cat` during partial rollout resume, causing duplicated routing data and broken MoE expert replay in the actor.

sglang returns `routed_experts` for the full sequence (prompt + all generated tokens). Evidence from sglang source:

- `io_struct.py#L1020`: field definition
- `schedule_batch.py`: the `seqlen` used to collect routing covers the full sequence
- `topk.py#L1049-1051`: capture is unconditional (no prefill/decode check)
- `scheduler_output_processor_mixin.py#L105-111`: collection uses the full `seqlen`

When partial rollout resumes after abort, the input becomes `prompt + already_generated_tokens`. sglang re-processes the entire input during prefill and returns `routed_experts` covering all positions. The old code concatenated this with the previous `routed_experts`. This shifted the routing and caused incorrect MoE expert replay, leading to `actor/ppo_kl` spikes.

Fix: replace `routed_experts` instead of concatenating, since the resumed call already covers all positions.

Related: #4348 (partial rollout RFC), #4101 (R3 router replay), #5344 (R3 in fully async)
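The duplication described above can be reproduced with a toy model. This is a minimal sketch, not sglang or verl code: `fake_engine_generate` and `fake_route` are hypothetical stand-ins, plain Python lists play the role of tensors, and list `+` plays the role of `torch.cat`. The only behaviour borrowed from the PR is the assumption that the engine returns routing for every position of input plus output.

```python
# Toy model of partial rollout resume. fake_route and fake_engine_generate are
# hypothetical helpers for illustration; they are not part of sglang or verl.

NUM_LAYERS, TOP_K = 2, 2


def fake_route(position):
    """Deterministic stand-in for the router: one [layer][k] entry per position."""
    return [[(position + layer + k) % 8 for k in range(TOP_K)]
            for layer in range(NUM_LAYERS)]


def fake_engine_generate(input_ids, max_new_tokens):
    """Return (new_tokens, routed_experts).

    Assumption taken from the PR text: routed_experts covers the full
    sequence, i.e. every input position plus every newly generated position.
    """
    new_tokens = [100 + len(input_ids) + i for i in range(max_new_tokens)]
    total_len = len(input_ids) + len(new_tokens)
    routed = [fake_route(pos) for pos in range(total_len)]
    return new_tokens, routed


prompt = [1, 2, 3]

# First call: aborted after 2 generated tokens.
first_tokens, first_routed = fake_engine_generate(prompt, max_new_tokens=2)

# Resume: the input is now prompt + already generated tokens.
resumed_tokens, resumed_routed = fake_engine_generate(
    prompt + first_tokens, max_new_tokens=3
)

response = first_tokens + resumed_tokens
expected_len = len(prompt) + len(response)  # 3 + 5 = 8

# Old behaviour: concatenate (list analogue of torch.cat along dim 0).
concatenated = first_routed + resumed_routed
# Fixed behaviour: keep only the resumed call's routing.
replaced = resumed_routed

print(len(concatenated), len(replaced), expected_len)  # 13 8 8
```

In this toy run the concatenated result has 13 rows for an 8-token sequence, and row 5 holds the routing for position 0 rather than position 5, which is the kind of shift the PR attributes the `actor/ppo_kl` spikes to. The replaced result lines up with positions 0 through 7.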
Checklist Before Starting
[{modules}] {type}: {description}

Test

- `partial_rollout=True` and `enable_rollout_routing_replay=True` (R3 mode)
- `actor/ppo_kl` no longer spikes after partial rollout resume
- `routed_experts` tensor shape matches `(prompt_len + response_len, num_layers, top_k)` after resume

Design & Code Changes

Single-line change in `verl/experimental/fully_async_policy/agent_loop/agent_loop.py`.

Checklist Before Submitting

- `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always`
- `ci-request` channel.
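The shape condition listed under Test can be turned into a small guard. This is a sketch under assumptions: `check_routed_experts_shape` is a hypothetical helper that is not part of verl, and it takes a plain shape tuple so it works with `tensor.shape` from any array library.

```python
def check_routed_experts_shape(shape, prompt_len, response_len, num_layers, top_k):
    """Raise ValueError unless shape == (prompt_len + response_len, num_layers, top_k).

    Hypothetical helper for illustration; pass tuple(routed_experts.shape).
    """
    expected = (prompt_len + response_len, num_layers, top_k)
    actual = tuple(shape)
    if actual != expected:
        raise ValueError(
            f"routed_experts shape {actual} does not match expected {expected}"
        )
    return expected


# Example values (made up): 3 prompt tokens, 5 response tokens, 2 layers, top-2.
ok_shape = check_routed_experts_shape((8, 2, 2), 3, 5, 2, 2)

# A concatenated result after one resume would carry extra leading rows,
# e.g. (3 + 2) + (3 + 5) = 13, and would be rejected.
try:
    check_routed_experts_shape((13, 2, 2), 3, 5, 2, 2)
    rejected = False
except ValueError:
    rejected = True
```

A check like this only catches length mismatches; it does not verify that each row holds the routing for the right position.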