fix(trainer): supplement dfed770 by adding missing update_weights in …#469
Merged
kylemontgomery1 merged 2 commits into rllm-org:main on Apr 4, 2026
…sdk trainer to fix vllm engine weight loss and Ascend PositionEmbedding OOB error
Collaborator
@MarkJoson Can you remove the dashboard code and just leave the changes
Contributor
Author
Sorry about that. I accidentally committed and pushed to the wrong branch, and it included some dashboard-related changes. I'll clean this up and update the PR so it only contains the changes to rllm/trainer/verl/agent_sdk_trainer.py.
Force-pushed from d6b90b4 to 5b54789
Contributor
Author
I removed the dashboard changes and added two extra fixes to round out the original change. The PR now only touches rllm/trainer/verl/agent_sdk_trainer.py.
Collaborator
Thanks!
Summary
🐛 Bug Description
This PR supplements last week's commit by Star Li (dfed770). In agent_sdk_trainer, weights were not synchronized to the vLLM rollout engine after the initial checkpoint load, so the rollout engine lost its model weights at startup. On the Ascend platform, this surfaced at the operator level as an out-of-bounds (OOB) error in the PositionEmbedding computation.
🛠️ Fix Implemented
Added an explicit call to self.checkpoint_manager.update_weights() immediately after self._load_checkpoint() during initialization in rllm/trainer/verl/agent_sdk_trainer.py. This ensures the rollout engine receives the latest model weights before the initial val_before_train pass and the subsequent trajectory generation steps.
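The ordering described above can be illustrated with a minimal, self-contained sketch. Only `_load_checkpoint()` and `checkpoint_manager.update_weights()` are names taken from this PR; `StubTrainer`, `StubCheckpointManager`, `_validate()`, `init_before_training()`, and the `sync_weights` flag are hypothetical stand-ins and do not reflect the actual trainer code, which may differ (for example, the real call may be asynchronous).

```python
class StubCheckpointManager:
    """Hypothetical stand-in; only update_weights() is named in the PR."""

    def __init__(self, trainer):
        self.trainer = trainer

    def update_weights(self):
        # Copy the trainer's weights into the (simulated) rollout engine.
        self.trainer.calls.append("update_weights")
        self.trainer.rollout_weights = dict(self.trainer.weights)


class StubTrainer:
    """Hypothetical trainer that records the order of initialization steps."""

    def __init__(self):
        self.calls = []
        self.weights = None          # weights held by the trainer
        self.rollout_weights = None  # weights held by the rollout engine
        self.checkpoint_manager = StubCheckpointManager(self)

    def _load_checkpoint(self):
        # Simulated checkpoint load: only the trainer side gets weights.
        self.calls.append("load_checkpoint")
        self.weights = {"global_step": 100}

    def _validate(self):
        # Simulated validation: fails if the rollout engine was never synced.
        self.calls.append("validate")
        if self.rollout_weights is None:
            raise RuntimeError("rollout engine has no weights")
        return self.rollout_weights["global_step"]

    def init_before_training(self, sync_weights=True):
        self._load_checkpoint()
        if sync_weights:
            # The step this PR adds: sync right after the checkpoint load.
            self.checkpoint_manager.update_weights()
        return self._validate()
```

Calling `StubTrainer().init_before_training(sync_weights=False)` reproduces the failure mode in miniature: validation runs against a rollout engine that never received weights.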
🔗 Related
Follows up on commit: dfed770
Type of change
What changed
rllm/trainer/verl/agent_sdk_trainer.py