feat: GSPO #859

Merged
terrykong merged 32 commits into NVIDIA-NeMo:main from pjin-nvidia:pjin/gspo-algo
Aug 23, 2025

Conversation

@pjin-nvidia
Contributor

@pjin-nvidia pjin-nvidia commented Aug 6, 2025

What does this PR do?

This PR adds GSPO-style sequence-level importance ratios to the existing GRPO trainer.
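
For readers unfamiliar with the term, a sequence-level importance ratio can be sketched as below. This is a minimal illustration under stated assumptions, not the code from this PR: it assumes the length-normalized form usually associated with GSPO, where the ratio for a whole response is the exponential of the mean per-token log-probability difference between the current and the old policy, and the clipped surrogate is then applied once per sequence rather than once per token. The function names, the `mask` argument, and the clipping defaults are hypothetical and chosen only for the example.

```python
import math


def sequence_importance_ratio(new_logprobs, old_logprobs, mask):
    """Length-normalized sequence-level ratio (illustrative sketch).

    Computes exp(mean_t(logp_new[t] - logp_old[t])) over the tokens where
    mask[t] is truthy (for example, response tokens only).
    """
    total = 0.0
    count = 0
    for new_lp, old_lp, keep in zip(new_logprobs, old_logprobs, mask):
        if keep:
            total += new_lp - old_lp
            count += 1
    if count == 0:
        # No valid tokens: fall back to a neutral ratio (an assumption).
        return 1.0
    return math.exp(total / count)


def clipped_sequence_objective(ratio, advantage, eps_low=0.2, eps_high=0.2):
    """PPO-style clipped surrogate applied to one sequence-level ratio.

    eps_low and eps_high are placeholder values, not tuned settings.
    """
    clipped = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
    return min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    # Hypothetical per-token log-probabilities for one sampled response;
    # the last position is masked out (e.g. padding).
    new_lp = [-0.9, -1.7, -5.0]
    old_lp = [-1.0, -2.0, -1.0]
    mask = [1, 1, 0]
    ratio = sequence_importance_ratio(new_lp, old_lp, mask)
    print(ratio, clipped_sequence_objective(ratio, advantage=1.0))
```

By contrast, a token-level scheme would compute one ratio per token; here every token in a response shares the same ratio, so a single outlier token has less influence on the clipping decision.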

Issues

List issues that this PR closes (syntax):

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

Training reward for an example deepscaler run (swapping GRPO for GSPO, with no other hyperparameter tuning):

[Figures: eight training-reward plots from the example run, gspo-part1-out through gspo-part8-out]

@pjin-nvidia pjin-nvidia marked this pull request as ready for review August 7, 2025 23:36
@ashors1
Contributor

ashors1 commented Aug 11, 2025

LGTM, thank you! Before merging, would it be possible to add some convergence plots to the PR description?

@pjin-nvidia
Contributor Author

Thanks @ashors1! I added the training reward from a deepscaler example run.

@ashors1
Contributor

ashors1 commented Aug 12, 2025

Sorry, could you label the plots as well? It's not clear to me what is being shown in each.

@pjin-nvidia
Contributor Author

Added labels/titles to the example run convergence plots.

@pjin-nvidia pjin-nvidia requested a review from ashors1 August 12, 2025 20:45
@pjin-nvidia pjin-nvidia requested a review from ashors1 August 14, 2025 21:26
ashors1 previously approved these changes Aug 15, 2025
@ashors1
Contributor

ashors1 commented Aug 15, 2025

@terrykong if the PR looks good to you, can we merge?

ashors1 previously approved these changes Aug 15, 2025
ertkonuk and others added 11 commits August 19, 2025 14:41
Signed-off-by: Tugrul Konuk <tkonuk@nvidia.com>
Signed-off-by: Peter Jin <pjin@nvidia.com>
ashors1 previously approved these changes Aug 20, 2025
Signed-off-by: Peter Jin <pjin@nvidia.com>
@pjin-nvidia pjin-nvidia requested a review from ashors1 August 21, 2025 00:26
Signed-off-by: Peter Jin <pjin@nvidia.com>
Contributor

@ashors1 ashors1 left a comment

LGTM, thanks for the unit tests!

@terrykong terrykong enabled auto-merge August 23, 2025 00:02
@terrykong terrykong added this pull request to the merge queue Aug 23, 2025
Merged via the queue into NVIDIA-NeMo:main with commit add4efb Aug 23, 2025
19 checks passed
terrykong added a commit that referenced this pull request Aug 24, 2025
This reverts commit add4efb.
chtruong814 added a commit that referenced this pull request Aug 24, 2025
This reverts commit add4efb.

Signed-off-by: Charlie Truong <chtruong@nvidia.com>
@terrykong terrykong mentioned this pull request Aug 24, 2025
terrykong added a commit that referenced this pull request Aug 24, 2025
This reverts commit add4efb.

Signed-off-by: Terry Kong <terryk@nvidia.com>
jveronvialard pushed a commit that referenced this pull request Aug 27, 2025
Signed-off-by: Tugrul Konuk <tkonuk@nvidia.com>
Signed-off-by: Peter Jin <pjin@nvidia.com>
Co-authored-by: Tugrul Konuk <tkonuk@nvidia.com>
Signed-off-by: Julien Veron Vialard <jveronvialar@nvidia.com>
soodoshll pushed a commit to soodoshll/RL that referenced this pull request Aug 28, 2025
Signed-off-by: Tugrul Konuk <tkonuk@nvidia.com>
Signed-off-by: Peter Jin <pjin@nvidia.com>
Co-authored-by: Tugrul Konuk <tkonuk@nvidia.com>
Signed-off-by: Qidong Su <qidongs@nvidia.com>
skirdey-inflection pushed a commit to skirdey-inflection/RL that referenced this pull request Aug 30, 2025
Signed-off-by: Tugrul Konuk <tkonuk@nvidia.com>
Signed-off-by: Peter Jin <pjin@nvidia.com>
Co-authored-by: Tugrul Konuk <tkonuk@nvidia.com>
Signed-off-by: Stanislav Kirdey <stan@inflection.ai>
soodoshll pushed a commit to soodoshll/RL that referenced this pull request Sep 4, 2025
Signed-off-by: Tugrul Konuk <tkonuk@nvidia.com>
Signed-off-by: Peter Jin <pjin@nvidia.com>
Co-authored-by: Tugrul Konuk <tkonuk@nvidia.com>
Signed-off-by: Qidong Su <qidongs@nvidia.com>
PrinsYin pushed a commit to PrinsYin/RL that referenced this pull request Nov 30, 2025
Signed-off-by: Tugrul Konuk <tkonuk@nvidia.com>
Signed-off-by: Peter Jin <pjin@nvidia.com>
Co-authored-by: Tugrul Konuk <tkonuk@nvidia.com>

5 participants