Skip to content

chore: fix grpo functional test metric#1643

Merged
terrykong merged 1 commit intomainfrom
ruit/fix_grpo_functional_test
Dec 16, 2025
Merged

chore: fix grpo functional test metric#1643
terrykong merged 1 commit intomainfrom
ruit/fix_grpo_functional_test

Conversation

@RayenTian
Copy link
Contributor

@RayenTian RayenTian commented Dec 16, 2025

What does this PR do ?

Reduce the functional test acceptance threshold for train/gen_kl_error from 1.05 to 0.001 to catch subtle regressions in generation logprob/KL alignment early.

Related Context

Link: #1506 (comment)
gen_kl_error should actually be pretty low, usually we set it to be <1e-3.

Summary by CodeRabbit

  • Tests
    • Updated validation thresholds for training metrics to enforce stricter quality standards during testing procedures.

✏️ Tip: You can customize this high-level summary in your review settings.

Signed-off-by: ruit <ruit@nvidia.com>
@RayenTian RayenTian requested a review from yuki-97 December 16, 2025 11:26
@RayenTian RayenTian marked this pull request as ready for review December 16, 2025 11:31
@RayenTian RayenTian requested a review from a team as a code owner December 16, 2025 11:31
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 16, 2025

📝 Walkthrough

Walkthrough

A test threshold expectation is adjusted in the GRPO test suite, reducing the maximum tolerance for the train/gen_kl_error metric from 1.05 to 0.001, enforcing stricter validation criteria during test execution.

Changes

Cohort / File(s) Summary
Test threshold adjustment
tests/functional/grpo.sh, tests/check_metrics.py
Reduced max(data["train/gen_kl_error"]) threshold from 1.05 to 0.001 for stricter metric validation

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

  • Simple numeric threshold modification with no logic or control flow changes
  • Verify that the new threshold is intentional and achievable under test conditions

Suggested labels

CI:L1

Suggested reviewers

  • yuki-97
  • joyang-nv

Pre-merge checks and finishing touches

✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'chore: fix grpo functional test metric' directly describes the main change: adjusting the functional test acceptance threshold for a GRPO metric.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Test Results For Major Changes ✅ Passed Changes are minor test threshold adjustments in functional test suite, not production code modifications. No new features, breaking changes, or significant refactoring introduced.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch ruit/fix_grpo_functional_test

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@RayenTian RayenTian added the CI:L1 Run doctests, unit tests, and functional tests label Dec 16, 2025
@yuki-97 yuki-97 requested a review from terrykong December 16, 2025 11:46
@terrykong terrykong merged commit 52cebdf into main Dec 16, 2025
48 of 49 checks passed
@terrykong terrykong deleted the ruit/fix_grpo_functional_test branch December 16, 2025 18:42
DeL-TaiseiOzaki pushed a commit to DeL-TaiseiOzaki/RL that referenced this pull request Jan 8, 2026
Signed-off-by: ruit <ruit@nvidia.com>
parthmannan pushed a commit to parthmannan/RL that referenced this pull request Jan 15, 2026
Signed-off-by: ruit <ruit@nvidia.com>
Signed-off-by: Parth Mannan <pmannan@nvidia.com>
yuanhangsu1986 pushed a commit to yuanhangsu1986/RL-Nemontron-Edge-Omni that referenced this pull request Feb 12, 2026
Signed-off-by: ruit <ruit@nvidia.com>
Signed-off-by: yuanhangs <yuanhangs@nvidia.com>
yuanhangsu1986 pushed a commit to yuanhangsu1986/RL-Nemontron-Edge-Omni that referenced this pull request Feb 21, 2026
Signed-off-by: ruit <ruit@nvidia.com>
Signed-off-by: yuanhangs <yuanhangs@nvidia.com>
seonjinn pushed a commit that referenced this pull request Mar 8, 2026
Signed-off-by: ruit <ruit@nvidia.com>
seonjinn pushed a commit that referenced this pull request Mar 8, 2026
Signed-off-by: ruit <ruit@nvidia.com>
seonjinn pushed a commit that referenced this pull request Mar 9, 2026
Signed-off-by: ruit <ruit@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI:L1 Run doctests, unit tests, and functional tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants