Conversation
Signed-off-by: Terry Kong <terryk@nvidia.com>
📝 WalkthroughWalkthroughThis PR introduces a new Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
nemo_rl/algorithms/sft.py (1)
391-475:⚠️ Potential issue | 🟠 MajorFail fast if
val_at_endis enabled without a validation dataloader.Right now
val_at_endcan be true whileval_dataloaderis None, which silently skips validation and defeats the feature (no final val metrics). Add a guard when any validation flag is enabled.🛡️ Suggested guard
val_period = sft_config["val_period"] val_at_start = sft_config["val_at_start"] val_at_end = sft_config["val_at_end"] max_num_epochs = sft_config["max_num_epochs"] + + if (val_period > 0 or val_at_start or val_at_end) and val_dataloader is None: + raise AssertionError( + "Validation is enabled but no validation dataset/dataloader was provided." + )
🤖 Fix all issues with AI agents
In `@nemo_rl/algorithms/distillation.py`:
- Around line 83-85: Update the inline doc for DistillationConfig.val_at_end to
explicitly state the type (bool), allowed values (True/False), and the
recommended default (True) along with the short rationale ("run validation on
final training step so final checkpoint has validation metrics for
get_best_checkpoint_path()"); then update the exemplar YAMLs under
examples/configs/*.yaml to include val_at_end: true so the default is reflected
in examples. Reference: DistillationConfig and the val_at_end key when making
the edits.
In `@nemo_rl/algorithms/grpo.py`:
- Around line 128-130: The docstring for the GRPOConfig TypedDict is missing the
valid types/values and recommended default for the new key val_at_end; update
the inline comment next to val_at_end (in GRPOConfig) to state that it is a
boolean (True/False), explain its purpose (run validation on the final training
step so final checkpoint contains validation metrics used by
get_best_checkpoint_path()), and declare the recommended default (e.g., False or
True as decided); then update the exemplar YAMLs under examples/configs/*.yaml
to include val_at_end with that same default value so examples reflect the
documented default.
In `@nemo_rl/algorithms/sft.py`:
- Around line 73-75: The SFTConfig key val_at_end lacks a documented type/valid
values and a recommended default; update the SFTConfig TypedDict entry for
val_at_end to state it's a boolean (valid values: True/False), describe its
purpose briefly, and specify the recommended default (set default: False unless
your workflows require final-step validation), and then add val_at_end: false to
the exemplar YAMLs under examples/configs/*.yaml so examples reflect the
declared default; edit the inline comment next to val_at_end and the exemplar
YAMLs accordingly (reference symbol: val_at_end in SFTConfig).
🧹 Nitpick comments (3)
nemo_rl/utils/checkpoint.py (1)
251-255: Addstacklevel=2towarnings.warncalls.Both
warnings.warncalls (lines 251 and 258) should includestacklevel=2so the warning points to the caller rather than this internal method.🔧 Proposed fix
warnings.warn( f"Ignoring {ignored_count} checkpoint(s) at step(s) {ignored_steps} that do not have " f"metric '{self.metric_name}'. Consider enabling val_at_end or adjusting val_period " - f"to align with max_steps." + f"to align with max_steps.", + stacklevel=2, )And similarly for line 258:
warnings.warn( f"No checkpoints contain metric '{self.metric_name}'. Returning latest checkpoint. " - f"Consider enabling val_at_end or adjusting val_period to align with max_steps." + f"Consider enabling val_at_end or adjusting val_period to align with max_steps.", + stacklevel=2, )tests/unit/utils/test_checkpoint.py (2)
352-355: Consider removing unusedcheckpoint_dirfixture parameter.The
checkpoint_dirfixture is not directly used in this test. If it's needed to ensure the directory exists forcheckpoint_manager, the dependency is already handled through the fixture chain.🔧 Proposed fix
-def test_get_best_checkpoint_path_no_checkpoints(checkpoint_manager, checkpoint_dir): +def test_get_best_checkpoint_path_no_checkpoints(checkpoint_manager): """Test that get_best_checkpoint_path returns None when no checkpoints exist.""" result = checkpoint_manager.get_best_checkpoint_path() assert result is None
379-381: Consider addingstrict=Truetozip()calls.While the lists are hardcoded and known to be the same length, adding
strict=Trueis a good defensive practice that catches mismatches during test maintenance.🔧 Proposed fix for all three occurrences
Line 379:
- for step, training_info in zip(steps, training_infos): + for step, training_info in zip(steps, training_infos, strict=True):Line 421:
- for step, training_info in zip(steps, training_infos): + for step, training_info in zip(steps, training_infos, strict=True):Line 458:
- for step, acc in zip(steps, accuracies): + for step, acc in zip(steps, accuracies, strict=True):
Signed-off-by: Terry Kong <terryk@nvidia.com> Co-authored-by: Yuki Huang <yukih@nvidia.com> Signed-off-by: yuanhangs <yuanhangs@nvidia.com>
Signed-off-by: Terry Kong <terryk@nvidia.com> Co-authored-by: Yuki Huang <yukih@nvidia.com> Signed-off-by: yuanhangs <yuanhangs@nvidia.com>
Signed-off-by: Terry Kong <terryk@nvidia.com> Co-authored-by: Yuki Huang <yukih@nvidia.com>
Signed-off-by: Terry Kong <terryk@nvidia.com> Co-authored-by: Yuki Huang <yukih@nvidia.com>
Signed-off-by: Terry Kong <terryk@nvidia.com> Co-authored-by: Yuki Huang <yukih@nvidia.com>
What does this PR do ?
Add a one line overview of what this PR aims to accomplish.
Issues
List issues that this PR closes (syntax):
closes #1415
Usage
# Add a code snippet demonstrating how to use thisBefore your PR is "Ready for review"
Pre checks:
Additional Information
Summary by CodeRabbit
New Features
val_at_endconfiguration parameter across training algorithms, enabling validation execution at the end of training runs.Bug Fixes
Tests