Skip to content

cp: fix: relax nanov3 nightly test metrics strict (1712) into r0.5.0#1713

Merged
terrykong merged 1 commit intor0.5.0from
cherry-pick-1712-r0.5.0
Jan 5, 2026
Merged

cp: fix: relax nanov3 nightly test metrics strict (1712) into r0.5.0#1713
terrykong merged 1 commit intor0.5.0from
cherry-pick-1712-r0.5.0

Conversation

@chtruong814
Copy link
Contributor

@chtruong814 chtruong814 commented Jan 5, 2026

beep boop [🤖]: Hi @RayenTian 👋,

we've cherry picked #1712 into  for you! 🚀

Please review and approve this cherry pick by your convenience!

Summary by CodeRabbit

  • Tests
    • Updated validation thresholds for training loss metrics in test suites.

✏️ Tip: You can customize this high-level summary in your review settings.

Signed-off-by: ruit <ruit@nvidia.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 5, 2026

📝 Walkthrough

Walkthrough

These changes update train/loss metrics thresholds in two test scripts from their previous values to 2.05, ensuring test assertions align with current training behavior without modifying control flow.

Changes

Cohort / File(s) Summary
Test metrics threshold updates
tests/test_suites/llm/sft-nanov3-30BA3B-2n8g-fsdp2-lora.sh, tests/test_suites/llm/sft-nanov3-30BA3B-2n8g-fsdp2.sh
Updated train/loss at step 20 threshold to 2.05 in both test scripts (from 2.03 and 1.98 respectively). No control flow changes.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

Suggested labels

r0.5.0

Suggested reviewers

  • terrykong

Pre-merge checks and finishing touches

✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly describes the main change: relaxing strict metrics thresholds in nanov3 nightly tests as part of a cherry-pick into r0.5.0 branch.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Test Results For Major Changes ✅ Passed PR contains minor adjustments to test metric thresholds in two test suite files without affecting training code or core functionality.
✨ Finishing touches
  • 📝 Generate docstrings

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6e7a2f7 and 6fc4d6f.

📒 Files selected for processing (2)
  • tests/test_suites/llm/sft-nanov3-30BA3B-2n8g-fsdp2-lora.sh
  • tests/test_suites/llm/sft-nanov3-30BA3B-2n8g-fsdp2.sh
🧰 Additional context used
📓 Path-based instructions (4)
**/*.sh

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.sh: Use uv run instead of python to execute scripts
Follow the Google Shell Style Guide for shell scripts

Files:

  • tests/test_suites/llm/sft-nanov3-30BA3B-2n8g-fsdp2-lora.sh
  • tests/test_suites/llm/sft-nanov3-30BA3B-2n8g-fsdp2.sh
tests/test_suites/**/*.sh

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

tests/test_suites/**/*.sh: When adding support for a new model, create a corresponding driver shell script under tests/test_suites/ in the matching domain
Driver shell scripts should match the YAML base name with .sh extension and invoke training entrypoint with uv run

Files:

  • tests/test_suites/llm/sft-nanov3-30BA3B-2n8g-fsdp2-lora.sh
  • tests/test_suites/llm/sft-nanov3-30BA3B-2n8g-fsdp2.sh
!(**/tests/**|**/test_*.py|**/test_*.sh)

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Add the NVIDIA copyright header to all Python files and shell scripts (excluding tests). The header should include the current year

Files:

  • tests/test_suites/llm/sft-nanov3-30BA3B-2n8g-fsdp2-lora.sh
  • tests/test_suites/llm/sft-nanov3-30BA3B-2n8g-fsdp2.sh
**/*.{py,sh}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

The NVIDIA copyright header should appear at the top of all Python files and shell scripts (excluding tests)

Files:

  • tests/test_suites/llm/sft-nanov3-30BA3B-2n8g-fsdp2-lora.sh
  • tests/test_suites/llm/sft-nanov3-30BA3B-2n8g-fsdp2.sh
🧠 Learnings (1)
📚 Learning: 2025-10-12T14:46:57.171Z
Learnt from: zpqiu
Repo: NVIDIA-NeMo/RL PR: 1324
File: tests/test_suites/llm/distillation-qwen3-32b-to-1.7b-base-1n8g-megatron-tp2pp2cp2-pack.sh:6-11
Timestamp: 2025-10-12T14:46:57.171Z
Learning: Test scripts in tests/test_suites/llm/ follow a standard configuration pattern that includes NUM_NODES, STEPS_PER_RUN, MAX_STEPS, NUM_RUNS (calculated as `$(( (MAX_STEPS + STEPS_PER_RUN - 1) / STEPS_PER_RUN ))`), and NUM_MINUTES. These variables are part of the test infrastructure's standard interface and should not be flagged as unused even if not directly referenced within the individual script, as they are consumed by external launch tooling or common.env.

Applied to files:

  • tests/test_suites/llm/sft-nanov3-30BA3B-2n8g-fsdp2-lora.sh
  • tests/test_suites/llm/sft-nanov3-30BA3B-2n8g-fsdp2.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Lint check
  • GitHub Check: Lint check
  • GitHub Check: Lint check
  • GitHub Check: Post automodel integration comment / Comment on PR
  • GitHub Check: Post submodule check comment / Comment on PR
🔇 Additional comments (2)
tests/test_suites/llm/sft-nanov3-30BA3B-2n8g-fsdp2.sh (1)

36-36: LGTM! Threshold appropriately relaxed.

The train/loss threshold update to 2.05 aligns with the PR objective to reduce test flakiness while maintaining meaningful validation. The change is consistent with the companion update in the LoRA variant.

tests/test_suites/llm/sft-nanov3-30BA3B-2n8g-fsdp2-lora.sh (1)

36-36: LGTM! Threshold unified with base variant.

The update to 2.05 creates consistency with the base FSDP2 test while accommodating expected training behavior variation. The change properly balances test reliability with validation strictness.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@terrykong terrykong added the CI:docs Run doctest label Jan 5, 2026
@terrykong terrykong enabled auto-merge (squash) January 5, 2026 18:12
@terrykong terrykong merged commit 6526fe9 into r0.5.0 Jan 5, 2026
66 of 69 checks passed
@terrykong terrykong deleted the cherry-pick-1712-r0.5.0 branch January 5, 2026 19:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants