cp: feat: add dapo recipe and test (1617) into r0.5.0 #1687
Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
📝 Walkthrough
This PR adds a DAPO DeepSeek v3 performance configuration and a matching test script for a 64-node setup with 8 GPUs per node. The YAML defines training parameters, model settings, loss functions, and cluster topology. The Bash test script launches the experiment and validates training metrics. The test suite manifest is updated to register the new test.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks
❌ Failed checks (1 inconclusive)
✅ Passed checks (3 passed)
🪛 Shellcheck (0.11.0): tests/test_suites/llm/performance/dapo-deepseek-v3-64n8g.sh
[warning] 14-14: NUM_RUNS appears unused. Verify use (or export if used externally). (SC2034)
[warning] 15-15: NUM_MINUTES appears unused. Verify use (or export if used externally). (SC2034)
[warning] 21-21: Use 'cd ... || exit' or 'cd ... || return' in case cd fails. (SC2164)
[error] 36-36: Double quote array expansions to avoid re-splitting elements. (SC2068)
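For readers unfamiliar with the two behavioral findings (SC2164 and SC2068), here is a minimal, standalone sketch of the patterns Shellcheck is asking for. It is not the PR's script: `WORK_DIR`, `count_args`, `forward_unquoted`, and `forward_quoted` are hypothetical names invented for illustration, and the actual lines 21 and 36 of the test script may look different.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; every name here is hypothetical, not from the PR.
set -euo pipefail

# SC2164: abort if the directory change fails, so later commands
# never run in an unexpected working directory.
WORK_DIR="$(mktemp -d)"
cd "$WORK_DIR" || exit 1

# Helper that reports how many arguments it received.
count_args() {
    echo "$#"
}

forward_unquoted() {
    # shellcheck disable=SC2068
    count_args $@      # unquoted: "a b" is re-split into two arguments
}

forward_quoted() {
    count_args "$@"    # quoted: each original argument is passed through intact
}

# SC2068: compare the two forwarding styles on an argument containing a space.
UNQUOTED_COUNT="$(forward_unquoted "a b" c)"
QUOTED_COUNT="$(forward_quoted "a b" c)"
echo "unquoted=$UNQUOTED_COUNT quoted=$QUOTED_COUNT"   # prints: unquoted=3 quoted=2
```

The two SC2034 warnings are a different kind of finding: if `NUM_RUNS` and `NUM_MINUTES` are read by an external harness rather than by the script itself, exporting them or adding a Shellcheck directive would be the usual way to silence the warning. Whether that applies here depends on the test infrastructure, which this page does not show.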
beep boop [🤖]: Hi @ZhiyuLi-Nvidia 👋,
Summary by CodeRabbit
Tests
Chores