cp: Dsv3 Recipe Update (2152) into r0.3.0 #2186

Merged
ko3n1g merged 1 commit into r0.3.0 from cherry-pick-2152-r0.3.0
Feb 3, 2026


@ko3n1g ko3n1g commented Feb 3, 2026

beep boop [🤖]: Hi @dingqingy-nv 👋,

we've cherry-picked #2152 into r0.3.0 for you! 🚀

Please review and approve this cherry-pick at your convenience!

Summary by CodeRabbit

  • New Features

    • Added public aliases for DeepSeek V3 pretraining configurations supporting multiple precision formats (BF16, FP8 variants).
  • Chores

    • Updated pretraining configurations with optimized parallelism and layout settings for improved performance.
    • Added environment configuration support for FP8 mixed precision on DeepSeek models.

Signed-off-by: Dingqing Yang <dingqingy@nvidia.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>

copy-pr-bot bot commented Feb 3, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


ko3n1g commented Feb 3, 2026

/ok to test f972a84


coderabbitai bot commented Feb 3, 2026

Walkthrough

Updated DeepSeek pretraining configurations for GB200 and GB300 hardware variants by propagating layout parameters, adjusting GB300_V1 parallelism settings (pipeline and virtual pipeline sizes, expert model parallel reduction), micro-batching, and recompute modules. Added public aliases for configuration variants and environment variable handling for DeepSeek fp8_mx compute dtype.
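The GB300_V1 adjustments reported in the change summary can be pictured as an override of a base config followed by alias definitions. This is only a sketch: `SketchConfig`, `BASE`, and the alias names are hypothetical stand-ins, the starting values marked as placeholders are not taken from the PR, and reading "2/8" as pipeline size 2 with virtual pipeline size 8 is an assumption. The `recompute_modules` change is left out because the summary does not make its before and after values unambiguous.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in for a workload base config (not the repository's class)."""

    pipeline_model_parallel_size: int
    virtual_pipeline_model_parallel_size: Optional[int]
    expert_model_parallel_size: int
    micro_batch_size: int
    pp_layout: Optional[str] = None


# Starting point for the sketch. Only the expert-parallel value (64) comes
# from the change summary; the other numbers are placeholders.
BASE = SketchConfig(
    pipeline_model_parallel_size=4,          # placeholder, not from the PR
    virtual_pipeline_model_parallel_size=4,  # placeholder, not from the PR
    expert_model_parallel_size=64,           # prior value per the summary
    micro_batch_size=1,                      # placeholder, not from the PR
)

# Values reported for the revised GB300_V1 config (assumed reading of "2/8").
GB300_V1 = replace(
    BASE,
    pipeline_model_parallel_size=2,
    virtual_pipeline_model_parallel_size=8,
    micro_batch_size=2,
    expert_model_parallel_size=32,
)

# Public aliases: several precision-specific names bound to one config.
# Whether the real aliases all share a single object is an assumption.
GB300_BF16_V1 = GB300_V1
GB300_FP8_CS_V1 = GB300_V1
GB300_FP8_MX_V1 = GB300_V1
```

Because the dataclass is frozen, `replace` returns a new object and `BASE` is left untouched, which keeps the earlier variant available for comparison.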

Changes

| Cohort / File(s) | Summary |
|---|---|
| **DeepSeek Config Updates**<br>`scripts/performance/configs/deepseek/deepseek_llm_pretrain.py`, `scripts/performance/configs/deepseek/deepseek_workload_base_configs.py` | Updated GB200 layout parameter from `None` to `base_cfg.pp_layout`. Revised GB300_V1 pretraining config with adjusted pipeline/virtual pipeline sizes (2/8), `micro_batch_size` (2), `pp_layout`, `expert_model_parallel_size` (64→32), and `recompute_modules` (["moe_act"]["mla_up_proj"]). Added public aliases for GB300 variants (BF16_V1, FP8_CS_V1, FP8_MX_V1, NVFP4_V1/V2). |
| **Environment Setup**<br>`scripts/performance/perf_plugins.py` | Added conditional branch to set `del_cudnn_ln=False` for DeepSeek model family with `fp8_mx` compute dtype in `_set_model_specific_environment_variables`. |
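The environment change amounts to a conditional on model family and compute dtype. A minimal sketch of that branch follows, assuming a hypothetical helper `resolve_del_cudnn_ln`; the string identifiers `"deepseek"` and `"fp8_mx"` and the default of `True` are assumptions, and the real `_set_model_specific_environment_variables` may be structured differently.

```python
def resolve_del_cudnn_ln(
    model_family: str,
    compute_dtype: str,
    default: bool = True,  # assumed default; the PR summary does not state it
) -> bool:
    """Hypothetical helper mirroring the described branch.

    Returns False for the DeepSeek family with the fp8_mx compute dtype,
    and the supplied default for every other combination.
    """
    if model_family == "deepseek" and compute_dtype == "fp8_mx":
        return False
    return default


# Example: only the DeepSeek + fp8_mx pairing flips the flag.
for family, dtype in [("deepseek", "fp8_mx"), ("deepseek", "bf16"), ("llama", "fp8_mx")]:
    print(family, dtype, resolve_del_cudnn_ln(family, dtype))
```

What the flag does downstream is not described in this PR summary, so the sketch stops at computing its value.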

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested labels

r0.3.0

Suggested reviewers

  • erhoo82
🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Test Results For Major Changes | ⚠️ Warning | PR modifies critical DeepSeek training configuration parameters (pipeline parallelism, expert parallelism, recompute modules) with high code review effort, but lacks test results, performance metrics, or convergence validation. | Include test results demonstrating no convergence regression and performance metrics for the configuration changes, or reference testing from original PR #2152. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title indicates this is a cherry-pick of PR #2152 into branch r0.3.0, which directly corresponds to the changes shown in the raw summary (DeepSeek V3 recipe updates to configuration files). |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |


@ko3n1g ko3n1g merged commit 7f61c05 into r0.3.0 Feb 3, 2026
49 checks passed
@ko3n1g ko3n1g deleted the cherry-pick-2152-r0.3.0 branch February 3, 2026 20:48
ko3n1g added a commit that referenced this pull request Feb 7, 2026
ko3n1g added a commit that referenced this pull request Feb 7, 2026
ko3n1g added a commit that referenced this pull request Feb 8, 2026

2 participants