
cp: DSv3 EP=8 for B200, PP8-VP2 for B300 BF16, Lm3.1 405B TP4-CP1 GB300 FP8-CS (2175) into r0.3.0#2198

Merged

ko3n1g merged 1 commit into r0.3.0 from cherry-pick-2175-r0.3.0 on Feb 3, 2026
Conversation

@ko3n1g (Contributor) commented Feb 3, 2026

beep boop [🤖]: Hi @malay-nagda 👋,

we've cherry picked #2175 into r0.3.0 for you! 🚀

Please review and approve this cherry pick at your convenience!

Summary by CodeRabbit

  • Chores
    • Optimized DeepSeek V3 pretraining configuration for GB300 hardware by adjusting expert and pipeline parallelism parameters.
    • Updated LLaMA 3.1 pretraining configuration to refine tensor and context parallelism settings for improved training efficiency.
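One property of the Llama 3.1 change worth noting: trading context parallelism for tensor parallelism this way leaves the per-replica model-parallel footprint unchanged. A minimal sanity check, using only the sizes stated in the summary above (the GPU-count arithmetic is an observation, not something stated in the PR):

```python
# Sizes taken from the change summary: TP goes 2 -> 4, CP goes 2 -> 1.
# Since TP * CP is 4 both before and after, the number of GPUs each
# model replica occupies for tensor + context parallelism is unchanged;
# only how the work is split across those GPUs differs.
old_tp, old_cp = 2, 2
new_tp, new_cp = 4, 1

assert old_tp * old_cp == new_tp * new_cp == 4
print("model-parallel GPUs per replica (TP*CP):", new_tp * new_cp)  # -> 4
```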

…P8-CS (#2175)

Signed-off-by: Malay Nagda <malayn@nvidia.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
@ko3n1g (Contributor, Author) commented Feb 3, 2026

/ok to test 4937436

@copy-pr-bot (bot) commented Feb 3, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai (bot, Contributor) commented Feb 3, 2026

📝 Walkthrough

Updates configuration parameters for DeepSeek and Llama model pretraining setups. Reduces expert-model-parallelism for DeepSeek GB300 V1, restructures DeepSeek B300 BF16 V2 definition, and adjusts tensor and context parallelism for Llama31 405B FP8 GB300 configuration.

Changes

DeepSeek V3 Pretraining Configs (scripts/performance/configs/deepseek/deepseek_workload_base_configs.py)
  • Reduces expert_model_parallel_size from 16 to 8 in DEEPSEEK_V3_PRETRAIN_CONFIG_GB300_V1.
  • Restructures DEEPSEEK_V3_PRETRAIN_CONFIG_B300_BF16_V2 from an alias into an explicit replace() with pipeline_model_parallel_size=8 and virtual_pipeline_model_parallel_size=2.

Llama31 Pretraining Configs (scripts/performance/configs/llama/llama31_workload_base_configs.py)
  • Updates LLAMA31_405B_PRETRAIN_CONFIG_GB300_FP8_CS_V2 to increase tensor_model_parallel_size from 2 to 4, reduce context_parallel_size from 2 to 1, and explicitly set use_megatron_fsdp=False and cpu_offloading_num_layers=None.
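The changes above can be sketched as config overrides in the style the walkthrough describes, i.e. deriving one config from another with replace(). The real base configs live under scripts/performance/configs/ in the repository; the PretrainConfig dataclass below is a simplified stand-in that only mirrors the fields the summary mentions, and the baseline values other than the listed deltas are illustrative assumptions:

```python
# Hypothetical sketch of the edits described in the change summary.
# PretrainConfig is a stand-in; only the field values called out in the
# summary (EP=8, PP=8, VP=2, TP=4, CP=1, FSDP/offload flags) come from
# the PR. Everything else is a placeholder default.
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class PretrainConfig:
    tensor_model_parallel_size: int = 1
    pipeline_model_parallel_size: int = 1
    virtual_pipeline_model_parallel_size: Optional[int] = None
    context_parallel_size: int = 1
    expert_model_parallel_size: int = 1
    use_megatron_fsdp: bool = False
    cpu_offloading_num_layers: Optional[int] = None

# expert_model_parallel_size reduced from 16 to 8.
DEEPSEEK_V3_PRETRAIN_CONFIG_GB300_V1 = PretrainConfig(
    expert_model_parallel_size=8,
)

# Previously a plain alias of the GB300 config; now an explicit replace()
# carrying its own pipeline / virtual-pipeline settings.
DEEPSEEK_V3_PRETRAIN_CONFIG_B300_BF16_V2 = replace(
    DEEPSEEK_V3_PRETRAIN_CONFIG_GB300_V1,
    pipeline_model_parallel_size=8,
    virtual_pipeline_model_parallel_size=2,
)

LLAMA31_405B_PRETRAIN_CONFIG_GB300_FP8_CS_V2 = PretrainConfig(
    tensor_model_parallel_size=4,    # was 2
    context_parallel_size=1,         # was 2
    use_megatron_fsdp=False,         # now set explicitly
    cpu_offloading_num_layers=None,  # now set explicitly
)
```

Using replace() on a frozen dataclass keeps derived configs immutable while making the delta from the base config explicit, which is exactly the restructuring the walkthrough describes for the B300 BF16 variant.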

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

  • erhoo82
  • sanandaraj5597
🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)
  • Test Results For Major Changes: ⚠️ Warning. The PR description lacks test results, performance benchmarks, convergence validation, or references to the original PR #2175 testing for critical parallelism parameter changes affecting training performance. Resolution: update the PR description with a reference to the PR #2175 performance testing, before-and-after metrics for the parallelism changes on B200/B300/GB300 hardware, and convergence validation confirmation.

✅ Passed checks (3 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title specifically describes the three main changes, all aligned with the changeset: the DeepSeek V3 expert-model-parallelism reduction to 8, the DeepSeek V3 B300 BF16 pipeline/virtual-pipeline configuration, and the Llama 3.1 405B tensor/context parallelism changes.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files, so the docstring coverage check was skipped.


ko3n1g merged commit 48a27fa into r0.3.0 on Feb 3, 2026. 45 of 47 checks passed.
ko3n1g deleted the cherry-pick-2175-r0.3.0 branch on February 3, 2026 at 20:48.