Update Qwen3 30B H100 Base Configs with HybridEP#2477

Merged
ko3n1g merged 3 commits into main from rmukundan/qwen3_30b_h100_config_update on Feb 23, 2026
Conversation

@rhmukundan
Contributor

@rhmukundan rhmukundan commented Feb 22, 2026

Summary by CodeRabbit

  • Chores
    • Optimized Qwen3 model training configurations to enhance performance efficiency across different parallelism strategies.

Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
@rhmukundan rhmukundan self-assigned this Feb 22, 2026
@copy-pr-bot

copy-pr-bot bot commented Feb 22, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@rhmukundan rhmukundan marked this pull request as ready for review February 22, 2026 16:20
@coderabbitai
Contributor

coderabbitai bot commented Feb 22, 2026

No actionable comments were generated in the recent review. 🎉


📝 Walkthrough

Configuration updates for Qwen3 pretraining on H100/GB300: the MoE token dispatcher type switches to "flex" in qwen3_llm_pretrain.py, and the parallelism strategy in qwen3_workload_base_configs.py moves from pipeline-virtual to expert-model-parallel, with corresponding dispatcher backend and overlap settings changes.

Changes

  • Qwen3 MoE Token Dispatcher Type (scripts/performance/configs/qwen/qwen3_llm_pretrain.py):
    Changed moe_token_dispatcher_type from "alltoall" to "flex" for qwen3_30b_a3b_pretrain_config_h100.
  • Qwen3 Workload Parallelism and MoE Backend (scripts/performance/configs/qwen/qwen3_workload_base_configs.py):
    Replaced pipeline/virtual parallelism settings with expert_model_parallel_size=16, disabled moe_a2a_overlap, and switched moe_flex_dispatcher_backend from "deepep" to "hybridep" across the Qwen3 30B A3B config blocks.
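Taken together, the summarized changes amount to a handful of config fields. A minimal sketch of the resulting settings, assuming a flat dataclass-style config (the `Qwen3MoEPerfConfig` wrapper is hypothetical; only the field names and values come from the PR summary above):

```python
from dataclasses import dataclass

# Hypothetical wrapper for illustration; the field names and values
# below are taken from this PR's change summary, not from the repo.
@dataclass
class Qwen3MoEPerfConfig:
    # Dispatcher switched from "alltoall" to the flexible dispatcher
    moe_token_dispatcher_type: str = "flex"
    # Expert-model parallelism replaces the pipeline/virtual scheme
    expert_model_parallel_size: int = 16
    # MoE all-to-all overlap disabled alongside the backend switch
    moe_a2a_overlap: bool = False
    # Flex dispatcher backend moved from "deepep" to HybridEP
    moe_flex_dispatcher_backend: str = "hybridep"

cfg = Qwen3MoEPerfConfig()
print(cfg.moe_token_dispatcher_type, cfg.moe_flex_dispatcher_backend)
```

The actual config objects in scripts/performance/configs/qwen may be nested differently; this only illustrates which knobs the PR touches and their new values.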

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested reviewers

  • ko3n1g
  • yaoyu-33
🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 inconclusive)

  • Test Results For Major Changes: ❓ Inconclusive. The PR makes MoE dispatcher configuration changes, but test results and performance validation documentation are not accessible in the repository context.
    Resolution: provide the PR description from #2477, including any test results, performance benchmarks, or convergence validation documentation for the dispatcher backend and parallelism strategy changes.
✅ Passed checks (3 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title clearly and specifically summarizes the main change: updating the Qwen3 30B H100 configuration with the HybridEP dispatcher backend, the primary modification across both configuration files.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which meets the required threshold of 80.00%.


Contributor

@ko3n1g ko3n1g left a comment


Can you update golden values of the internal CI?

@rhmukundan rhmukundan added the r0.3.0 Cherry-pick label for r0.3.0 release branch label Feb 23, 2026
@ko3n1g ko3n1g merged commit 4a64507 into main Feb 23, 2026
2 checks passed
@ko3n1g ko3n1g deleted the rmukundan/qwen3_30b_h100_config_update branch February 23, 2026 20:59
ko3n1g pushed a commit that referenced this pull request Feb 24, 2026
Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
@ko3n1g ko3n1g mentioned this pull request Feb 24, 2026
5 tasks
pengdurice pushed a commit to pengdurice/Megatron-Bridge that referenced this pull request Feb 24, 2026
Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
Signed-off-by: pengdurice <pengduhit@gmail.com>
copy-pr-bot bot pushed a commit that referenced this pull request Mar 19, 2026
Signed-off-by: Raghav Hrishikeshan Mukundan <rmukundan@nvidia.com>
