fix: all2all for qwen3-next H100 #2479

Merged
ko3n1g merged 1 commit into main from ko3n1g/fix/qwen3-next
Feb 23, 2026
Conversation

@ko3n1g (Contributor) commented Feb 22, 2026

What does this PR do ?

Add a one line overview of what this PR aims to accomplish.

Changelog

  • Add specific line by line info of high level changes in this PR.

GitHub Actions CI

See the CI section in the Contributing doc for how to trigger the CI. An NVIDIA developer will need to approve and trigger the CI for external contributors.

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

If you haven't finished some of the above items, you can still open a "Draft" PR.

Additional Information

  • Related to # (issue)

Summary by CodeRabbit

  • Chores
    • Updated model configuration to optimize token distribution for H100 hardware variants, enhancing efficiency during large-scale pretraining operations.

Signed-off-by: oliver könig <okoenig@nvidia.com>
@coderabbitai bot (Contributor) commented Feb 22, 2026

Walkthrough

A configuration parameter cfg.model.moe_token_dispatcher_type is set to "alltoall" for the H100 variant of the qwen3_next_80b_a3b pretrain configuration. This adds a single line to an existing configuration file.

Changes

Cohort / File(s): Qwen3 Pretrain Configuration (scripts/performance/configs/qwen/qwen3_llm_pretrain.py)
Summary: Added a moe_token_dispatcher_type assignment set to "alltoall" for the H100 variant configuration path.
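In essence, the diff adds one attribute assignment inside the H100 branch of the config builder. A minimal sketch follows; the function name and the `SimpleNamespace` scaffolding are illustrative stand-ins, not the repository's actual code — only the attribute name and value come from this PR:

```python
from types import SimpleNamespace

def qwen3_next_80b_a3b_h100_pretrain_config():
    # Hypothetical stand-in for the real builder in
    # scripts/performance/configs/qwen/qwen3_llm_pretrain.py.
    cfg = SimpleNamespace(model=SimpleNamespace())
    # The single line this PR adds: use the all-to-all MoE token
    # dispatcher on H100 rather than relying on an inherited default.
    cfg.model.moe_token_dispatcher_type = "alltoall"
    return cfg

cfg = qwen3_next_80b_a3b_h100_pretrain_config()
```

After the change, any H100 run built from this path sees the dispatcher set explicitly rather than through the base config's default.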

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Possibly related PRs

  • fix: perf configs 2 #2393: Modifies the same config field cfg.model.moe_token_dispatcher_type in the same file but sets it to "flex" for multiple Qwen3 variants, indicating related MOE dispatcher configuration updates.

Suggested reviewers

  • yaoyu-33
  • malay-nagda
🚥 Pre-merge checks: ✅ 4 passed

  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title 'fix: all2all for qwen3-next H100' directly matches the main change: adding the 'alltoall' moe_token_dispatcher_type configuration for the H100 variant of qwen3-next.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which meets the required threshold of 80.00%.
  • Test Results For Major Changes: ✅ Passed. Single-line configuration fix aligning the H100 variant with existing patterns in related configurations; minor changes do not require test result documentation per the check criteria.


@coderabbitai bot left a comment

🧹 Nitpick comments (1)
scripts/performance/configs/qwen/qwen3_llm_pretrain.py (1)

337-344: Explicitly set moe_token_dispatcher_type for all qwen3_next_80b_a3b GPU variants for consistency.

All qwen3_235b_a22b and qwen3_30b_a3b per-GPU configs explicitly set moe_token_dispatcher_type. The four qwen3_next_80b_a3b variants (gb200, b300, b200, gb300) silently rely on the default "alltoall" from qwen3_next_80b_a3b_pretrain_config(). While the H100 variant does override it explicitly, adding explicit assignments to the other four variants improves consistency, auditability, and makes platform-specific choices visible at the config level.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/performance/configs/qwen/qwen3_llm_pretrain.py` around lines 337 -
344, The qwen3_next_80b_a3b GPU variant configs should explicitly set
cfg.moe_token_dispatcher_type instead of relying on the default from
qwen3_next_80b_a3b_pretrain_config(); update each qwen3_next_80b_a3b variant
function (the gb200, b300, b200, gb300 variants that build cfg via
qwen3_next_80b_a3b_pretrain_config()) to assign cfg.moe_token_dispatcher_type =
"<appropriate_dispatcher>" (use the same string used by the H100 variant or the
intended dispatcher like "alltoall" or whichever platform-specific choice is
required) before calling set_qwen3_next_common_configs(cfg) /
set_workload_base_configs(cfg, base_cfg) and returning cfg so the dispatcher
choice is explicit and consistent across all variants.
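The reviewer's suggestion above can be sketched as follows. The helper names and the `SimpleNamespace` scaffolding are hypothetical stand-ins for the real config builders; only the `moe_token_dispatcher_type = "alltoall"` assignment is taken from this PR:

```python
from types import SimpleNamespace

# Hypothetical stand-in for qwen3_next_80b_a3b_pretrain_config(), whose
# default dispatcher the GPU variants currently inherit implicitly.
def qwen3_next_80b_a3b_pretrain_config():
    return SimpleNamespace(moe_token_dispatcher_type="alltoall")

def make_variant_config(dispatcher="alltoall"):
    cfg = qwen3_next_80b_a3b_pretrain_config()
    # Explicit per-variant assignment, as the review suggests, so the
    # platform-specific choice is visible at the config level instead
    # of being an invisible inherited default.
    cfg.moe_token_dispatcher_type = dispatcher
    return cfg

# One explicit config per GPU variant named in the review.
variants = {name: make_variant_config()
            for name in ("gb200", "b300", "b200", "gb300", "h100")}
```

The point is auditability: a reader of any single variant function can see its dispatcher choice without tracing back through the shared base config.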

@ko3n1g ko3n1g added the r0.3.0 Cherry-pick label for r0.3.0 release branch label Feb 23, 2026
@ko3n1g ko3n1g merged commit b9c6b53 into main Feb 23, 2026
56 checks passed
@ko3n1g ko3n1g deleted the ko3n1g/fix/qwen3-next branch February 23, 2026 08:08
ko3n1g added a commit that referenced this pull request Feb 24, 2026
Signed-off-by: oliver könig <okoenig@nvidia.com>
@ko3n1g ko3n1g mentioned this pull request Feb 24, 2026
5 tasks
pengdurice pushed a commit to pengdurice/Megatron-Bridge that referenced this pull request Feb 24, 2026
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: pengdurice <pengduhit@gmail.com>
copy-pr-bot bot pushed a commit that referenced this pull request Mar 19, 2026
Signed-off-by: oliver könig <okoenig@nvidia.com>
