
cp: qwen gbs 2x (2280) into r0.3.0 #2369

Merged
ko3n1g merged 1 commit into r0.3.0 from
cherry-pick-2280-r0.3.0
Feb 13, 2026

Conversation

@ko3n1g
Contributor

@ko3n1g ko3n1g commented Feb 13, 2026

beep boop [🤖]: Hi @malay-nagda 👋,

we've cherry-picked #2280 into r0.3.0 for you! 🚀

Please review and approve this cherry-pick at your convenience!

Summary by CodeRabbit

  • Chores
    • Optimized training configuration parameters for improved performance.

Signed-off-by: Malay Nagda <malayn@nvidia.com>
Co-authored-by: Raghav Hrishikeshan Mukundan <102543536+rhmukundan@users.noreply.github.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
@ko3n1g
Contributor Author

ko3n1g commented Feb 13, 2026

/ok to test 623509c

@copy-pr-bot

copy-pr-bot bot commented Feb 13, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai bot commented Feb 13, 2026

Walkthrough

The change adds global_batch_size=1024 and moe_flex_dispatcher_backend="deepep" configuration parameters to two QWEN3 30B A3B pretrain configurations (BF16 and FP8 variants) for H100 GPUs.

Changes

| Cohort / File(s) | Summary |
|---|---|
| QWEN3 Configuration Updates: scripts/performance/configs/qwen/qwen3_workload_base_configs.py | Added global_batch_size=1024 and moe_flex_dispatcher_backend="deepep" to both QWEN3_30B_A3B_PRETRAIN_CONFIG_H100_BF16_V1 and QWEN3_30B_A3B_PRETRAIN_CONFIG_H100_FP8_CS_V1 replacement configs. |
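
As a rough illustration of the change summarized above, the sketch below shows one way a "replacement config" could override two fields on a base config. The `WorkloadConfig` dataclass, the `BASE_CONFIG` name, and the starting batch size of 512 are assumptions made for this example only; the actual structure of qwen3_workload_base_configs.py is not shown in this conversation and may differ.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class WorkloadConfig:
    """Hypothetical stand-in for the repository's workload config type."""

    global_batch_size: int
    moe_flex_dispatcher_backend: Optional[str] = None


# Hypothetical base config; 512 is a placeholder, not a value taken from the PR.
BASE_CONFIG = WorkloadConfig(global_batch_size=512)

# The two H100 variants named in the walkthrough, each built by overriding
# the same two fields. replace() returns a new object and leaves the base intact.
QWEN3_30B_A3B_PRETRAIN_CONFIG_H100_BF16_V1 = replace(
    BASE_CONFIG,
    global_batch_size=1024,
    moe_flex_dispatcher_backend="deepep",
)
QWEN3_30B_A3B_PRETRAIN_CONFIG_H100_FP8_CS_V1 = replace(
    BASE_CONFIG,
    global_batch_size=1024,
    moe_flex_dispatcher_backend="deepep",
)
```

Because the hypothetical dataclass is frozen, the overrides cannot leak back into the shared base config, which keeps other variants derived from it unaffected.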

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

  • PR #2351 — Adds moe_flex_dispatcher_backend="deepep" to the same QWEN3 30B A3B configurations.
  • PR #2280 — Adds global_batch_size=1024 to the same two QWEN3 30B A3B pretrain config entries.

Suggested labels

r0.3.0

Suggested reviewers

  • malay-nagda

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Test Results For Major Changes | ⚠️ Warning | PR changes QWEN3 configuration parameters (global_batch_size to 1024, adds deepep MOE dispatcher) without documenting any test results, performance metrics, or convergence validation. | Add performance benchmarks, convergence validation results, and hardware context to PR description to validate configuration changes. |
| Title check | ❓ Inconclusive | The title references a cherry-pick from PR #2280 (qwen gbs 2x) into r0.3.0 branch, which aligns with the changes that add global_batch_size=1024 configurations to QWEN3 configs, but uses an abbreviated, non-descriptive format that doesn't clearly convey the specific changes being made. | Consider using a more descriptive title such as 'Add global_batch_size=1024 and moe_flex_dispatcher_backend configurations to QWEN3 pretrain configs' to clearly communicate the actual changes without relying on issue numbers or abbreviated notation. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Merge Conflict Detection | ✅ Passed | No merge conflicts detected when merging into r0.3.0 |

No actionable comments were generated in the recent review. 🎉

@ko3n1g
Contributor Author

ko3n1g commented Feb 13, 2026

We updated golden values already, so merging this

@ko3n1g ko3n1g merged commit 194331e into r0.3.0 Feb 13, 2026
23 of 25 checks passed
@ko3n1g ko3n1g deleted the cherry-pick-2280-r0.3.0 branch February 13, 2026 13:13
@coderabbitai coderabbitai bot mentioned this pull request Feb 16, 2026
5 tasks