
cp: Update Deepseek V3 MXFP8 GB200 mapping (2215) into r0.3.0 #2378

Merged
ko3n1g merged 1 commit into r0.3.0 from cherry-pick-2215-r0.3.0 on Feb 14, 2026

Conversation

@ko3n1g (Contributor) commented Feb 13, 2026

beep boop [🤖]: Hi @dingqingy-nv 👋,

we've cherry-picked #2215 into r0.3.0 for you! 🚀

Please review and approve this cherry-pick at your convenience!

Summary by CodeRabbit

  • Chores
    • Updated performance configuration settings for DeepSeek V3 model to optimize CUDA graph operations.

Signed-off-by: Dingqing Yang <dingqingy@nvidia.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
@ko3n1g (Contributor, Author) commented Feb 13, 2026

/ok to test 25915c0

@copy-pr-bot (bot) commented Feb 13, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai (Contributor, bot) commented Feb 13, 2026

📝 Walkthrough

A single configuration field in DeepSeek V3 pretrain config for GB200 hardware was updated to include "attn" in the cuda_graph_scope list, expanding from ["moe_router", "moe_preprocess"] to ["attn", "moe_router", "moe_preprocess"].

Changes

CUDA Graph Scope Configuration — scripts/performance/configs/deepseek/deepseek_workload_base_configs.py
  Updated cuda_graph_scope in DEEPSEEK_V3_PRETRAIN_CONFIG_GB200_V1 to include "attn" alongside the existing "moe_router" and "moe_preprocess" scopes.
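The one-line change summarized above can be sketched in Python as follows. This is a hypothetical, simplified fragment: the real DEEPSEEK_V3_PRETRAIN_CONFIG_GB200_V1 in scripts/performance/configs/deepseek/deepseek_workload_base_configs.py carries many more fields, and its exact structure is not shown in this PR page; only the cuda_graph_scope update is illustrated.

```python
# Illustrative sketch only; the actual config object has many more fields.

# Before this cherry-pick: CUDA graphs captured only the MoE router and
# MoE preprocess regions.
DEEPSEEK_V3_PRETRAIN_CONFIG_GB200_V1 = {
    "cuda_graph_scope": ["moe_router", "moe_preprocess"],
}

# After this cherry-pick: the attention region is captured as well,
# expanding CUDA graph coverage for the GB200 configuration.
DEEPSEEK_V3_PRETRAIN_CONFIG_GB200_V1["cuda_graph_scope"] = [
    "attn",
    "moe_router",
    "moe_preprocess",
]

print(DEEPSEEK_V3_PRETRAIN_CONFIG_GB200_V1["cuda_graph_scope"])
```

Capturing more regions in CUDA graphs generally reduces per-iteration kernel-launch overhead, which is consistent with the PR's stated goal of optimizing CUDA graph operations for DeepSeek V3 on GB200.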

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Suggested reviewers

  • dingqingy-nv
🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)
  • Description Check — Passed: Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check — Passed: The title mentions updating a Deepseek V3 GB200 mapping, which aligns with the actual change of modifying the DEEPSEEK_V3_PRETRAIN_CONFIG_GB200_V1 configuration, making it specific and relevant to the changeset.
  • Docstring Coverage — Passed: No functions found in the changed files to evaluate docstring coverage; check skipped.
  • Merge Conflict Detection — Passed: No merge conflicts detected when merging into r0.3.0.
  • Test Results For Major Changes — Passed: The PR adds "attn" to cuda_graph_scope in one config variable. This is a minor configuration adjustment, not a major change, and is a cherry-pick of prior validated work.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


No actionable comments were generated in the recent review. 🎉



@ko3n1g merged commit d4ef39d into r0.3.0 on Feb 14, 2026 — 48 of 52 checks passed.
@ko3n1g deleted the cherry-pick-2215-r0.3.0 branch on February 14, 2026 at 10:08.
@ko3n1g (Contributor, Author) commented Feb 14, 2026

Thanks for testing this internally!

