
Add MoE config for Super B200 TP2 #33510

Merged
mgoin merged 1 commit into vllm-project:main from shaharmor98:feat/add-super-moe-config on Feb 1, 2026

Conversation

shaharmor98 (Contributor) commented Feb 1, 2026

When running Nemotron Super locally on a B200, the following warning appears:

Using default MoE config. Performance might be sub-optimal!
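
For context, vLLM picks a tuned fused-MoE kernel config by looking for a JSON file under vllm/model_executor/layers/fused_moe/configs/ whose name encodes the expert count, the per-shard intermediate size, and the GPU name (roughly E=<experts>,N=<shard size>,device_name=<GPU>.json, sometimes with a dtype suffix); when no matching file exists it falls back to a default config and prints the warning above. To check which device name the lookup will embed in the filename on a given machine, something like this works:

  # Print the GPU name as it appears in fused MoE config filenames
  # (spaces replaced with underscores, e.g. NVIDIA_B200).
  python -c "import torch; print(torch.cuda.get_device_name(0).replace(' ', '_'))"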

I used benchmarks/kernels/benchmark_moe.py to generate a tuned config JSON for this case:

python benchmarks/kernels/benchmark_moe.py \
  --model $MODEL_PATH \
  --trust-remote-code \
  --tp-size 2 \
  --tune \
  --batch-size 1 2 4 8 16 24 32 48 64 96 128 256 512 768 1024 1536 \
  --save-dir /.../vllm/model_executor/layers/fused_moe/configs/
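
The generated file follows the same layout as the other configs in that directory: each top-level key is a token batch size, mapping to the Triton tile parameters that won the tuning sweep for that size. A hypothetical, truncated excerpt (the actual values come from the tuning run, not from this sketch):

  {
    "1": {
      "BLOCK_SIZE_M": 16,
      "BLOCK_SIZE_N": 64,
      "BLOCK_SIZE_K": 128,
      "GROUP_SIZE_M": 1,
      "num_warps": 4,
      "num_stages": 3
    },
    "512": {
      "BLOCK_SIZE_M": 64,
      "BLOCK_SIZE_N": 128,
      "BLOCK_SIZE_K": 128,
      "GROUP_SIZE_M": 8,
      "num_warps": 8,
      "num_stages": 4
    }
  }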

Related PRs:
#27967

Test Plan

Compare performance (vllm bench serve) with various batch sizes, with and without the JSON file.

Performance should be equal or better when the JSON is available.
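
For reference, a hypothetical shape of the comparison run (dataset, sequence lengths, and concurrency are placeholders here, not the exact values used):

  vllm bench serve \
    --model $MODEL_PATH \
    --dataset-name random \
    --random-input-len 1024 \
    --random-output-len 128 \
    --max-concurrency 64 \
    --num-prompts 640

The idea is to run this once with the tuned JSON present in the configs directory and once without it, against otherwise identical server settings, and compare output tokens per second.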

Test Result

Absolute output tokens per second cannot be disclosed at this stage; instead, we report the relative gain.

Setup for all benchmarks: B200, TP2

Concurrency (batch size)    Throughput gain
16                          +1.6%
64                          +4.1%
128                         +9.3%
512                         +24.9%

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Shahar Mor <smor@nvidia.com>
dosubot Bot commented Feb 1, 2026

Related Documentation

No published documentation to review for changes on this repository.


gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a new Mixture of Experts (MoE) configuration file for the NVIDIA B200 GPU with a tensor parallelism size of 2. The configuration is generated by the project's benchmarking script and aims to optimize performance for MoE models on this specific hardware. The provided performance metrics show a significant improvement with the new configuration. The change is straightforward and appears to be a valuable performance enhancement.

mgoin (Member) left a comment

LGTM, thanks for including benchmarks!

mgoin enabled auto-merge (squash) on February 1, 2026 at 15:42
github-actions (bot) added the ready label (ONLY add when PR is ready to merge/full CI is needed) on Feb 1, 2026
mgoin merged commit 8869cd8 into vllm-project:main on Feb 1, 2026
50 checks passed
PiratePai pushed a commit to PiratePai/epd_shm that referenced this pull request Feb 3, 2026
Signed-off-by: Pai <416932041@qq.com>

Labels

ready (ONLY add when PR is ready to merge/full CI is needed)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants