[MoE Refactor] Remove MoE DP chunking #39107
robertgshaw2-redhat merged 22 commits into vllm-project:main
Conversation
Signed-off-by: Bill Nell <bnell@redhat.com>
Code Review
This pull request removes the `VLLM_MOE_DP_CHUNK_SIZE` and `VLLM_ENABLE_MOE_DP_CHUNK` environment variables, refactoring MoE chunking to rely on the scheduler's `max_num_batched_tokens`. It eliminates `ChunkingMoERunner` and simplifies related logic in the runner factory and shared experts. Feedback indicates that defaulting `max_num_tokens` to 0 in `FusedMoEConfig` causes an assertion failure if it is not explicitly set, which may break external integrations.
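To make the flagged failure mode concrete, here is a minimal sketch; the class below is a hypothetical reduced stand-in for `FusedMoEConfig`, not the actual vLLM definition:

```python
from dataclasses import dataclass


@dataclass
class FusedMoEConfigSketch:
    num_experts: int
    # A bare 0 default means any caller that builds the config without
    # setting max_num_tokens explicitly fails the assertion below, which
    # is the breakage the review warns external integrations about.
    max_num_tokens: int = 0

    def __post_init__(self) -> None:
        assert self.max_num_tokens > 0, "max_num_tokens must be set explicitly"


ok = FusedMoEConfigSketch(num_experts=8, max_num_tokens=8192)
# FusedMoEConfigSketch(num_experts=8)  # would raise AssertionError
```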
Signed-off-by: Bill Nell <bnell@redhat.com>
We should set the default max-num-batched-tokens to something smaller if we detect deepep-ll.
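A rough sketch of one way to do that. The backend string follows vLLM's `VLLM_ALL2ALL_BACKEND` naming, but the helper function and both cap values are illustrative assumptions, not vLLM code:

```python
# Hypothetical config-time adjustment: DeepEP low-latency kernels are sized
# for small decode batches, so shrink the default token budget when that
# backend is detected. The cap below is a placeholder, not a vLLM constant.
DEEPEP_LL_TOKEN_CAP = 256
GENERAL_DEFAULT = 8192  # illustrative general default


def default_max_num_batched_tokens(all2all_backend: str,
                                   user_value: int | None = None) -> int:
    if user_value is not None:
        return user_value  # never override an explicit user setting
    if all2all_backend == "deepep_low_latency":
        return DEEPEP_LL_TOKEN_CAP
    return GENERAL_DEFAULT


print(default_max_num_batched_tokens("deepep_low_latency"))  # 256
```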
Signed-off-by: Bill Nell <bnell@redhat.com>
Signed-off-by: Bill Nell <bnell@redhat.com>
LGTM. @elvircrn can you do a sanity check on gb?
Shouldn't this also delete the ChunkingMoERunner file?
Triggering a full CI run now.
I thought I did. Thanks for reminding me.
Signed-off-by: Bill Nell <bnell@redhat.com>
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: Bill Nell <bnell@redhat.com>
Signed-off-by: Bill Nell <bnell@redhat.com>
robertgshaw2-redhat merged commit e1e318a into vllm-project:main
This reverts commit e1e318a.
Signed-off-by: Bill Nell <bnell@redhat.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com> Signed-off-by: zengxian <xiangdong.zeng@intel.com>
Signed-off-by: Bill Nell <bnell@redhat.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Signed-off-by: Bill Nell <bnell@redhat.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com> Signed-off-by: Avinash Singh <avinashsingh.rcoem@gmail.com>
**Commit range:** `6f786f2`..`d886c26`

1. Fix 'DPMetadata' object has no attribute 'max_tokens_across_dp_cpu' by vllm-project/vllm#39107
2. Fix 'Indexer' object has no attribute 'wk' by vllm-project/vllm#38928
3. Fix 'float' object has no attribute 'language_model' by vllm-project/vllm#39240

### What this PR does / why we need it?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.19.0
- vLLM main: vllm-project/vllm@6f786f2

Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
Signed-off-by: Meihan-chen <zr010426ztt@outlook.com>
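For downstream code hit by the first error above, one defensive pattern is to probe for the removed attribute and fall back to the scheduler budget. The helper below is an illustrative assumption, not code from vLLM or the referenced plugin; the attribute name comes from the error message:

```python
def max_tokens_across_dp(dp_metadata, fallback: int) -> int:
    # DPMetadata lost 'max_tokens_across_dp_cpu' in vllm-project/vllm#39107;
    # probe for it and fall back to a caller-supplied bound, e.g. the
    # scheduler's max_num_batched_tokens.
    value = getattr(dp_metadata, "max_tokens_across_dp_cpu", None)
    if value is None:
        return fallback
    # Handle both plain ints and 0-d tensors exposing .item().
    return int(value.item()) if hasattr(value, "item") else int(value)
```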
Purpose
Remove the DP chunking MoE runner. Use `max_num_batched_tokens` as the default for `max_num_tokens` in `FusedMoEConfig`.
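A minimal sketch of the new default wiring, using hypothetical stand-in classes (the real vLLM config objects carry many more fields):

```python
from dataclasses import dataclass


@dataclass
class SchedulerConfigSketch:
    max_num_batched_tokens: int  # the scheduler's per-step token budget


@dataclass
class MoEConfigSketch:
    num_experts: int
    max_num_tokens: int


def build_moe_config(sched: SchedulerConfigSketch,
                     num_experts: int) -> MoEConfigSketch:
    # The scheduler never feeds more than max_num_batched_tokens through a
    # forward pass, so it is a safe upper bound for MoE workspace sizing,
    # replacing the removed VLLM_MOE_DP_CHUNK_SIZE environment variable.
    return MoEConfigSketch(num_experts=num_experts,
                           max_num_tokens=sched.max_num_batched_tokens)
```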
Test Plan
CI
Ran the DeepEP-related tests under tests/kernels/moe locally.
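Roughly reproducible with something like the following; the `-k` filter is a guess at the relevant test selection, and a vLLM dev checkout is assumed:

```python
# Run the MoE kernel tests matching "deepep" from the repo root.
import pytest

pytest.main(["tests/kernels/moe", "-k", "deepep"])
```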
Test Result
cc @robertgshaw2-redhat, @tlrmchlsmth