[MoE Refactor] Turn ChunkingMoERunner into a wrapper so it can be used with any MoERunner subclass. #35559

Closed
bnellnm wants to merge 33 commits into vllm-project:main from neuralmagic:moe-runner-5

Conversation

@bnellnm
Collaborator

@bnellnm bnellnm commented Feb 27, 2026

Purpose

Turn ChunkingMoERunner into a wrapper so it can be used with any MoERunner subclass.
Note: _maybe_dispatch and _maybe_combine must be no-ops when using ChunkingMoERunner.
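
A minimal sketch of the wrapper idea described above, not the code from this PR. The `MoERunner` interface here is a hypothetical stand-in with a single `forward` method, `DoublingRunner` is a toy subclass invented for illustration, and plain Python lists stand in for tensors. The sketch does not model `_maybe_dispatch` or `_maybe_combine`; it only shows how a chunking runner can wrap any other runner by composition instead of subclassing one specific implementation.

```python
from abc import ABC, abstractmethod
from typing import List

Row = List[float]  # stand-in for one token's hidden state (assumption)


class MoERunner(ABC):
    """Hypothetical minimal runner interface; not vLLM's actual class."""

    @abstractmethod
    def forward(self, hidden_states: List[Row]) -> List[Row]:
        ...


class DoublingRunner(MoERunner):
    """Toy runner standing in for any concrete MoERunner subclass."""

    def __init__(self) -> None:
        self.calls = 0  # counts how many times forward() was invoked

    def forward(self, hidden_states: List[Row]) -> List[Row]:
        self.calls += 1
        return [[2.0 * x for x in row] for row in hidden_states]


class ChunkingMoERunner(MoERunner):
    """Wraps another runner and feeds it the input in fixed-size chunks."""

    def __init__(self, inner: MoERunner, chunk_size: int) -> None:
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        self.inner = inner
        self.chunk_size = chunk_size

    def forward(self, hidden_states: List[Row]) -> List[Row]:
        out: List[Row] = []
        # Process the rows chunk by chunk and concatenate the results in order.
        for start in range(0, len(hidden_states), self.chunk_size):
            chunk = hidden_states[start:start + self.chunk_size]
            out.extend(self.inner.forward(chunk))
        return out


if __name__ == "__main__":
    inner = DoublingRunner()
    runner = ChunkingMoERunner(inner, chunk_size=2)
    print(runner.forward([[1.0], [2.0], [3.0], [4.0], [5.0]]))
    print(inner.calls)  # five rows with chunk_size=2 give three inner calls
```

Because the wrapper only depends on the `forward` method of the wrapped object, any runner implementing that interface can be chunked without changes to its own class.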

Test Plan

Test Result



Signed-off-by: Bill Nell <bnell@redhat.com>
@mergify mergify Bot added the nvidia label Feb 27, 2026
@mergify
Contributor

mergify Bot commented Feb 27, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @bnellnm.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request is a significant and well-executed refactoring of the Mixture of Experts (MoE) implementation. Turning ChunkingMoERunner into a wrapper and integrating zero expert functionality directly into FusedMoE and a new ZeroExpertRouter greatly improves modularity and code clarity. The new MoERunner and SharedExperts abstractions are also excellent design choices. I've identified one critical issue where an attribute is accessed before initialization, which would cause a runtime error. A code suggestion is provided to fix this. Apart from that, the changes look solid and are a great improvement to the codebase.

Comment thread vllm/model_executor/layers/fused_moe/layer.py
Signed-off-by: Bill Nell <bnell@redhat.com>
@bnellnm
Collaborator Author

bnellnm commented Mar 2, 2026

combined with #35326

@bnellnm bnellnm closed this Mar 2, 2026
@github-project-automation github-project-automation Bot moved this to Done in NVIDIA Mar 2, 2026