[MoE Refactor] Turn ChunkingMoERunner into a wrapper so it can be used with any MoERunner subclass. #35559
bnellnm wants to merge 33 commits into vllm-project:main
Signed-off-by: Bill Nell <bnell@redhat.com>
This pull request has merge conflicts that must be resolved before it can be merged.
Code Review
This pull request is a significant and well-executed refactoring of the Mixture of Experts (MoE) implementation. Turning ChunkingMoERunner into a wrapper and integrating zero-expert functionality directly into FusedMoE and a new ZeroExpertRouter greatly improves modularity and code clarity. The introduction of the MoERunner and SharedExperts abstractions is also an excellent design choice. I've identified one critical issue where an attribute is accessed before it is initialized, which would cause a runtime error. A code suggestion is provided to fix this. Apart from that, the changes look solid and are a great improvement to the codebase.
Combined with #35326.
Purpose
Turn ChunkingMoERunner into a wrapper so it can be used with any MoERunner subclass.
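For readers unfamiliar with the pattern, here is a minimal sketch of what "wrapper" means in this context. It is not the code in this PR: the `Runner` protocol, `ToyRunner`, and `ChunkingWrapper` are hypothetical names, the `forward` signature is assumed, and plain Python lists stand in for tensors. The point is only that the chunking logic holds a reference to an inner runner and delegates to it, rather than being a subclass of one particular runner.

```python
from typing import Protocol, Sequence


class Runner(Protocol):
    """Hypothetical stand-in for a runner interface (not vLLM's real API)."""

    def forward(self, tokens: Sequence[float]) -> list[float]: ...


class ToyRunner:
    """Toy inner runner: doubles each token and records the batch sizes it sees."""

    def __init__(self) -> None:
        self.seen_batch_sizes: list[int] = []

    def forward(self, tokens: Sequence[float]) -> list[float]:
        self.seen_batch_sizes.append(len(tokens))
        return [2.0 * t for t in tokens]


class ChunkingWrapper:
    """Wraps any object with a compatible forward() and feeds it fixed-size chunks."""

    def __init__(self, inner: Runner, chunk_size: int) -> None:
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        self.inner = inner
        self.chunk_size = chunk_size

    def forward(self, tokens: Sequence[float]) -> list[float]:
        out: list[float] = []
        # Walk the input in steps of chunk_size; the last chunk may be shorter.
        for start in range(0, len(tokens), self.chunk_size):
            chunk = tokens[start:start + self.chunk_size]
            out.extend(self.inner.forward(chunk))
        return out


inner = ToyRunner()
wrapped = ChunkingWrapper(inner, chunk_size=4)
result = wrapped.forward([float(i) for i in range(10)])
```

Because the wrapper depends only on the `forward` method, any other runner exposing the same method could be passed as `inner` without changing the chunking code.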
Note: _maybe_dispatch and _maybe_combine must be no-ops when using ChunkingMoERunner.
Test Plan
Test Result