[Misc][Refactor] Add FusedMoERouter object#30519
robertgshaw2-redhat merged 7 commits into vllm-project:main
Conversation
Code Review
This pull request introduces a FusedMoERouter abstraction to decouple the expert routing logic from the FusedMoE layer. This is a significant refactoring that touches many files across the Mixture-of-Experts (MoE) implementation, including various quantization methods. The changes are applied consistently, with the apply methods of FusedMoEMethodBase subclasses now accepting a router object. A default implementation, FusedMoERouterImpl, is provided to delegate routing calls back to the FusedMoE layer, ensuring no functional changes in this PR. This is a solid architectural improvement that paves the way for easier implementation of new routing strategies. The changes are well-executed and I found no issues.
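Based on that description, the shape of the abstraction might look roughly like the following sketch (everything beyond the names `FusedMoERouter`, `FusedMoERouterImpl`, and `select_experts` is an assumption for illustration, not the actual vLLM code):

```python
from abc import ABC, abstractmethod


class FusedMoERouter(ABC):
    """Abstract expert-routing interface, decoupled from the FusedMoE layer."""

    @abstractmethod
    def select_experts(self, hidden_states, router_logits):
        """Return (topk_weights, topk_ids) for each token."""


class FusedMoERouterImpl(FusedMoERouter):
    """Default router: delegates back to the FusedMoE layer so this
    refactor introduces no functional change."""

    def __init__(self, layer):
        self.layer = layer  # the FusedMoE layer that owns the routing config

    def select_experts(self, hidden_states, router_logits):
        # Delegation preserves the pre-refactor routing behavior exactly.
        return self.layer._select_experts(hidden_states, router_logits)
```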
Force-pushed from 40201d7 to 5fab8b3
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
This pull request has merge conflicts that must be resolved before it can be merged.
Force-pushed from 8886257 to a8932a6
Hi @bnellnm, the pre-commit checks have failed. Please run:

    uv pip install pre-commit
    pre-commit install
    pre-commit run --all-files

Then, commit the changes and push to your branch.
1 similar comment
This pull request has merge conflicts that must be resolved before it can be merged.
Force-pushed from d793a89 to 0d15e49
Hi @bnellnm, the pre-commit checks have failed. Please run:

    uv pip install pre-commit
    pre-commit install
    pre-commit run --all-files

Then, commit the changes and push to your branch.
@@ -293,6 +294,23 @@ def maybe_roundup_hidden_size(
        return hidden_size

    class FusedMoERouterImpl(FusedMoERouter):
Is the idea that eventually we will have N of these, 1 for each routing_method_type?
And then that FusedMoERouter would not have the layer as an attr?
this generally looks good to me. The pre-commit failure is a real error.

To confirm, the long-term plan is something like this:

    class DeepSeekV2MoELayer:
        - gate()
        - sharedexpert()
        - router()   # so the "router" is not owned by the MoEMethod
        - experts()
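Fleshed out, that long-term layer shape might look something like the sketch below (purely illustrative; the constructor and `forward` signature are assumptions, not a proposed vLLM API):

```python
class DeepSeekV2MoELayer:
    """Hypothetical MoE layer where routing is owned by the layer,
    not by the quantization MoEMethod."""

    def __init__(self, gate, shared_expert, router, experts):
        self.gate = gate                  # produces router logits
        self.shared_expert = shared_expert
        self.router = router              # a FusedMoERouter-style object
        self.experts = experts            # fused expert computation

    def forward(self, hidden_states):
        router_logits = self.gate(hidden_states)
        # Routing is delegated to the router object, not the MoEMethod.
        topk_weights, topk_ids = self.router.select_experts(
            hidden_states, router_logits)
        out = self.experts(hidden_states, topk_weights, topk_ids)
        return out + self.shared_expert(hidden_states)
```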
This pull request has merge conflicts that must be resolved before it can be merged.
LGTM
This pull request has merge conflicts that must be resolved before it can be merged.
Hi @bnellnm, the pre-commit checks have failed. Please run:

    uv pip install pre-commit
    pre-commit install
    pre-commit run --all-files

Then, commit the changes and push to your branch.
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: Bill Nell <bnell@redhat.com>
Head branch was pushed to by a user without write access
Force-pushed from 35dd140 to a3deb62
Signed-off-by: Bill Nell <bnell@redhat.com>
Signed-off-by: Bill Nell <bnell@redhat.com>
Signed-off-by: Bill Nell <bnell@redhat.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
Signed-off-by: Bill Nell <bnell@redhat.com>
Purpose
Add an abstract `FusedMoERouter` class that provides `select_experts`. A concrete subclass wraps `FusedMoE._select_experts` and is passed to all the quantization methods instead of having them call `FusedMoE._select_experts` directly. This is a step toward teasing the routing logic out of `FusedMoE.apply`.

See #28408
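In rough terms, a quantization method's `apply` now receives a router argument rather than doing the routing itself. A hedged sketch of that call shape (the class name is from vLLM, but the signature and `fused_experts` helper here are illustrative assumptions):

```python
class UnquantizedFusedMoEMethod:
    """Illustrative quant method whose apply() takes a router object
    instead of calling FusedMoE._select_experts directly."""

    def apply(self, layer, hidden_states, router_logits, router):
        # Routing is delegated to the passed-in FusedMoERouter.
        topk_weights, topk_ids = router.select_experts(
            hidden_states, router_logits)
        return self.fused_experts(hidden_states, topk_weights, topk_ids)

    def fused_experts(self, hidden_states, topk_weights, topk_ids):
        # Placeholder standing in for the actual fused expert kernel.
        return hidden_states
```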
Test Plan
Existing CI
Test Result
cc @robertgshaw2-redhat
Essential Elements of an Effective PR Description Checklist
`supported_models.md` and `examples` for a new model.