[Misc][Refactor] Decouple quant methods from FusedMoE #29505

Closed
bnellnm wants to merge 4 commits into vllm-project:main from neuralmagic:fused-moe-params

Conversation

bnellnm (Collaborator) commented Nov 26, 2025

Purpose

Depends on #30519

Test Plan

Test Result


mergify bot commented Dec 9, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @bnellnm.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

robertgshaw2-redhat (Collaborator) commented

Generally looks like it's on the right track. I think the next step would be to move the router outside of the fused MoE class and call it directly.

robertgshaw2-redhat (Collaborator) commented

But let's do that in the follow-up. The FlashInfer case (with the fused router) is not ideal.

mergify bot commented Dec 11, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @bnellnm.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify bot added the needs-rebase label Dec 11, 2025
Signed-off-by: Bill Nell <bnell@redhat.com>
mergify bot removed the needs-rebase label Dec 11, 2025
bnellnm changed the title from "[Misc][Refactor] Decouple quant methods from FusedMoE, add FusedMoERouter object" to "[Misc][Refactor] Decouple quant methods from FusedMoE" Dec 11, 2025
bnellnm (Collaborator, Author) commented Dec 11, 2025

> But let's do that in the follow-up. The FlashInfer case (with the fused router) is not ideal.

Yeah, that makes it a bit more awkward.

mergify bot commented Dec 16, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @bnellnm.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify bot added the needs-rebase label Dec 16, 2025
github-project-automation bot moved this to Backlog in MoE Refactor Jan 9, 2026
robertgshaw2-redhat moved this from Backlog to In progress in MoE Refactor Jan 9, 2026
robertgshaw2-redhat moved this from In progress to Backlog in MoE Refactor Jan 9, 2026
bnellnm closed this Mar 13, 2026
github-project-automation bot moved this from Backlog to Done in MoE Refactor Mar 13, 2026
Projects

Status: Done

2 participants