[MoE Refactor] FusedMoE/MoERunner inversion refactor #41184
Signed-off-by: Bill Nell <bnell@redhat.com>
Sorry, I'll clarify. By elevate I meant move it up one level in the model structure. So that Currently, It would be better if the
Ok, I'm not opposed to that but I think it would be too much for this PR.
Ok, happy for it to be a follow up. It would simplify model loading and the Transformers modelling backend if we did this.
This pull request has merge conflicts that must be resolved before it can be |
cc @divakar-amd can you review this PR too?
Purpose
Invert the `MoERunner` <-> `FusedMoE` relationship:
- The `MoERunner` will own the `FusedMoE` (renamed to `RoutedExperts`).
- The `FusedMoE` class will go away.

Some model weight loading code needed updating since the paths for MoE weights now have an extra level, e.g. `.experts.<foo>` is now `.experts.routed_experts.<foo>`.

Based on the following PRs:
#41997 - Move capture state out of FusedMoE
cc @yzong-rh
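
For readers updating out-of-tree loaders, the path change described above (`.experts.<foo>` becoming `.experts.routed_experts.<foo>`) can be sketched as a name remap. This is a minimal illustration only: `remap_moe_weight_name` is a hypothetical helper, not a function from this PR, and the parameter names in the usage lines (`w13_weight`, `w2_weight`) are assumed for the example. It also assumes every parameter directly under `.experts.` moves to the new level, which may not hold for every model.

```python
import re

# Hypothetical helper (not part of this PR): matches an `experts.` path
# component that is not already followed by `routed_experts.`.
_EXPERTS_RE = re.compile(r"(^|\.)experts\.(?!routed_experts\.)")


def remap_moe_weight_name(name: str) -> str:
    """Insert the `routed_experts` level after the first `experts.` component.

    Names that already contain the new level, or that have no `experts`
    component (e.g. `shared_experts`), are returned unchanged.
    """
    return _EXPERTS_RE.sub(
        lambda m: f"{m.group(1)}experts.routed_experts.", name, count=1
    )


# Illustrative names only; real checkpoints may use different suffixes.
old_name = "model.layers.0.mlp.experts.w13_weight"
new_name = remap_moe_weight_name(old_name)
print(new_name)
```

The remap is idempotent, so applying it to an already-updated name leaves it as-is; that makes it safe to run over a mixed set of old and new parameter names.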
Test Plan
CI + MoE refactoring tests
Run all MoE layer tests (including SP tests from #41299)
Run tests from #39956
Ran model loading tests for the following models:
Test Result
Waiting for CI results
All model loading tests passed except for the following which I was unable to verify (due to OOM or other issues):