[Bug] Fix fp8 trtllm MoE modular kernel supported routing methods #37346
mgoin merged 2 commits into vllm-project:main
Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>
Code Review
This pull request fixes a regression by moving `_supports_routing_method` out of the `TrtLlmFp8ExpertsBase` class and into the `TrtLlmFp8ExpertsMonolithic` subclass. With this change, the `TrtLlmFp8ExpertsModular` kernel inherits the default permissive behavior and therefore supports all routing methods. The `_supports_quant_scheme` method has also been moved into the subclasses, so the monolithic and modular implementations each declare their own supported quantization schemes. The logic appears sound and I have not identified any issues with the implementation.
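The class layout described in this review can be sketched roughly as follows. This is a simplified illustration, not the vLLM source: the `RoutingMethod` enum, its members, the `ExpertsBase` parent, the `_SUPPORTED` set, and the method signature are all hypothetical stand-ins. Only the three `TrtLlmFp8Experts*` class names and `_supports_routing_method` come from the PR.

```python
from enum import Enum, auto


class RoutingMethod(Enum):
    # Hypothetical labels for illustration; not vLLM's real enum.
    RENORMALIZE = auto()
    DEEPSEEK_V3 = auto()
    CUSTOM = auto()


class ExpertsBase:
    """Hypothetical generic parent whose default is permissive."""

    @staticmethod
    def _supports_routing_method(method: RoutingMethod) -> bool:
        return True


class TrtLlmFp8ExpertsBase(ExpertsBase):
    """Shared FP8 logic. After the fix, no routing restriction lives here."""


class TrtLlmFp8ExpertsMonolithic(TrtLlmFp8ExpertsBase):
    """Routing runs inside the kernel, so only some methods are accepted."""

    # Assumed set; the real supported list is defined in the PR's code.
    _SUPPORTED = frozenset({RoutingMethod.RENORMALIZE, RoutingMethod.DEEPSEEK_V3})

    @staticmethod
    def _supports_routing_method(method: RoutingMethod) -> bool:
        return method in TrtLlmFp8ExpertsMonolithic._SUPPORTED


class TrtLlmFp8ExpertsModular(TrtLlmFp8ExpertsBase):
    """Routing runs outside the kernel; the permissive default is inherited."""
```

Before the fix, the restriction sat on `TrtLlmFp8ExpertsBase`, so the modular subclass inherited it even though it never performs routing itself. Placing the override on the monolithic subclass keeps the restriction where it is needed.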
cc @EdalatiAli
Testing the command in #35448 throws an error (on main as well): @EdalatiAli Can you check if I am doing anything wrong?
@mgoin Actually just realized that
Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>
@wzhao18 Thanks for catching and fixing this. #35448 introduced the regression by adding
I'm currently investigating the issue.
@wzhao18 This error is raised because
@EdalatiAli Got it. Thanks! Adding
@mgoin could you help review this one?
…lm-project#37346) Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>
…lm-project#37346) Signed-off-by: wzhao18 <wzhao18.sz@gmail.com> Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
Purpose
#35448 introduced a regression by adding `_supports_routing_method` to `TrtLlmFp8ExpertsBase`. However, this function is only needed for the monolithic kernel, not the modular kernel, which should support any routing method because routing is done outside the kernel. This PR fixes the regression.

Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
`supported_models.md` and `examples` for a new model.