Revert "feat(moe): Add is_act_and_mul=False support for Triton MoE kernels"#31978
Conversation
…rnels (#…" This reverts commit 25eef3d.
Code Review
This pull request reverts the is_act_and_mul=False feature for Triton MoE kernels. The changes are mostly correct, but the revert appears to be incomplete: there is dead code left behind in vllm/model_executor/layers/fused_moe/layer.py that should be removed to finalize the revert. Additionally, to fully remove the feature, the is_act_and_mul attribute should also be removed from FusedMoEConfig in vllm/model_executor/layers/fused_moe/config.py.
```diff
 if not current_platform.is_cuda():
     raise NotImplementedError(
-        "is_act_and_mul=False is supported only for CUDA, or ROCm "
-        "(when AITER MoE is disabled) for now"
+        "is_act_and_mul=False is supported only for CUDA for now"
     )
```
The surrounding if not self.moe_config.is_act_and_mul: block (starting at line 584) is now dead code because this PR reverts the is_act_and_mul=False feature. With this revert, self.moe_config.is_act_and_mul will always be True. To complete the revert and improve code clarity, this entire if block (lines 584-607) should be removed. As a follow-up, the is_act_and_mul attribute should also be removed from FusedMoEConfig in vllm/model_executor/layers/fused_moe/config.py.
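A minimal sketch of the dead-code pattern the review describes, using hypothetical stand-in names (MoEConfig, select_kernel) rather than vLLM's actual classes: once the feature is reverted, nothing ever sets is_act_and_mul to False, so the guard can never fire.

```python
from dataclasses import dataclass


@dataclass
class MoEConfig:
    # Hypothetical stand-in for FusedMoEConfig after the revert:
    # the field is no longer configurable and is always True.
    is_act_and_mul: bool = True


def select_kernel(cfg: MoEConfig) -> str:
    if not cfg.is_act_and_mul:
        # Dead branch: with the feature reverted, no code path sets
        # is_act_and_mul to False, so this can never execute.
        raise NotImplementedError(
            "is_act_and_mul=False is supported only for CUDA for now"
        )
    return "fused_act_and_mul"


print(select_kernel(MoEConfig()))  # always takes the fused path
```

Deleting both the unreachable branch and the now-constant config field (rather than only one of them) keeps the config and the kernel-selection logic consistent.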
tjtanaa left a comment
Thank you for bringing this up. I didn't notice there was a parallel PR.
…rnels" (vllm-project#31978) Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
Reverts #31645; my reasoning is here: #31645 (comment)