CustomOp: grouped topk #29575
Conversation
Code Review
This pull request introduces a mechanism for platforms to provide an optimized grouped_topk operation. The changes are a good step towards platform-specific optimizations. However, I've identified a critical issue with the new interface in Platform that could lead to a runtime AttributeError. My feedback includes a suggestion to make the interface more robust and prevent this potential crash.
vllm/platforms/interface.py
Outdated
```python
def has_optimized_grouped_topk(cls) -> bool:
    """
    Return if current platform has optimized grouped_topk op.
    """
    return False
```
If this method returns True, the calling code in FusedMoE.select_experts will access current_platform.grouped_topk. This implicit contract is fragile and will cause an AttributeError if a platform subclass returns True but does not define the grouped_topk attribute.
To make the interface more robust, this check should be tied to the presence of the grouped_topk attribute. This also simplifies platform implementations, as they only need to define grouped_topk to enable the optimized path.
Suggested change:

```python
def has_optimized_grouped_topk(cls) -> bool:
    """
    Return if current platform has optimized grouped_topk op.
    """
    return hasattr(cls, "grouped_topk") and cls.grouped_topk is not None
```
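As a minimal illustration of the fragility being described, the sketch below contrasts the two designs. The class names here are simplified stand-ins, not the real vLLM `Platform` hierarchy:

```python
# Minimal sketch of the attribute-based capability check suggested above.
# `Platform` and `MyAcceleratorPlatform` are illustrative stand-ins, not
# the actual vLLM classes.

class Platform:
    grouped_topk = None  # platforms override this with an optimized op

    @classmethod
    def has_optimized_grouped_topk(cls) -> bool:
        # Robust variant: the check is tied to the attribute itself, so a
        # subclass cannot report True without actually providing the op.
        return getattr(cls, "grouped_topk", None) is not None


class MyAcceleratorPlatform(Platform):
    @staticmethod
    def grouped_topk(scores):
        # Placeholder for a platform-optimized implementation.
        return sorted(scores, reverse=True)[:2]


print(Platform.has_optimized_grouped_topk())               # False
print(MyAcceleratorPlatform.has_optimized_grouped_topk())  # True
```

With this shape, a platform enables the optimized path simply by defining `grouped_topk`; there is no second flag that can drift out of sync with the attribute.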
cc @Yikun I think it might be a bit excessive to have a

Yeah let's make this a

Yep, agree.

thx, let me take a try.
Force-pushed: a64cee4 → c149b12 → f5c8b46 → ee1c59d
@LucasWilkinson @Yikun @ProExpertProg Hi, re-implemented with CustomOp, please review again, thx.
Signed-off-by: Xinyu Chen <xinyu1.chen@intel.com>
Force-pushed: ee1c59d → 295c811
Yikun left a comment
Overall looks good, thanks.
@MengqingCao Do you have time to do an e2e confirmation next Monday? Thanks.
Yep, I'll run a test on vllm-ascend today. BTW, I reviewed this PR and noticed that we could also absorb the logic on

Great ideas. However, I don't have an AMD platform to validate on. Can we make it happen after this PR?

Yep, I'm okay with that. Overall LGTM!
Yikun left a comment
LGTM, we'd better land this in 2025 😁.
Release cut time: 12.15 (0.13.0) .
@LucasWilkinson @ProExpertProg Any concern?
Hourly fixes:
- CustomOp: grouped topk #647 - depends on vllm-project/vllm#29575
- Fix HpuCommunicator.dispatch #732 - fix for upstream changes: https://github.com/vllm-project/vllm/pull/30014/files

Signed-off-by: Iryna Boiko <iboiko@habana.ai>
@xinyu-intel can you follow up with a proper unit test for custom op GroupedTopk? Because we need to modify the

sure, will do.
Purpose
Plugins can register a grouped_topk op for DeepSeek-V2/V3 workloads.