CustomOp: grouped topk#29575

Merged
DarkLight1337 merged 1 commit into vllm-project:main from xinyu-intel:dev/xinyu/grouped_topk on Dec 17, 2025
Conversation

@xinyu-intel (Contributor) commented Nov 27, 2025

Purpose

Allow plugins to register a grouped_topk op for deepseek_v2/3 workloads.

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a mechanism for platforms to provide an optimized grouped_topk operation. The changes are a good step towards platform-specific optimizations. However, I've identified a critical issue with the new interface in Platform that could lead to a runtime AttributeError. My feedback includes a suggestion to make the interface more robust and prevent this potential crash.

Comment on lines +657 to +661

    def has_optimized_grouped_topk(cls) -> bool:
        """
        Return if current platform has optimized grouped_topk op.
        """
        return False

critical

If this method returns True, the calling code in FusedMoE.select_experts will access current_platform.grouped_topk. This implicit contract is fragile and will cause an AttributeError if a platform subclass returns True but does not define the grouped_topk attribute.

To make the interface more robust, this check should be tied to the presence of the grouped_topk attribute. This also simplifies platform implementations, as they only need to define grouped_topk to enable the optimized path.

Suggested change:

    # Before
    def has_optimized_grouped_topk(cls) -> bool:
        """
        Return if current platform has optimized grouped_topk op.
        """
        return False

    # After
    def has_optimized_grouped_topk(cls) -> bool:
        """
        Return if current platform has optimized grouped_topk op.
        """
        return hasattr(cls, "grouped_topk") and cls.grouped_topk is not None
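A minimal sketch of the pattern this suggestion describes, using simplified stand-in classes (`Platform` here is a toy, not vLLM's real `Platform`, and `MyAcceleratorPlatform` is hypothetical): the capability check is derived from the presence of the attribute itself, so a caller can never hit an `AttributeError`.

```python
class Platform:
    grouped_topk = None  # no optimized kernel by default

    @classmethod
    def has_optimized_grouped_topk(cls) -> bool:
        # True only when a subclass actually provides the op, so the
        # caller can never access a missing cls.grouped_topk.
        return getattr(cls, "grouped_topk", None) is not None


class MyAcceleratorPlatform(Platform):
    @staticmethod
    def grouped_topk(scores, k):
        # Stand-in for a platform-specific kernel: plain top-k here.
        return sorted(scores, reverse=True)[:k]


print(Platform.has_optimized_grouped_topk())               # False
print(MyAcceleratorPlatform.has_optimized_grouped_topk())  # True
print(MyAcceleratorPlatform.grouped_topk([0.1, 0.9, 0.5], 2))  # [0.9, 0.5]
```

With this shape, a platform only has to define `grouped_topk` to opt into the optimized path; the boolean check can no longer drift out of sync with the attribute.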

@LucasWilkinson (Collaborator)

cc @Yikun

I think it might be a bit excessive to have a has_... plugin style for all ops; not sure if this would be a good use for CustomOp (cc @ProExpertProg).

@ProExpertProg (Collaborator)

Yeah let's make this a CustomOp instead

@Yikun (Member) commented Nov 29, 2025

Yep, agree.

@xinyu-intel (Author)

> cc @Yikun
> I think it might be a bit excessive to have a has_... plugin style for all ops; not sure if this would be a good use for CustomOp (cc @ProExpertProg).

thx, let me give it a try.

@xinyu-intel xinyu-intel force-pushed the dev/xinyu/grouped_topk branch from a64cee4 to c149b12 Compare December 4, 2025 01:52
@xinyu-intel xinyu-intel changed the title platform: optimized grouped topk op CustomOp: grouped topk Dec 4, 2025
@xinyu-intel xinyu-intel force-pushed the dev/xinyu/grouped_topk branch from c149b12 to f5c8b46 Compare December 4, 2025 02:06
@xinyu-intel xinyu-intel force-pushed the dev/xinyu/grouped_topk branch from f5c8b46 to ee1c59d Compare December 4, 2025 02:07
@xinyu-intel (Author)

@LucasWilkinson @Yikun @ProExpertProg Hi, re-implemented with CustomOp, please review again, thx.
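The CustomOp approach can be sketched as a base class whose `forward` dispatches to a backend-specific `forward_<backend>` method, falling back to a reference `forward_native`. This is a hand-rolled toy (vLLM's real `CustomOp` differs in registration, backend selection, and naming); `backend` and the class names here are illustrative assumptions.

```python
class CustomOp:
    backend = "native"  # in vLLM this would be derived from the current platform

    def forward(self, *args, **kwargs):
        # Route to the backend-specific override when one exists,
        # otherwise fall back to the reference implementation.
        impl = getattr(self, f"forward_{self.backend}", self.forward_native)
        return impl(*args, **kwargs)

    def forward_native(self, *args, **kwargs):
        raise NotImplementedError


class GroupedTopk(CustomOp):
    def forward_native(self, scores, k):
        # Reference implementation: plain top-k over the scores.
        return sorted(scores, reverse=True)[:k]

    def forward_cuda(self, scores, k):
        # A platform would call its optimized kernel here; the sketch
        # just reuses the reference result.
        return self.forward_native(scores, k)


op = GroupedTopk()
print(op.forward([0.2, 0.8, 0.5], 2))  # [0.8, 0.5] via forward_native

op.backend = "cuda"
print(op.forward([0.2, 0.8, 0.5], 2))  # [0.8, 0.5] via forward_cuda
```

The benefit over the earlier `has_optimized_...` flag is that each platform plugin simply overrides its own `forward_<backend>` method instead of growing a parallel set of capability checks.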

@jikunshang jikunshang added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 4, 2025
Signed-off-by: Xinyu Chen <xinyu1.chen@intel.com>
@xinyu-intel xinyu-intel force-pushed the dev/xinyu/grouped_topk branch from ee1c59d to 295c811 Compare December 5, 2025 05:49
@Yikun (Member) left a comment

Overall looks good, thanks.

@MengqingCao Do you have time to do an e2e confirmation next Monday? Thanks.

@MengqingCao (Contributor)

> Overall looks good, thanks.
> @MengqingCao Do you have time to do an e2e confirmation next Monday? Thanks.

Yep, I'll run a test on vllm-ascend today.

BTW, I reviewed this PR and noticed that we could also absorb the ROCm grouped_topk logic into GroupedTopk, WDYT? @xinyu-intel
https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/layers/fused_moe/layer.py#L1579-L1585

@xinyu-intel (Author)

> Yep, I'll run a test on vllm-ascend today.
>
> BTW, I reviewed this PR and noticed that we could also absorb the ROCm grouped_topk logic into GroupedTopk, WDYT? @xinyu-intel https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/layers/fused_moe/layer.py#L1579-L1585

Great idea. However, I don't have an AMD platform to validate on. Can we make it happen after this PR?

@yewentao256 (Member) left a comment

@MengqingCao (Contributor)

> Great idea. However, I don't have an AMD platform to validate on. Can we make it happen after this PR?

Yep, I'm okay with that. Overall LGTM!

@Yikun (Member) left a comment

LGTM, we'd better land this in 2025 😁.

Release cut time: 12.15 (0.13.0) .

@LucasWilkinson @ProExpertProg Any concern?

@DarkLight1337 DarkLight1337 merged commit 3b1d440 into vllm-project:main Dec 17, 2025
51 checks passed
NickLucche pushed a commit to NickLucche/vllm that referenced this pull request Dec 17, 2025
Signed-off-by: Xinyu Chen <xinyu1.chen@intel.com>
adobrzyn pushed a commit to vllm-project/vllm-gaudi that referenced this pull request Dec 18, 2025
Hourly fixes:
CustomOp: grouped topk #647 - depends on
vllm-project/vllm#29575
Fix HpuCommunicator.dispatch #732 - This is fix for upstream changes:
https://github.com/vllm-project/vllm/pull/30014/files

Signed-off-by: Iryna Boiko <iboiko@habana.ai>
Majid-Taheri pushed a commit to Majid-Taheri/vllm that referenced this pull request Dec 23, 2025
Signed-off-by: Xinyu Chen <xinyu1.chen@intel.com>
Signed-off-by: Ubuntu <mjtaheri68@gmail.com>
@tjtanaa (Collaborator) commented Dec 30, 2025

@xinyu-intel can you follow up with a proper unit test for the GroupedTopk custom op? We need to modify the custom_op list to properly trigger the platform-dependent forward_ functions. The unit tests in this PR do not address that, so they will always call forward_native.

@xinyu-intel (Author)

> @xinyu-intel can you follow up with a proper unit test for the GroupedTopk custom op? We need to modify the custom_op list to properly trigger the platform-dependent forward_ functions. The unit tests in this PR do not address that, so they will always call forward_native.

Sure, will do.
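The kind of test @tjtanaa asks for can be sketched like this: assert that enabling a backend actually routes `forward` to the backend override instead of silently falling back to `forward_native`. The `CustomOp` base and `backend` switch below are simplified stand-ins, not vLLM's real config mechanism.

```python
class CustomOp:
    backend = "native"  # stand-in for the custom_op enablement config

    def forward(self, *args):
        impl = getattr(self, f"forward_{self.backend}", self.forward_native)
        return impl(*args)

    def forward_native(self, x):
        return ("native", x)


class GroupedTopk(CustomOp):
    def forward_cuda(self, x):
        return ("cuda", x)


def test_dispatch():
    op = GroupedTopk()
    # Default path: reference implementation.
    assert op.forward(1) == ("native", 1)
    # Enable the backend and confirm dispatch actually switches;
    # a test that skips this step would always exercise forward_native.
    op.backend = "cuda"
    assert op.forward(1) == ("cuda", 1)


test_dispatch()
print("dispatch test passed")
```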

dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
Signed-off-by: Xinyu Chen <xinyu1.chen@intel.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
rajanintel24 pushed a commit to rajanintel24/vllm-gaudi that referenced this pull request Feb 11, 2026 (vllm-project#735)

Hourly fixes:
CustomOp: grouped topk vllm-project#647 - depends on
vllm-project/vllm#29575
Fix HpuCommunicator.dispatch vllm-project#732 - This is fix for upstream changes:
https://github.com/vllm-project/vllm/pull/30014/files

Signed-off-by: Iryna Boiko <iboiko@habana.ai>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
Signed-off-by: Xinyu Chen <xinyu1.chen@intel.com>

Labels

ready ONLY add when PR is ready to merge/full CI is needed

9 participants