CustomOp: grouped topk #647

Closed
xinyu-intel wants to merge 1 commit into vllm-project:main from xinyu-intel:dev/xinyu/grouped_topk
@xinyu-intel
Contributor

@xinyu-intel xinyu-intel commented Nov 27, 2025

Copilot AI left a comment

Pull request overview

This PR adds an optimized grouped top-k operation for the Gaudi platform. Grouped top-k is the expert-selection step used by mixture-of-experts (MoE) models. The implementation chooses between two code paths depending on the number of tokens in the batch and supports an optional score correction bias.

  • Adds has_optimized_grouped_topk() method returning True to indicate platform support
  • Implements grouped_topk() method with scoring functions (softmax/sigmoid), group-based expert selection, and optional bias correction
  • Includes adaptive algorithm selection based on token count threshold (1024)
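
For readers unfamiliar with the operation, the selection logic summarized above can be sketched in plain Python for a single token. This is an illustrative sketch only, not the code from this PR: the function name `grouped_topk_sketch` and its parameters are hypothetical, the group-scoring rule (group maximum without bias, sum of the two best biased scores with bias) is an assumption modeled on common DeepSeek-style routing, and the adaptive 1024-token threshold is not reproduced.

```python
import math


def grouped_topk_sketch(logits, top_k, num_expert_group, topk_group,
                        scoring_func="softmax", bias=None, renormalize=True):
    """Pick top_k experts for ONE token, restricted to the best topk_group groups.

    Hypothetical helper for illustration; not the PR's Gaudi implementation.
    """
    num_experts = len(logits)
    assert num_experts % num_expert_group == 0
    group_size = num_experts // num_expert_group

    # 1. Turn router logits into scores.
    if scoring_func == "softmax":
        peak = max(logits)
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        scores = [e / total for e in exps]
    elif scoring_func == "sigmoid":
        scores = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    else:
        raise ValueError(f"unsupported scoring function: {scoring_func}")

    # 2. Assumption: the bias only affects WHICH experts are chosen,
    #    while the returned weights come from the unbiased scores.
    choice = scores if bias is None else [s + b for s, b in zip(scores, bias)]

    # 3. Score each group (assumed rule, see the lead-in).
    group_scores = []
    for g in range(num_expert_group):
        members = sorted(choice[g * group_size:(g + 1) * group_size], reverse=True)
        group_scores.append(members[0] if bias is None else sum(members[:2]))

    # 4. Keep the best groups, then take top_k experts inside them.
    kept = set(sorted(range(num_expert_group),
                      key=lambda g: group_scores[g], reverse=True)[:topk_group])
    candidates = [i for i in range(num_experts) if i // group_size in kept]
    topk_ids = sorted(candidates, key=lambda i: choice[i], reverse=True)[:top_k]

    # 5. Gather the unbiased weights and optionally renormalize them.
    weights = [scores[i] for i in topk_ids]
    if renormalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    return topk_ids, weights


# 8 experts in 4 groups of 2; only the single best group may contribute.
logits = [3.0, 0.0, 2.9, 2.8, 0.1, 0.2, 0.3, 0.4]
ids, weights = grouped_topk_sketch(logits, top_k=2, num_expert_group=4, topk_group=1)
print(ids)  # [0, 1]: expert 2 has the second-highest logit but sits in a dropped group
```

The example shows why the grouping matters: a plain top-2 over all experts would return experts 0 and 2, whereas the grouped variant first discards every group except the best one.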

4 comment threads on vllm_gaudi/platform.py (all outdated)
@github-actions

✅ CI Passed

All checks passed successfully against the following vllm commit:
0353d2e162cbda776d9dbfe026e65303204a7f1f

@xinyu-intel xinyu-intel force-pushed the dev/xinyu/grouped_topk branch from e191a3d to 0cb8f4f Compare December 4, 2025 01:54
@xinyu-intel xinyu-intel changed the title from "platform: optimize grouped topk op" to "CustomOp: grouped topk" Dec 4, 2025
@xinyu-intel xinyu-intel force-pushed the dev/xinyu/grouped_topk branch from 0cb8f4f to 2b24404 Compare December 4, 2025 02:18
@xuechendi
Collaborator

Is it possible to monkey-patch vllm.model_executor.layers.fused_moe.fused_moe.grouped_topk instead?

I think we can push for vllm-project/vllm#29575 afterwards, since upstream changes usually need some discussion and alignment.

@xuechendi xuechendi self-assigned this Dec 10, 2025
@xinyu-intel
Contributor Author

Is it possible to monkey-patch vllm.model_executor.layers.fused_moe.fused_moe.grouped_topk instead?

I think we can push for vllm-project/vllm#29575 afterwards, since upstream changes usually need some discussion and alignment.

#708

@xinyu-intel xinyu-intel force-pushed the dev/xinyu/grouped_topk branch 2 times, most recently from 635b660 to 7b996a9 Compare December 17, 2025 11:08
Signed-off-by: Xinyu Chen <xinyu1.chen@intel.com>
iboiko-habana added a commit to iboiko-habana/vllm-gaudi that referenced this pull request Dec 17, 2025


Signed-off-by: Iryna Boiko <iboiko@habana.ai>
@iboiko-habana
Collaborator

This change was merged as part of #735.

adobrzyn pushed a commit that referenced this pull request Dec 18, 2025
Hourly fixes:
CustomOp: grouped topk #647 - depends on vllm-project/vllm#29575
Fix HpuCommunicator.dispatch #732 - a fix for upstream changes: https://github.com/vllm-project/vllm/pull/30014/files

Signed-off-by: Iryna Boiko <iboiko@habana.ai>
iboiko-habana added a commit to iboiko-habana/vllm-gaudi that referenced this pull request Dec 19, 2025
iboiko-habana added a commit that referenced this pull request Dec 19, 2025
PatrykWo pushed a commit that referenced this pull request Dec 19, 2025
rajanintel24 pushed a commit to rajanintel24/vllm-gaudi that referenced this pull request Feb 11, 2026 (vllm-project#735)

Hourly fixes:
CustomOp: grouped topk vllm-project#647 - depends on vllm-project/vllm#29575
Fix HpuCommunicator.dispatch vllm-project#732 - a fix for upstream changes: https://github.com/vllm-project/vllm/pull/30014/files

Signed-off-by: Iryna Boiko <iboiko@habana.ai>

4 participants