Skip to content

[Refactor] Small refactor for group topk#30562

Merged
yewentao256 merged 5 commits intomainfrom
wentao-small-refactor
Dec 16, 2025
Merged

[Refactor] Small refactor for group topk#30562
yewentao256 merged 5 commits intomainfrom
wentao-small-refactor

Conversation

@yewentao256
Copy link
Copy Markdown
Member

@yewentao256 yewentao256 commented Dec 12, 2025

Purpose

Thanks for comments from @mgoin #30159 (review) and #29125 (review).

This PR fixes this.

The division could give us some performance benefits as well

Test

(wentao) wentao@dgxB200-09:~/vllm-source$ pytest tests/kernels/moe/test_grouped_topk.py 
================================= test session starts =================================
platform linux -- Python 3.12.3, pytest-9.0.1, pluggy-1.6.0
rootdir: /home/wentao/vllm-source
configfile: pyproject.toml
plugins: forked-1.6.0, anyio-4.11.0, timeout-2.4.0
collected 96 items                                                                    

tests/kernels/moe/test_grouped_topk.py ........................................ [ 41%]
........................................................                        [100%]

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== 96 passed, 2 warnings in 35.13s ===========================

Signed-off-by: yewentao256 <zhyanwentao@126.com>
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

@mergify mergify bot added the v1 label Dec 12, 2025
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a couple of refactorings in the grouped_topk CUDA kernel. The apply_scoring function is made more robust by explicitly handling SCORING_NONE and adding a static_assert for unsupported scoring functions, which is a good improvement. The main change is a performance optimization in group_idx_and_topk_idx_kernel where a division is moved out of a loop. While this improves performance, I've raised a concern about the change in the order of floating-point operations, which could potentially affect numerical precision and determinism. The removal of a commented-out line in a test file is a minor but good cleanup.

@yewentao256 yewentao256 added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 12, 2025
Copy link
Copy Markdown
Member

@mgoin mgoin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet, thank you!

@yewentao256 yewentao256 merged commit f21f5ea into main Dec 16, 2025
95 checks passed
@yewentao256 yewentao256 deleted the wentao-small-refactor branch December 16, 2025 19:51
NickLucche pushed a commit to NickLucche/vllm that referenced this pull request Dec 17, 2025
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Majid-Taheri pushed a commit to Majid-Taheri/vllm that referenced this pull request Dec 23, 2025
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Signed-off-by: Ubuntu <mjtaheri68@gmail.com>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready ONLY add when PR is ready to merge/full CI is needed v1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants