Temporary disable persistent topk #41442
ywang96 merged 1 commit into vllm-project:releases/v0.20.1
Conversation
Code Review
This pull request modifies the sparse_attn_indexer to remove the 1024 token size from the persistent top-k optimization on CUDA platforms. If the goal is to disable this optimization because of stability or correctness issues, it should likely be disabled for all supported sizes (512 and 2048 as well, not just 1024) so the workaround is consistent with the PR's objective.
```diff
 topk_indices = topk_indices_buffer[:num_padded_tokens, :topk_tokens]

-if current_platform.is_cuda() and topk_tokens in (512, 1024, 2048):
+if current_platform.is_cuda() and topk_tokens in (512, 2048):
```
The PR title 'Temporary disable persistent topk' suggests an intent to disable the persistent topk optimization entirely. However, the current implementation only removes the 1024 case, leaving it enabled for 512 and 2048. If the kernel is being disabled due to a general issue (e.g., stability or correctness), it should likely be disabled for all supported sizes to ensure the workaround is effective across all configurations.
This reverts commit a4debbd.

Signed-off-by: zixi-qi <zixi@inferact.ai>

Keep `topk_tokens == 1024` on the persistent_topk path on Blackwell (SM10x), but disable it on Hopper and other CUDA archs so the original revert (vllm-project#41442) behavior is preserved there.

Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: zixi-qi <zixi@inferact.ai>
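The commit message above describes an arch-gated dispatch: 1024 stays on the persistent top-k path only on Blackwell (SM 10.x), while 512 and 2048 remain enabled on all CUDA archs. A minimal sketch of that selection logic is below; `use_persistent_topk` and its `sm_major` parameter are illustrative stand-ins, not vLLM's actual API, which would query the device's compute capability via the platform layer.

```python
def use_persistent_topk(is_cuda: bool, sm_major: int, topk_tokens: int) -> bool:
    """Return True when the persistent top-k kernel should be taken.

    Hypothetical helper sketching the gating described in the commit
    message; sm_major is the CUDA compute-capability major version
    (9 = Hopper, 10 = Blackwell).
    """
    if not is_cuda:
        return False
    if topk_tokens in (512, 2048):
        # These sizes stay on the persistent path on every CUDA arch.
        return True
    # 1024 is kept only on Blackwell (SM 10.x); Hopper and other archs
    # fall back per the original revert (vllm-project#41442).
    return topk_tokens == 1024 and sm_major == 10

print(use_persistent_topk(True, 9, 1024))   # False: Hopper falls back
print(use_persistent_topk(True, 10, 1024))  # True: Blackwell keeps 1024
print(use_persistent_topk(True, 9, 512))    # True: 512 enabled everywhere
```

The design keeps the size check cheap and centralized, so reverting the Blackwell exception later only touches the final condition.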
No description provided.