[Optimization] Make new_block_ids None if empty #23262
Conversation
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
Code Review
This pull request introduces an optimization to avoid serializing empty block ID lists for decode requests, which occurs frequently. This is achieved by returning None instead of an empty list structure. The changes are consistently applied across the scheduler and worker components. The type hints have been updated to reflect the optional return value, and consumers of this data now correctly handle the None case. The refactoring to delay block ID generation is a clean approach. The code quality is high, and the changes appear correct and safe.
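The pattern the review describes can be sketched roughly as follows. This is a minimal illustration, not vLLM's actual code; the function and parameter names (`pack_new_block_ids`, `unpack_new_block_ids`, `num_groups`) are hypothetical:

```python
from typing import Optional

# Hypothetical sketch of the optimization: if a decode step allocated
# no new KV-cache blocks, every per-group block-ID list is empty, so
# the producer emits None instead of serializing the empty structure.

def pack_new_block_ids(
    new_block_ids: tuple[list[int], ...],
) -> Optional[tuple[list[int], ...]]:
    # All-empty per-group lists collapse to None before serialization.
    if all(len(ids) == 0 for ids in new_block_ids):
        return None
    return new_block_ids

def unpack_new_block_ids(
    packed: Optional[tuple[list[int], ...]],
    num_groups: int,
) -> tuple[list[int], ...]:
    # Consumers must now handle the None case explicitly, rebuilding
    # the empty per-group structure when nothing was sent.
    if packed is None:
        return tuple([] for _ in range(num_groups))
    return packed
```

The type hints mirror the review's point: the return value becomes `Optional`, and every consumer of the data has to branch on `None`.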
I don't think this optimization is necessary. And FYI, we now support mamba by setting
…ne if empty (#93) Culprit commit: vllm-project/vllm#23262 --------- Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
### What this PR does / why we need it?
1. Use actions/checkout@v5 instead of v4.
2. Remove the DBO test case because there is an issue with it; it will be refactored later.
3. Make vllm-ascend compatible with vLLM v0.10.1.1 and add CI for it.
4. Fix sampler API changes introduced by vllm-project/vllm#22387.
5. Fix Qwen3 MoE config changes introduced by vllm-project/vllm#20562.
6. Fix KV-cache block changes introduced by vllm-project/vllm#23262.

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with existing tests.

- vLLM version: v0.10.0
- vLLM main: vllm-project/vllm@0c6e40b

--------- Signed-off-by: MengqingCao <cmq0113@163.com>
Since our page size is 16, `new_block_ids` for decode requests is empty in 15 out of every 16 steps. However, we currently serialize the empty lists on every step. This PR removes this unnecessary overhead by making them `None` before serialization.

TODO: When `block_size` differs across groups, we should change `list[Optional[tuple[list[int], ...]]]` into `list[Optional[tuple[Optional[list[int]], ...]]]`.
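To illustrate why the `None` substitution helps, here is a rough size comparison using Python's stdlib `pickle`. vLLM uses its own msgpack-based serializer, so the absolute byte counts differ, but the direction is the same: `None` serializes to fewer bytes than a tuple of empty lists, and this saving applies on the 15 of 16 decode steps that allocate no new blocks.

```python
import pickle

# Illustrative comparison only: serialize an all-empty per-group
# block-ID structure vs. plain None. Exact byte counts depend on the
# serializer; the point is that None is strictly smaller.
empty_block_ids: tuple[list[int], ...] = ([],)  # one KV-cache group, no new blocks

empty_size = len(pickle.dumps(empty_block_ids))
none_size = len(pickle.dumps(None))

print(f"empty structure: {empty_size} bytes, None: {none_size} bytes")
```

With multiple KV-cache groups the empty structure grows while `None` stays constant, so the relative saving increases.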