[Core] Introduce popleft_n and append_n in FreeKVCacheBlockQueue to further optimize block_pool #21222
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only a limited subset of checks runs automatically. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
Resolves #21141
Code Review
This pull request introduces popleft_n and append_n methods to FreeKVCacheBlockQueue for bulk operations, optimizing get_new_blocks and free_blocks in BlockPool. Benchmark results show significant improvements. To enhance robustness, I've suggested materializing the ordered_blocks iterable to a list in free_blocks to prevent potential OOM errors.
houseroad left a comment:
Looks good to me. Impressive results, and two nits to consider addressing.
njhill left a comment:
Thanks @Jialin!
Purpose
Most of the block_pool operations are on the critical path. In this PR, we focus on further optimizing two of them: get_new_blocks and free_blocks.
Bulk popleft instead of popleft n times
Originally, block_pool.get_new_blocks popped blocks one at a time. Each single popleft re-links the next block to the fake head, which is wasted work when that block is going to be popped right afterwards.
Since we know the total number of blocks to pop ahead of time, we can introduce popleft_n for bulk popleft. Overall, popleft_n performs roughly half as many linked-list operations as calling popleft n times.
Bulk append instead of append n times
Similarly, block_pool.free_blocks invoked append one block at a time. Introducing a bulk append_n likewise cuts the number of linked-list operations roughly in half (see the sketch below).
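To make the idea concrete, here is a minimal Python sketch of a doubly linked free-block queue with fake head/tail sentinels and bulk popleft_n/append_n operations. This is an illustrative assumption, not vLLM's actual FreeKVCacheBlockQueue code: the Block and FreeBlockQueue names, fields, and method signatures are made up for this example.

```python
# Sketch only: a doubly linked free-block queue with sentinel (fake) head/tail
# nodes, showing why bulk popleft_n/append_n need roughly half the link
# updates of n single popleft/append calls.

class Block:
    def __init__(self, block_id: int) -> None:
        self.block_id = block_id
        self.prev: "Block | None" = None
        self.next: "Block | None" = None


class FreeBlockQueue:
    def __init__(self, blocks: list[Block]) -> None:
        self.fake_head = Block(-1)
        self.fake_tail = Block(-1)
        prev = self.fake_head
        for b in blocks:
            prev.next, b.prev = b, prev
            prev = b
        prev.next, self.fake_tail.prev = self.fake_tail, prev
        self.num_free = len(blocks)

    def popleft_n(self, n: int) -> list[Block]:
        """Detach the first n blocks with a single boundary fix-up at the
        fake head, instead of re-linking the fake head after every pop."""
        assert n <= self.num_free
        popped: list[Block] = []
        node = self.fake_head.next
        for _ in range(n):
            popped.append(node)
            nxt = node.next
            node.prev = node.next = None  # detach the popped block
            node = nxt
        # One boundary fix-up instead of n of them.
        self.fake_head.next, node.prev = node, self.fake_head
        self.num_free -= n
        return popped

    def append_n(self, blocks: list[Block]) -> None:
        """Chain the returned blocks together, then splice the whole chain
        in front of the fake tail with a single boundary fix-up."""
        if not blocks:
            return
        prev = self.fake_tail.prev
        for b in blocks:
            prev.next, b.prev = b, prev
            prev = b
        prev.next, self.fake_tail.prev = self.fake_tail, prev
        self.num_free += len(blocks)


if __name__ == "__main__":
    queue = FreeBlockQueue([Block(i) for i in range(4)])
    taken = queue.popleft_n(2)                           # blocks 0 and 1
    print([b.block_id for b in taken], queue.num_free)   # [0, 1] 2
    queue.append_n(taken)                                # return them to the free queue
    print(queue.num_free)                                # 4
```

Calling popleft n times updates the fake-head links on every pop; popleft_n clears each popped block's links once and fixes the boundary a single time, which is where the roughly 2x reduction in linked-list operations comes from. append_n benefits in the same way at the tail.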
Test Plan
Test Result
benchmark scripts
After
[benchmark script output screenshot]
Before
[benchmark script output screenshot]
benchmark_blockpool
As expected, get_new_blocks and free_blocks times are cut in half.
After
[benchmark results screenshot]
Before
(Optional) Documentation Update