[V0 Deprecation] Refactor kv cache from list to element by yewentao256 · Pull Request #37487 · vllm-project/vllm

yewentao256 · 2026-03-18T22:19:58Z

Purpose

A follow-up to #37195, which removed the virtual engine. This PR further refactors the kv cache from a single-element list to the element itself, to clean up the code.

Tests in CI

Signed-off-by: yewentao256 <zhyanwentao@126.com>

gemini-code-assist

Code Review

This pull request refactors the kv_cache by removing the outer list wrapper, simplifying its structure from a list of one element to just the element itself (a tensor or a tuple of tensors). This change is consistently applied across various components, including attention layers, mamba-based layers, and their corresponding test files. The modifications simplify code by removing unnecessary [0] indexing when accessing the kv_cache. The change in _cleanup_profiling_kv_cache is a good addition that makes the cleanup logic more robust to the different types of kv_cache. The refactoring appears to be correct and improves code clarity.
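As a rough illustration of the shape change described in the review, the sketch below shows a layer whose `kv_cache` attribute is the element itself (one tensor, or a tuple of tensors) instead of a one-element list. It is not code from the PR. `FakeTensor`, `Layer`, `iter_kv_tensors`, and the state names are hypothetical stand-ins so the example runs without PyTorch.

```python
from dataclasses import dataclass
from typing import Iterator, Union


@dataclass
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor so the sketch runs without PyTorch."""
    name: str


# After the refactor, kv_cache is either one tensor or a tuple of tensors.
KVCache = Union[FakeTensor, tuple[FakeTensor, ...]]


class Layer:
    """Hypothetical layer; only the kv_cache attribute matters here."""

    def __init__(self, kv_cache: KVCache) -> None:
        # New style: store the element directly, not [element].
        self.kv_cache = kv_cache


def iter_kv_tensors(kv_cache: KVCache) -> Iterator[FakeTensor]:
    """Yield every tensor, whether kv_cache is a single tensor or a tuple."""
    if isinstance(kv_cache, tuple):
        yield from kv_cache
    else:
        yield kv_cache


# Attention-style layer: a single tensor.
attn = Layer(FakeTensor("attn_kv"))
# Mamba-style layer: a tuple of state tensors (names are illustrative).
mamba = Layer((FakeTensor("conv_state"), FakeTensor("ssm_state")))

# Old style needed [0] to reach the same element.
old_style = [FakeTensor("attn_kv")]
assert old_style[0] == attn.kv_cache

names = [t.name for layer in (attn, mamba) for t in iter_kv_tensors(layer.kv_cache)]
print(names)
```

Any cleanup or inspection code that previously assumed a list now has to handle both shapes, which is the kind of type-aware handling the review attributes to `_cleanup_profiling_kv_cache`.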

hmellor · 2026-03-19T10:37:01Z

The NIXL failure seems like it might be relevant?

hmellor · 2026-03-19T14:49:28Z

vllm/distributed/kv_transfer/kv_connector/v1/example_connector.py

@@ -185,15 +185,13 @@ def inject_kv_into_layer(
                if kv_cache_attr is None:
                    continue

-                kv_cache_layer = kv_cache_attr[0]
-
                filename = self._generate_filename_debug(
                    layer_name, request.token_ids, request.mm_hashes
                )
                kv_cache = safetensors.torch.load_file(filename)["kv_cache"].cuda()
                if isinstance(attn_metadata, dict):
                    inject_kv_into_layer(
-                        kv_cache_layer,
+                        kv_cache_attr,


Could we rename this to kv_cache_layer?

Done, thanks! I also fixed the previous CI issue.
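A minimal sketch of what the requested rename might look like in a loop of this shape: the attribute is read straight into `kv_cache_layer`, with no `[0]` step in between. This is an assumption-laden illustration, not the connector's actual code. `collect_kv_cache_layers` and the layer names are hypothetical, and reading the attribute with `getattr(layer, "kv_cache", None)` is assumed, not taken from the diff.

```python
from types import SimpleNamespace
from typing import Any


def collect_kv_cache_layers(layers: dict[str, Any]) -> dict[str, Any]:
    """Map layer name to its kv cache, skipping layers that have none."""
    found: dict[str, Any] = {}
    for layer_name, layer in layers.items():
        # Assumed lookup; the attribute now holds the cache element directly.
        kv_cache_layer = getattr(layer, "kv_cache", None)
        if kv_cache_layer is None:
            continue
        # Before the refactor this would have been kv_cache_attr[0].
        found[layer_name] = kv_cache_layer
    return found


# Hypothetical layers: one with a kv cache, one without.
layers = {
    "layers.0.attn": SimpleNamespace(kv_cache="kv-tensor-0"),
    "layers.0.mlp": SimpleNamespace(),
}
result = collect_kv_cache_layers(layers)
print(result)
```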

…#37487) Signed-off-by: yewentao256 <zhyanwentao@126.com> Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
refactor kv cache from list to element

2feba2f

Signed-off-by: yewentao256 <zhyanwentao@126.com>

yewentao256 requested review from ApostaC, LucasWilkinson, MatthewBonanni, NickLucche, njhill, orozery, sighingnow, tdoublep and tjtanaa as code owners March 18, 2026 22:19

yewentao256 added the ready label (ONLY add when PR is ready to merge/full CI is needed) Mar 18, 2026

mergify bot added the deepseek, qwen, rocm, v1, and kv-connector labels Mar 18, 2026

github-project-automation bot added this to AMD Mar 18, 2026

github-project-automation bot moved this to Todo in AMD Mar 18, 2026

gemini-code-assist bot reviewed Mar 18, 2026

fix CI

b8cc404

Signed-off-by: yewentao256 <zhyanwentao@126.com>

hmellor reviewed Mar 19, 2026

rename

989c2d3

Signed-off-by: yewentao256 <zhyanwentao@126.com>

hmellor approved these changes Mar 19, 2026

yewentao256 added 2 commits March 19, 2026 14:59

Merge branch 'main' into wentao-kv_cache-no-list

6467da1

Merge branch 'main' into wentao-kv_cache-no-list

e597daa

yewentao256 enabled auto-merge (squash) March 20, 2026 22:09

Merge branch 'main' into wentao-kv_cache-no-list

8e7fa71

vllm-bot merged commit c59a132 into main Mar 24, 2026
81 of 84 checks passed

vllm-bot deleted the wentao-kv_cache-no-list branch March 24, 2026 03:10

github-project-automation bot moved this from Todo to Done in AMD Mar 24, 2026

RhizoNymph pushed a commit to RhizoNymph/vllm that referenced this pull request Mar 26, 2026

[V0 Deprecation] Refactor kv cache from list to element (vllm-project…

87280ae

…#37487) Signed-off-by: yewentao256 <zhyanwentao@126.com>

HenryTangDev pushed a commit to HenryTangMain/vllm that referenced this pull request Mar 27, 2026

[V0 Deprecation] Refactor kv cache from list to element (vllm-project…

60d257b

…#37487) Signed-off-by: yewentao256 <zhyanwentao@126.com>

khairulkabir1661 pushed a commit to khairulkabir1661/vllm that referenced this pull request Mar 27, 2026

[V0 Deprecation] Refactor kv cache from list to element (vllm-project…

f8b7f26

…#37487) Signed-off-by: yewentao256 <zhyanwentao@126.com>

nithinvc pushed a commit to nithinvc/vllm that referenced this pull request Mar 27, 2026

[V0 Deprecation] Refactor kv cache from list to element (vllm-project…

42d4435

…#37487) Signed-off-by: yewentao256 <zhyanwentao@126.com> Signed-off-by: Nithin Chalapathi <nithin.ch10@gmail.com>

JiantaoXu pushed a commit to JiantaoXu/vllm that referenced this pull request Mar 28, 2026

[V0 Deprecation] Refactor kv cache from list to element (vllm-project…

65bc19f

…#37487) Signed-off-by: yewentao256 <zhyanwentao@126.com>

vrdn-23 pushed a commit to vrdn-23/vllm that referenced this pull request Mar 30, 2026

[V0 Deprecation] Refactor kv cache from list to element (vllm-project…

3caccb3

…#37487) Signed-off-by: yewentao256 <zhyanwentao@126.com> Signed-off-by: Vinay Damodaran <vrdn@hey.com>

EricccYang pushed a commit to EricccYang/vllm that referenced this pull request Apr 1, 2026

[V0 Deprecation] Refactor kv cache from list to element (vllm-project…

97ff904

…#37487) Signed-off-by: yewentao256 <zhyanwentao@126.com> Signed-off-by: EricccYang <yangyang4991@gmail.com>

vllm-bot merged 6 commits into main from wentao-kv_cache-no-list
