[XPU] Set consistent default KV cache layout #24745

NickLucche · 2025-09-12T13:51:49Z

Small quality-of-life change to keep the way we interact with the KV cache layout consistent, as I explain a bit in #22735.
cc @zhenwei-intel to keep me honest on the XPU-related change. PS: that limit will prevent XPU from being compatible with heteroTP for now.

cc @LucasWilkinson as this is another kv layout constraint which is good to keep in mind.

Signed-off-by: NickLucche <[email protected]>

gemini-code-assist

Code Review

This pull request refactors the handling of KV cache layout settings to be more consistent, especially for XPU which requires the 'NHD' layout. It introduces a centralized set_kv_cache_layout function, which is an improvement over direct environment variable manipulation. My review includes a suggestion to make the validation of user-provided layouts more robust. A significant part of this PR is the addition of the vllm/entrypoints/cli/coordinate.py file, which appears to be a refactoring of the server startup logic, though it's not mentioned in the PR description.
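For readers following along, here is a minimal sketch of what a centralized setter with validation could look like. It is not the code from this PR: only `set_kv_cache_layout`, `KVCacheLayoutType` and `_KV_CACHE_LAYOUT_OVERRIDE` are named in this thread, while `get_kv_cache_layout`, the environment variable name, the `"NHD"` fallback and the override-first precedence are assumptions made for illustration.

```python
import os
from typing import Literal, Optional, get_args

KVCacheLayoutType = Literal["NHD", "HND"]

# Process-wide override; None means no override has been set.
_KV_CACHE_LAYOUT_OVERRIDE: Optional[KVCacheLayoutType] = None

# Assumed names for this sketch only.
_ENV_VAR = "VLLM_KV_CACHE_LAYOUT"
_DEFAULT_LAYOUT: KVCacheLayoutType = "NHD"


def _validate(layout: str) -> KVCacheLayoutType:
    """Reject anything that is not one of the declared literals."""
    allowed = get_args(KVCacheLayoutType)
    if layout not in allowed:
        raise ValueError(
            f"Unsupported KV cache layout {layout!r}; expected one of {allowed}"
        )
    return layout  # type: ignore[return-value]


def set_kv_cache_layout(layout: str) -> None:
    """Set the override in one place instead of mutating os.environ."""
    global _KV_CACHE_LAYOUT_OVERRIDE
    _KV_CACHE_LAYOUT_OVERRIDE = _validate(layout)


def get_kv_cache_layout() -> KVCacheLayoutType:
    """Resolve the layout: override first, then env var, then default."""
    if _KV_CACHE_LAYOUT_OVERRIDE is not None:
        return _KV_CACHE_LAYOUT_OVERRIDE
    env_value = os.environ.get(_ENV_VAR)
    if env_value is not None:
        return _validate(env_value)
    return _DEFAULT_LAYOUT
```

Under these assumptions, a platform that only supports one layout (XPU with 'NHD', per the summary above) would call `set_kv_cache_layout("NHD")` once at startup, and every later `get_kv_cache_layout()` call would agree with it regardless of the environment.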

vllm/v1/attention/backends/utils.py


NickLucche · 2025-09-12T14:00:16Z

vllm/v1/attention/backends/utils.py

+KVCacheLayoutType = Literal["NHD", "HND"]
+_KV_CACHE_LAYOUT_OVERRIDE: KVCacheLayoutType | None = None


@LucasWilkinson any other meaningful layout we want to allow here?
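As background on the two literals, a small sketch of how they are commonly read: N is the token position within a block, H the KV head, D the head dimension, so the two layouts differ only in which of N and H is the outer axis. This follows the naming convention as I understand it from attention-kernel libraries; the helper names `kv_block_shape` and `flat_offset` are hypothetical and do not appear in this PR.

```python
from typing import Literal, Tuple

KVCacheLayoutType = Literal["NHD", "HND"]


def kv_block_shape(
    layout: str, block_size: int, num_kv_heads: int, head_size: int
) -> Tuple[int, int, int]:
    """Shape of one cache block (K or V) under the given layout."""
    if layout == "NHD":
        return (block_size, num_kv_heads, head_size)
    if layout == "HND":
        return (num_kv_heads, block_size, head_size)
    raise ValueError(f"Unknown layout: {layout!r}")


def flat_offset(
    layout: str,
    token: int,
    head: int,
    dim: int,
    block_size: int,
    num_kv_heads: int,
    head_size: int,
) -> int:
    """Row-major offset of element (token, head, dim) inside one block."""
    if layout == "NHD":
        # Tokens are outermost: all heads of a token sit next to each other.
        return (token * num_kv_heads + head) * head_size + dim
    if layout == "HND":
        # Heads are outermost: all tokens of a head sit next to each other.
        return (head * block_size + token) * head_size + dim
    raise ValueError(f"Unknown layout: {layout!r}")
```

With `block_size=16`, `num_kv_heads=8`, `head_size=128`, moving to the next token costs a stride of 1024 elements in NHD but 128 in HND, which is why a backend or transfer path written for one layout cannot silently consume the other.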

NickLucche · 2025-09-12T14:01:00Z

vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py


-# Supported xPUs and types of kv transfer buffer.
-# {xPU: tuple of supported kv buffer types}
-_NIXL_SUPPORTED_XPUS = {


I think xPU was a great name for a generic PU, but since Intel already claimed it this is just confusing now >.<

zhenwei-intel · 2025-09-12T14:12:15Z

LGTM

vllm/v1/attention/backends/utils.py


jikunshang

thanks for fixing. LGTM!


xpu layout enforce

de0094f

Signed-off-by: NickLucche <[email protected]>

NickLucche requested review from WoosukKwon, aarnphm, alexm-redhat, chaunceyjiang, comaniac, jikunshang, njhill, robertgshaw2-redhat and ywang96 as code owners September 12, 2025 13:51

mergify bot added the frontend and v1 labels Sep 12, 2025

cruft

1370c80

Signed-off-by: NickLucche <[email protected]>

gemini-code-assist bot reviewed Sep 12, 2025


vllm/v1/attention/backends/utils.py

names and exception

9e55972

Signed-off-by: NickLucche <[email protected]>

NickLucche commented Sep 12, 2025


mgoin reviewed Sep 12, 2025


vllm/v1/attention/backends/utils.py

union

c54ae75

Signed-off-by: NickLucche <[email protected]>

jikunshang approved these changes Sep 15, 2025


jikunshang added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Sep 15, 2025

jikunshang merged commit 2e41f5a into vllm-project:main Sep 15, 2025
58 checks passed

xuechendi mentioned this pull request Sep 15, 2025

[CI FIX]Fix issue introduced by upstream PR #23974 vllm-project/vllm-gaudi#172

Merged

jikunshang mentioned this pull request Sep 16, 2025

[XPU] Fix circular import error. #24927

Merged

5 tasks

FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025

[XPU] Set consistent default KV cache layout (vllm-project#24745)

6c05536

Signed-off-by: NickLucche <[email protected]>

xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025

[XPU] Set consistent default KV cache layout (vllm-project#24745)

3e83c46

Signed-off-by: NickLucche <[email protected]> Signed-off-by: xuebwang-amd <[email protected]>

choprahetarth pushed a commit to Tandemn-Labs/vllm that referenced this pull request Oct 11, 2025

[XPU] Set consistent default KV cache layout (vllm-project#24745)

2205f3f

Signed-off-by: NickLucche <[email protected]>

xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025

[XPU] Set consistent default KV cache layout (vllm-project#24745)

1ca4a31

Signed-off-by: NickLucche <[email protected]> Signed-off-by: xuebwang-amd <[email protected]>

