
Enable support for prefill side kv_layout and block_size update #867

Merged
xuechendi merged 4 commits into vllm-project:main from yeonsily:dev/prefill_kv_layout
Jan 28, 2026

Conversation

@yeonsily
Contributor

  1. update example to support prefill HND and agreed_block_size
  2. enable prefill side kv_layout and block_size update

Port vllm-project/vllm#30448 to vllm-gaudi
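
To make the two items above concrete, here is a minimal sketch of what a KV layout change and a block size change involve. It assumes the usual meaning of the layout names, where NHD orders a block as [token][head][dim] and HND as [head][token][dim], and it assumes the larger block size is an exact multiple of the agreed one. The helper names `nhd_to_hnd` and `split_block_id` are hypothetical and are not taken from vllm-gaudi or the upstream connector.

```python
# Hypothetical helpers for illustration only; not the vllm-gaudi implementation.


def nhd_to_hnd(block):
    """Transpose one KV block from [token][head][dim] to [head][token][dim]."""
    # zip(*block) groups the per-token entries by head index.
    return [list(per_head) for per_head in zip(*block)]


def split_block_id(block_id, local_block_size, agreed_block_size):
    """Return the IDs of the smaller blocks that cover one local block.

    Assumes local_block_size is an exact multiple of agreed_block_size.
    """
    if local_block_size % agreed_block_size != 0:
        raise ValueError("local_block_size must be a multiple of agreed_block_size")
    ratio = local_block_size // agreed_block_size
    start = block_id * ratio
    return list(range(start, start + ratio))


# A tiny block: 2 tokens, 3 heads, head_dim of 1.
nhd = [[[0], [1], [2]], [[10], [11], [12]]]
hnd = nhd_to_hnd(nhd)

# One 128-token block expressed as 16-token blocks (sizes chosen for illustration).
small_ids = split_block_id(3, 128, 16)
```

With these example sizes, each 128-token block corresponds to eight consecutive 16-token blocks, which is the kind of remapping that two sides with different block sizes would have to agree on.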

enable support for prefill side kv_layout and block_size update
1. update example to support prefill HND and agreed_block_size
2. enable prefill side kv_layout and block_size update

Signed-off-by: Chendi Xue <chendi.xue@intel.com>
Signed-off-by: Yeonsil Yoon <yeon.sil.yoon@intel.com>
@xuechendi
Collaborator

@yeonsily, I would suggest creating a new nixl_connector file only for the Gaudi-to-CUDA scenario instead of overriding the current hpu_nixl_connector.py.
The reason is that upstream changes too fast; overriding so many functions will break in no time.

Signed-off-by: Yeonsil Yoon <yeon.sil.yoon@intel.com>
@github-actions

🚧 CI Blocked

The main CI workflow was not started for the following reason:

Your branch is behind the base branch. Please merge or rebase to get the latest changes.

@xuechendi
Collaborator

Did you miss the switch that checks the flag and selects which nixl_connector.py to import?

@yeonsily
Contributor Author

@xuechendi The changes in hpu_nixl_connector are common regardless of whether the run is heterogeneous. The ones in hetero_hpu_nixl_connector are extra, for hetero only.

@xuechendi
Collaborator

@xuechendi The changes in hpu_nixl_connector are common regardless of whether the run is heterogeneous. The ones in hetero_hpu_nixl_connector are extra, for hetero only.

I see how VLLM_HPU_HETERO_KV_LAYOUT is used. I was thinking of simply using VLLM_HPU_HETERO_KV_LAYOUT to decide which file to import, but your approach also works.
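
For readers following the discussion, the alternative the reviewer describes (one flag deciding which connector file to import) could look roughly like the sketch below. This illustrates the reviewer's idea, not the approach that was merged. The function name `select_connector_module` is hypothetical, the module names are used as bare strings because their package paths are not given in this thread, and the rule that any value other than empty, `0`, or `false` enables the hetero path is an assumption.

```python
import os


def select_connector_module(env=None):
    """Pick a connector module name based on VLLM_HPU_HETERO_KV_LAYOUT.

    Assumption: the flag counts as enabled for any value other than
    "", "0", or "false". The real parsing rule is not shown in this thread.
    """
    env = os.environ if env is None else env
    flag = env.get("VLLM_HPU_HETERO_KV_LAYOUT", "").strip().lower()
    enabled = flag not in ("", "0", "false")
    # Module names come from the discussion; their package paths are unknown here.
    return "hetero_hpu_nixl_connector" if enabled else "hpu_nixl_connector"


default_choice = select_connector_module({})
hetero_choice = select_connector_module({"VLLM_HPU_HETERO_KV_LAYOUT": "1"})
```

A caller could pass the returned name to `importlib.import_module` once the full package path is known. Keeping the decision in one small function makes the switch easy to test without touching the process environment.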

@yeonsily
Contributor Author

@xuechendi @michalkuligowski Thank you for your review! I see CI has failed, but I don't think the failure comes from my change, since it is not triggered without the flag. It seems CI is broken right now. Is there any way to re-trigger CI without committing a change?

yeonsily closed this Jan 27, 2026
yeonsily reopened this Jan 27, 2026
Signed-off-by: Yeonsil Yoon <yeon.sil.yoon@intel.com>
@github-actions

✅ CI Passed

All checks passed successfully against the following vllm commit:
6218034dd7f9a56596e4fd8c8c8fc1d8011ed9c2

xuechendi merged commit d54f4c2 into vllm-project:main Jan 28, 2026
53 checks passed
testdig pushed a commit to testdig/vllm-gaudi-fork that referenced this pull request Jan 29, 2026
…-project#867)

1. update example to support prefill HND and agreed_block_size
2. enable prefill side kv_layout and block_size update

Port vllm-project/vllm#30448 to vllm-gaudi

---------

Signed-off-by: Chendi Xue <chendi.xue@intel.com>
Signed-off-by: Yeonsil Yoon <yeon.sil.yoon@intel.com>
Signed-off-by: Wang, Zheng W <zheng.w.wang@intel.com>
yiliu30 pushed a commit to yiliu30/vllm-gaudi that referenced this pull request Feb 4, 2026
…-project#867)

1. update example to support prefill HND and agreed_block_size
2. enable prefill side kv_layout and block_size update

Port vllm-project/vllm#30448 to vllm-gaudi

---------

Signed-off-by: Chendi Xue <chendi.xue@intel.com>
Signed-off-by: Yeonsil Yoon <yeon.sil.yoon@intel.com>
slokesha pushed a commit to libinta/vllm-gaudi that referenced this pull request Feb 9, 2026
…-project#867)

1. update example to support prefill HND and agreed_block_size
2. enable prefill side kv_layout and block_size update

Port vllm-project/vllm#30448 to vllm-gaudi

---------

Signed-off-by: Chendi Xue <chendi.xue@intel.com>
Signed-off-by: Yeonsil Yoon <yeon.sil.yoon@intel.com>
Signed-off-by: slokesha <slokeshappa@habana.ai>
adobrzyn pushed a commit that referenced this pull request Mar 31, 2026
1. update example to support prefill HND and agreed_block_size
2. enable prefill side kv_layout and block_size update

Port vllm-project/vllm#30448 to vllm-gaudi

---------

Signed-off-by: Chendi Xue <chendi.xue@intel.com>
Signed-off-by: Yeonsil Yoon <yeon.sil.yoon@intel.com>
3 participants