[KVCache][ModelRunner] Use vllm InputBatch and Blocktable#5182

Closed

MengqingCao wants to merge 2 commits into vllm-project:main from MengqingCao:rm_block_table

Collaborator

MengqingCao commented Dec 19, 2025 •

edited

What this PR does / why we need it?

This PR switches vLLM Ascend to the InputBatch and BlockTable classes from vLLM. We only implement get_supported_kernel_block_sizes in our attention backend and compute block_size and num_block from kernel_block_sizes.
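The block_size and num_block calculation mentioned above can be illustrated with a minimal sketch. It assumes the KV cache manager allocates large blocks and the attention kernel only accepts the sizes the backend reports. The helper names (select_kernel_block_size, split_for_kernel) and the "largest evenly dividing size" policy are hypothetical, chosen for illustration; they are not taken from this PR or from vLLM.

```python
def select_kernel_block_size(kv_manager_block_size: int,
                             supported_sizes: list[int]) -> int:
    """Pick a kernel block size that evenly divides the manager block size.

    Assumption: prefer the largest supported size that divides evenly.
    """
    candidates = [s for s in supported_sizes if kv_manager_block_size % s == 0]
    if not candidates:
        raise ValueError(
            f"no supported kernel block size divides {kv_manager_block_size}")
    return max(candidates)


def split_for_kernel(num_blocks: int, kv_manager_block_size: int,
                     kernel_block_size: int) -> tuple[int, int]:
    """Return (num_block, block_size) as seen by the attention kernel.

    Each manager block is viewed as several smaller kernel blocks, so the
    total token capacity (num_block * block_size) stays the same.
    """
    blocks_per_kv_block = kv_manager_block_size // kernel_block_size
    return num_blocks * blocks_per_kv_block, kernel_block_size


# Example: the backend reports [128]; the manager uses 512-token blocks.
kernel_size = select_kernel_block_size(512, [128])
num_block, block_size = split_for_kernel(100, 512, kernel_size)
```

In this example 100 manager blocks of 512 tokens become 400 kernel blocks of 128 tokens, so the capacity of 51,200 tokens is unchanged.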

How was this patch tested?

Tested by the existing CI.

vLLM version: v0.12.0
vLLM main: vllm-project/vllm@ad32e3e

Contributor

github-actions Bot commented Dec 19, 2025

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by other future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

gemini-code-assist Bot reviewed

Contributor

gemini-code-assist Bot left a comment

Code Review

This pull request is a significant refactoring that removes the custom BlockTable and NPUInputBatch implementations from vllm_ascend. Instead, it now utilizes the InputBatch from the core vLLM library. This change simplifies the codebase by removing duplicated and specialized logic for block and slot management, aligning it more closely with the upstream vLLM implementation. The concept of "hybrid blocks" has been removed and replaced with a more generic mechanism for handling kernel-specific block sizes. The changes are consistently applied across the affected files, including updates to method signatures and call sites. Overall, this is a positive change that improves maintainability and reduces custom code. I have reviewed the changes and found no issues.
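To make the idea of kernel-specific block sizes concrete, here is a minimal sketch of how a block table might translate manager block IDs into kernel block IDs and token slots. The function names and the contiguous-sub-block layout are assumptions for illustration only; they are not the code in this PR and may differ from the upstream BlockTable.

```python
def to_kernel_block_ids(kv_block_ids: list[int],
                        blocks_per_kv_block: int) -> list[int]:
    """Expand each manager block ID into its consecutive kernel block IDs.

    Assumption: manager block b covers kernel blocks
    [b * blocks_per_kv_block, (b + 1) * blocks_per_kv_block).
    """
    return [
        block_id * blocks_per_kv_block + offset
        for block_id in kv_block_ids
        for offset in range(blocks_per_kv_block)
    ]


def slot_for_position(position: int, kernel_block_ids: list[int],
                      kernel_block_size: int) -> int:
    """Map a token position within a request to a physical cache slot."""
    kernel_block = kernel_block_ids[position // kernel_block_size]
    return kernel_block * kernel_block_size + position % kernel_block_size


# A request owning manager blocks 0 and 2, with 512-token manager blocks
# split into four 128-token kernel blocks each.
kernel_ids = to_kernel_block_ids([0, 2], blocks_per_kv_block=4)
slot = slot_for_position(520, kernel_ids, kernel_block_size=128)
```

Under these assumptions the slot computed through kernel blocks matches the one computed directly from manager blocks (block 2, offset 8), which is why the two views of the cache can coexist.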

MengqingCao force-pushed the rm_block_table branch from 57ea030 to aecd2ab Compare

December 19, 2025 01:28

MengqingCao changed the title ~~Rm block table~~ [KVCache][ModelRunner] Use vllm InputBatch and Blocktable

MengqingCao commented

vllm_ascend/attention/attention_v1.py Outdated

github-actions Bot added the merge-conflicts label

Contributor

github-actions Bot commented Dec 19, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

MengqingCao commented

vllm_ascend/attention/attention_v1.py

     @staticmethod
-    def get_supported_block_size() -> list[int]:
+    def get_supported_kernel_block_sizes() -> list[int]:
         return [128]

Collaborator Author

MengqingCao Dec 19, 2025

paged attention branch

MengqingCao force-pushed the rm_block_table branch from 44b2321 to 07858c8 Compare

December 19, 2025 12:28

github-actions Bot removed the merge-conflicts label

Contributor

github-actions Bot commented Dec 20, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

github-actions Bot added the merge-conflicts label

MengqingCao force-pushed the rm_block_table branch from 07858c8 to f4b66ff Compare

December 20, 2025 05:26

github-actions Bot removed the merge-conflicts label

MengqingCao added ready ready-for-test labels

github-actions Bot added the merge-conflicts label

Contributor

github-actions Bot commented Dec 22, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

MengqingCao force-pushed the rm_block_table branch from 8169764 to bd0e19c Compare

December 25, 2025 07:49

github-actions Bot added merge-conflicts and removed merge-conflicts labels

Contributor

github-actions Bot commented Dec 30, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

MengqingCao mentioned this pull request

[UT][PCP&DCP] UT for block_table.py #5032

Merged

MengqingCao force-pushed the rm_block_table branch from bd0e19c to b8ed30b Compare

January 6, 2026 03:25


          [KVCache] Use vllm InputBatch and Blocktable

c6c8fd7

Signed-off-by: MengqingCao <cmq0113@163.com>

MengqingCao force-pushed the rm_block_table branch from b8ed30b to c6c8fd7 Compare

January 6, 2026 03:56

github-actions Bot removed the merge-conflicts label


          lint

644624f

Signed-off-by: MengqingCao <cmq0113@163.com>

github-actions Bot added the merge-conflicts label

Contributor

github-actions Bot commented Jan 7, 2026

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Collaborator

wangxiyuan commented Apr 10, 2026

No updates for a long time, so closing this now. Feel free to reopen if it's still needed.

wangxiyuan closed this

Labels

merge-conflicts ready ready-for-test