[bugfix] fix the complex and potentially problematic generate_kv_idx. by pisceskkk · Pull Request #5957 · vllm-project/vllm-ascend

pisceskkk · 2026-01-16T09:10:00Z

What this PR does / why we need it?

In long-sequence scenarios, the chunked-prefill path can hit a dimension mismatch; this was observed during precision testing on the code_generate_lite dataset. This PR removes the redundant computation in generate_kv_idx and instead derives the KV indices from existing results with straightforward calculations.

vLLM version: v0.13.0
vLLM main: vllm-project/vllm@2c24bc6
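
To illustrate the general idea of deriving indices from values that are already available, here is a minimal sketch. It is not the code from this PR: the function names, the `seq_lens` input, and the back-to-back flat layout are all assumptions made for illustration. It computes the flat KV indices of one chunk from per-request lengths with a prefix sum, and checks the result against a length computed independently, which is the kind of check that surfaces a dimension mismatch early.

```python
from itertools import accumulate


def kv_indices_for_chunk(seq_lens, chunk_start, chunk_size):
    """Flat KV indices covered by one chunk window across all requests.

    Hypothetical helper: assumes requests are stored back to back in a
    flat buffer and `seq_lens[i]` is the number of tokens of request i.
    """
    # Prefix sums give each request's starting offset in the flat layout.
    offsets = [0] + list(accumulate(seq_lens))
    indices = []
    for req, length in enumerate(seq_lens):
        # Clamp the chunk window to the tokens this request actually has.
        lo = min(chunk_start, length)
        hi = min(chunk_start + chunk_size, length)
        indices.extend(range(offsets[req] + lo, offsets[req] + hi))
    return indices


def expected_chunk_len(seq_lens, chunk_start, chunk_size):
    """Length the index list must have, computed without building it."""
    return sum(
        min(chunk_start + chunk_size, n) - min(chunk_start, n) for n in seq_lens
    )


if __name__ == "__main__":
    lens = [5, 2, 7]
    idx = kv_indices_for_chunk(lens, chunk_start=4, chunk_size=4)
    # A mismatch here would indicate misaligned dimensions downstream.
    assert len(idx) == expected_chunk_len(lens, 4, 4)
    print(idx)
```

Because every index comes from the same prefix sums, the per-chunk lengths and the index list cannot drift apart, even when some requests are shorter than the chunk window.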

github-actions · 2026-01-16T09:10:14Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

gemini-code-assist

Code Review

This pull request refactors the logic for generating KV cache indices in chunked prefill scenarios by removing the complex generate_kv_idx function. The new approach derives the necessary indices directly from existing metadata, which simplifies the code and fixes a dimension misalignment issue. The changes look good and improve maintainability. I've identified one area where the new implementation can be made more efficient by avoiding a redundant argsort operation.
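
The review does not say which argsort is redundant. One common pattern, shown here only as an assumed illustration and not as the code under review, is calling argsort a second time to invert a permutation. The inverse can instead be filled in with a single scatter-style pass. The helper names below are hypothetical.

```python
def argsort(values):
    """Indices that would sort `values` in ascending order."""
    return sorted(range(len(values)), key=values.__getitem__)


def inverse_permutation(perm):
    """Invert a permutation in O(n) instead of running argsort again."""
    inv = [0] * len(perm)
    for position, source in enumerate(perm):
        # Element `source` ends up at `position`, so the inverse maps it back.
        inv[source] = position
    return inv


if __name__ == "__main__":
    perm = argsort([30, 10, 20])
    # The scatter-based inverse matches the argsort-of-argsort result.
    assert inverse_permutation(perm) == argsort(perm)
    print(perm, inverse_permutation(perm))
```

With tensors, the same idea is usually written as an indexed assignment of `arange(n)` into an empty buffer, which avoids a second sort.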

github-actions · 2026-01-19T01:02:24Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>

…atic generate_kv_idx. (#5955) ### What this PR does / why we need it? In long-sequence scenarios, the chunked-prefill component may encounter dimension misalignment issues, which previously occurred during precision testing on the code_generate_lite dataset. This PR removes redundant computations and instead derives the value using existing results and straightforward calculations. ref: #5957 Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>

…vllm-project#5957) ### What this PR does / why we need it? In long-sequence scenarios, the chunked-prefill component may encounter dimension misalignment issues, which previously occurred during precision testing on the code_generate_lite dataset. This PR removes redundant computations and instead derives the value using existing results and straightforward calculations. - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@2c24bc6 Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com> Signed-off-by: huangning1995 <huangning12@huawei.com>

…_kv_idx. (vllm-project#5957)" This reverts commit 3d916f4.

…to FIA_rebase * 'main' of https://github.com/vllm-project/vllm-ascend: [CI] Upgrade CANN to 8.5.0 (vllm-project#6070) Default enable MLAPO (vllm-project#5952) [Doc] Supplement PD separation parameters of DeepSeek V3.1 (vllm-project#6053) [Ascend] perf: optimize rope embedding with triton kernel for huge performance gain (vllm-project#5918) [Ops] update causal_conv1d_update (vllm-project#5984) [CI]Update triton ascend version in 3.2.0 (vllm-project#6067) [bugfix] fix the complex and potentially problematic generate_kv_idx. (vllm-project#5957)

…vllm-project#5957) ### What this PR does / why we need it? In long-sequence scenarios, the chunked-prefill component may encounter dimension misalignment issues, which previously occurred during precision testing on the code_generate_lite dataset. This PR removes redundant computations and instead derives the value using existing results and straightforward calculations. - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@2c24bc6 Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>

…atic generate_kv_idx. (vllm-project#5955) ### What this PR does / why we need it? In long-sequence scenarios, the chunked-prefill component may encounter dimension misalignment issues, which previously occurred during precision testing on the code_generate_lite dataset. This PR removes redundant computations and instead derives the value using existing results and straightforward calculations. ref: vllm-project#5957 Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>

…vllm-project#5957) ### What this PR does / why we need it? In long-sequence scenarios, the chunked-prefill component may encounter dimension misalignment issues, which previously occurred during precision testing on the code_generate_lite dataset. This PR removes redundant computations and instead derives the value using existing results and straightforward calculations. - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@2c24bc6 Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com> Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>

pisceskkk requested review from MengqingCao, wangxiyuan and weijinqian0 as code owners January 16, 2026 09:10

github-actions bot added the module:tests label Jan 16, 2026

gemini-code-assist bot reviewed Jan 16, 2026

Comment thread vllm_ascend/attention/context_parallel/attention_cp.py

weiguihua2 added the ready and ready-for-test labels Jan 16, 2026

github-actions bot added the merge-conflicts label Jan 19, 2026

pisceskkk force-pushed the fix_cp branch from c462b78 to f7f0a5e Compare January 19, 2026 01:20

github-actions bot removed the merge-conflicts label Jan 19, 2026

pisceskkk force-pushed the fix_cp branch 6 times, most recently from eb48feb to 871ab62 Compare January 21, 2026 01:19

[bugfix] fix the complex and potentially problematic generate_kv_idx.

871ab62

Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>

wangxiyuan approved these changes Jan 21, 2026

wangxiyuan merged commit 58ff465 into vllm-project:main Jan 21, 2026
20 checks passed

pisceskkk deleted the fix_cp branch January 21, 2026 06:32

pisceskkk mentioned this pull request Jan 21, 2026

[0.13.0][cherry-pick][bugfix] fix the complex and potentially problematic generate_kv_idx. #5955

Merged

huangfeifei1995 added a commit to huangfeifei1995/vllm-ascend that referenced this pull request Jan 21, 2026

Revert "[bugfix] fix the complex and potentially problematic generate…

dcd4a8f

…_kv_idx. (vllm-project#5957)" This reverts commit 3d916f4.

wangxiyuan merged 1 commit into vllm-project:main from pisceskkk:fix_cp
