
[bugfix] fix the complex and potentially problematic generate_kv_idx. #5957

Merged

wangxiyuan merged 1 commit into vllm-project:main from pisceskkk:fix_cp
Jan 21, 2026

Conversation

@pisceskkk (Contributor) commented Jan 16, 2026

What this PR does / why we need it?

In long-sequence scenarios, the chunked-prefill component may encounter dimension misalignment issues, which previously occurred during precision testing on the code_generate_lite dataset. This PR removes redundant computations and instead derives the value using existing results and straightforward calculations.
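To illustrate the kind of simplification the description refers to, here is a minimal, hypothetical sketch (the helper and the names `computed_tokens` and `chunk_lens` are illustrative, not vllm-ascend's actual API): in chunked prefill, the KV cache positions a chunk writes can be derived directly from per-request counters the scheduler already tracks, rather than recomputed by a separate index-generation routine.

```python
# Hypothetical sketch: derive per-request KV indices for one chunked-prefill
# step from metadata that already exists, instead of regenerating them.
# Names are illustrative and do not match vllm-ascend's real code.

def kv_idx_for_chunk(computed_tokens, chunk_lens):
    """For each request, the KV positions written by this chunk are the
    contiguous range [already_computed, already_computed + chunk_len)."""
    kv_idx = []
    for start, n in zip(computed_tokens, chunk_lens):
        kv_idx.extend(range(start, start + n))
    return kv_idx

# Two requests: one with 4 tokens already cached taking a 3-token chunk,
# and a fresh request taking a 2-token chunk.
print(kv_idx_for_chunk([4, 0], [3, 2]))  # [4, 5, 6, 0, 1]
```

Because the ranges are built from the same counters the scheduler uses for everything else, the lengths stay aligned by construction, which is the property a separate index-generation path can fail to guarantee.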

@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write a commit message that fulfills the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

@gemini-code-assist bot left a comment


Code Review

This pull request refactors the logic for generating KV cache indices in chunked prefill scenarios by removing the complex generate_kv_idx function. The new approach derives the necessary indices directly from existing metadata, which simplifies the code and fixes a dimension misalignment issue. The changes look good and improve maintainability. I've identified one area where the new implementation can be made more efficient by avoiding a redundant argsort operation.
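The reviewer's remark about a redundant argsort generalizes to a common pattern: when a permutation is already in hand, its inverse can be built with a single O(n) scatter instead of calling argsort on it a second time, which is O(n log n). A minimal sketch in plain Python (this is not the actual attention_cp.py code):

```python
def inverse_permutation(perm):
    """Invert a permutation by scattering each position to its target,
    in O(n), rather than argsort-ing the permutation again."""
    inv = [0] * len(perm)
    for pos, p in enumerate(perm):
        inv[p] = pos  # element at `pos` came from original index `p`
    return inv

perm = [2, 0, 3, 1]
inv = inverse_permutation(perm)
print(inv)  # [1, 3, 0, 2]
# Applying perm then inv restores the identity order.
print([perm[i] for i in inv])  # [0, 1, 2, 3]
```

In tensor code the same idea is one scatter (e.g. writing `arange(n)` into positions given by the permutation), so the second sort disappears entirely.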

Comment thread vllm_ascend/attention/context_parallel/attention_cp.py
@weiguihua2 added the ready and ready-for-test labels Jan 16, 2026
@github-actions

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>
@wangxiyuan wangxiyuan merged commit 58ff465 into vllm-project:main Jan 21, 2026
20 checks passed
@pisceskkk pisceskkk deleted the fix_cp branch January 21, 2026 06:32
yiz-liu pushed a commit that referenced this pull request Jan 21, 2026
…atic generate_kv_idx. (#5955)

### What this PR does / why we need it?
In long-sequence scenarios, the chunked-prefill component may encounter
dimension misalignment issues, which previously occurred during
precision testing on the code_generate_lite dataset. This PR removes
redundant computations and instead derives the value using existing
results and straightforward calculations.
ref: #5957

Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>
huangfeifei1995 pushed a commit to huangfeifei1995/vllm-ascend that referenced this pull request Jan 21, 2026
huangfeifei1995 added a commit to huangfeifei1995/vllm-ascend that referenced this pull request Jan 21, 2026
845473182 pushed a commit to 845473182/vllm-ascend that referenced this pull request Jan 22, 2026
…to FIA_rebase

* 'main' of https://github.com/vllm-project/vllm-ascend:
  [CI] Upgrade CANN to 8.5.0 (vllm-project#6070)
  Default enable MLAPO (vllm-project#5952)
  [Doc] Supplement PD separation parameters of DeepSeek V3.1 (vllm-project#6053)
  [Ascend] perf: optimize rope embedding with triton kernel for huge performance gain (vllm-project#5918)
  [Ops] update causal_conv1d_update (vllm-project#5984)
  [CI]Update triton ascend version in 3.2.0 (vllm-project#6067)
  [bugfix] fix the complex and potentially problematic generate_kv_idx. (vllm-project#5957)
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
tangtiangu pushed a commit to tangtiangu/jiusi-vllm-ascend that referenced this pull request Feb 24, 2026
tangtiangu pushed a commit to tangtiangu/jiusi-vllm-ascend that referenced this pull request Feb 24, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026

Labels

module:tests, ready, ready-for-test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants