
[Refactor] remove some metadata variables in attention_v1.#5160

Merged
wangxiyuan merged 16 commits into vllm-project:main from weijinqian0:refactor_attention_remove_metadata
Dec 19, 2025
Conversation

@weijinqian0 (Collaborator) commented Dec 18, 2025

RFC: #4629

Reason:

The metadata dataclass contains an excessive number of variables. We will inherit the community (upstream vLLM) metadata and remove some variables that are no longer needed.

Todo:

  1. partially remove attn_state.

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
@gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request refactors attention metadata by removing several redundant variables like query_lens, query_start_loc_list, and is_only_prefill. The changes are mostly clean and improve code clarity. However, I found a critical issue where query_lens is re-calculated on-the-fly in xlite.py using attn_metadata.query_start_loc_cpu, but this attribute is not present in the AscendMetadata object, which will lead to a runtime error. I've provided a detailed comment with a suggested fix.

Comment thread vllm_ascend/xlite/xlite.py
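The recalculation the bot describes amounts to taking consecutive differences of the cumulative `query_start_loc` offsets. A minimal sketch of that derivation, using plain Python sequences for clarity (the real code operates on tensors; this helper is illustrative, not the project's actual implementation):

```python
def query_lens_from_start_loc(query_start_loc):
    """Derive per-request query lengths from cumulative start offsets.

    query_start_loc has num_reqs + 1 entries; entry i is the offset of
    request i's first query token, so consecutive differences give the
    per-request lengths. Works on any sequence of ints (for example, the
    .tolist() of a CPU tensor).
    """
    return [end - start for start, end in zip(query_start_loc, query_start_loc[1:])]

# Example: three requests with 4, 1, and 2 query tokens.
print(query_lens_from_start_loc([0, 4, 5, 7]))  # [4, 1, 2]
```

This is why dropping `query_lens` from the metadata is safe in principle: it is fully recoverable from `query_start_loc` — provided the attribute actually read from (`query_start_loc_cpu`) exists on the metadata object, which is the critical issue the bot flags.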
Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
@github-actions bot commented
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write a commit message that fulfills the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

weijinqian_v1 and others added 12 commits December 18, 2025 16:54
Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
@weijinqian0 added the ready (read for review) and ready-for-test (start test by label for PR) labels Dec 19, 2025
@wangxiyuan wangxiyuan merged commit 35ad11b into vllm-project:main Dec 19, 2025
54 checks passed
845473182 pushed a commit to 845473182/vllm-ascend that referenced this pull request Dec 19, 2025
…to eplb_refactor

* 'main' of https://github.com/vllm-project/vllm-ascend: (52 commits)
  [Doc]Add the user_guide doc file regarding fine-grained TP. (vllm-project#5084)
  [pref] qwen3_next add triton ops : fused_sigmoid_gating_delta_rule_update (vllm-project#4818)
  [Feature] Add token mask for DispatchGmmCombineDecode operator (vllm-project#5171)
  [CI] Improve CI (vllm-project#5078)
  [Refactor] remove some metadata variables in attention_v1. (vllm-project#5160)
  Add Qwen3-VL-235B-A22B-Instruct tutorials (vllm-project#5167)
  [Doc] Add a perf tune section (vllm-project#5127)
  [Image] Refactor image build (vllm-project#5175)
  [refactor] refactor weight trans nz and transpose (vllm-project#4878)
  [BugFix]Fix precision issue for LoRA feature (vllm-project#4141)
  【Doc】Deepseekv3.1/R1 doc enhancement (vllm-project#4827)
  support basic long_seq feature st (vllm-project#5140)
  [Bugfix] install trition for test_custom_op (vllm-project#5112)
  [2/N][Pangu][MoE] Remove Pangu Related Code (vllm-project#5130)
  [bugfix] Use FUSED_MC2 MoE comm path for the op `dispatch_ffn_combine` (vllm-project#5156)
  [BugFix] Fix top_p,top_k issue with EAGLE and add top_p,top_k in EAGLE e2e (vllm-project#5131)
  [Doc][P/D] Fix MooncakeConnector's name (vllm-project#5172)
  [Bugfix] Fix in_profile_run in mtp_proposer dummy_run (vllm-project#5165)
  [Doc] Refact benchmark doc (vllm-project#5173)
  [Nightly]  Avoid max_model_len being smaller than the decoder prompt to prevent single-node-accuray-tests from failing (vllm-project#5174)
  ...

Signed-off-by: 白永斌 <baiyongbin3@h-partners.com>
chenaoxuan pushed a commit to chenaoxuan/vllm-ascend that referenced this pull request Dec 20, 2025
…ect#5160)

- vLLM version: v0.12.0
- vLLM main:
vllm-project/vllm@ad32e3e

---------

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
Co-authored-by: weijinqian_v1 <weijinqian@huawei.com>
@wjunLu (Collaborator) commented Dec 23, 2025

I think these changes have not been tested, because there is no query_start_loc_cpu attribute in the AscendMetadata class.

But attn_metadata.query_start_loc_cpu is still accessed, which raises the following error:

(EngineCore_DP0 pid=12553)   File "/vllm-workspace/vllm-ascend/vllm_ascend/xlite/xlite.py", line 260, in __call__
(EngineCore_DP0 pid=12553)     query_lens = attn_metadata.query_start_loc_cpu[
(EngineCore_DP0 pid=12553)                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=12553) AttributeError: 'AscendMetadata' object has no attribute 'query_start_loc_cpu'

Env:

  • vllm: 0.13.0+empty
  • vllm-ascend: 0.12.0rc2.dev138+g55beac9c9.d20251223

@weijinqian0 weijinqian0 deleted the refactor_attention_remove_metadata branch January 7, 2026 11:46
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
…ect#5160)

Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
…ect#5160)

Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>

Labels

ready (read for review), ready-for-test (start test by label for PR)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants