
Update 12 23 #5268

Closed
Toneymiller wants to merge 2 commits into vllm-project:main from leo-pony:update_12_23

Conversation


@Toneymiller (Contributor) commented Dec 23, 2025

What this PR does / why we need it?

Fixes the breakage introduced by vllm-project/vllm#30836 and updates the vLLM version to 1223.

Does this PR introduce any user-facing change?

How was this patch tested?


@gemini-code-assist bot left a comment


Code Review

This pull request introduces a refactoring in the multi-modal encoder attention mechanism by extracting the QKV reshaping logic into a new method. It also updates the configuration logic in the NPU platform. My review focuses on improving code clarity and removing a redundant state mutation. I've suggested removing an unnecessary attribute assignment in the new reshape_qkv_to_3d method to eliminate side effects and improve maintainability. Additionally, I've pointed out an unconventional line break in platform.py that should be corrected for better readability.

query = query.view(bsz * q_len, self.num_heads, self.head_size)
key = key.view(bsz * kv_len, self.num_kv_heads, self.head_size)
value = value.view(bsz * kv_len, self.num_kv_heads, self.head_size)
self.num_queries_per_kv = self.num_heads // self.num_kv_heads


Severity: high

The self.num_queries_per_kv attribute is already initialized in the __init__ method of the Attention superclass. Re-assigning it here on every forward pass is redundant and introduces an unnecessary side effect. This can make the code harder to reason about and maintain. It's best practice to initialize such constant attributes once in the constructor and avoid mutating them in forward passes.
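The suggestion above can be sketched as follows. This is a minimal illustration, not the actual vLLM Ascend implementation: the names `Attention` and `reshape_qkv_to_3d` follow the snippet under review, the constructor signature is assumed, and the tensor arguments are assumed to expose a NumPy/PyTorch-style `.reshape()` method.

```python
# Minimal sketch of the reviewer's point: derive num_queries_per_kv once
# in __init__ and keep reshape_qkv_to_3d free of state mutation.
# Illustration only -- not the actual vLLM Ascend Attention class.
class Attention:
    def __init__(self, num_heads: int, num_kv_heads: int, head_size: int):
        self.num_heads = num_heads
        self.num_kv_heads = num_kv_heads
        self.head_size = head_size
        # Constant for the lifetime of the module: compute it exactly once.
        self.num_queries_per_kv = num_heads // num_kv_heads

    def reshape_qkv_to_3d(self, query, key, value, bsz, q_len, kv_len):
        # Pure reshaping: no attributes are (re)assigned here, so repeated
        # forward passes have no side effects on the module's state.
        query = query.reshape(bsz * q_len, self.num_heads, self.head_size)
        key = key.reshape(bsz * kv_len, self.num_kv_heads, self.head_size)
        value = value.reshape(bsz * kv_len, self.num_kv_heads, self.head_size)
        return query, key, value
```

Keeping the forward-path method side-effect-free also makes it safe to call concurrently or trace/compile without worrying about hidden attribute writes.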

Comment on lines +237 to +238
data_parallel_size=vllm_config.parallel_config.
data_parallel_size,


Severity: high

The line break in the data_parallel_size argument is unconventional and harms readability. It appears to be an accidental formatting issue. For better code clarity and maintainability, it's best to keep the attribute access on a single line.

Suggested change:

- data_parallel_size=vllm_config.parallel_config.
-     data_parallel_size,
+ data_parallel_size=vllm_config.parallel_config.data_parallel_size,

@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Fill in the PR description when writing the commit message to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.

@github-actions

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: zxwang <1476209578@qq.com>
Signed-off-by: zxwang <1476209578@qq.com>
@leo-pony added the ready (read for review) and ready-for-test (start test by label for PR) labels on Dec 24, 2025
@github-actions

This pull request has conflicts, please resolve those before we can evaluate the pull request.


Labels

ci/build, documentation (Improvements or additions to documentation), merge-conflicts, module:core, module:ops, ready (read for review), ready-for-test (start test by label for PR)



3 participants