Conversation
Code Review
This pull request refactors the multi-modal encoder attention mechanism by extracting the QKV reshaping logic into a new method, and updates the configuration logic in the NPU platform. My review focuses on improving code clarity and removing a redundant state mutation. I've suggested removing an unnecessary attribute assignment in the new reshape_qkv_to_3d method to eliminate side effects and improve maintainability. I've also pointed out an unconventional line break in platform.py that should be corrected for better readability.
    query = query.view(bsz * q_len, self.num_heads, self.head_size)
    key = key.view(bsz * kv_len, self.num_kv_heads, self.head_size)
    value = value.view(bsz * kv_len, self.num_kv_heads, self.head_size)
    self.num_queries_per_kv = self.num_heads // self.num_kv_heads
The self.num_queries_per_kv attribute is already initialized in the __init__ method of the Attention superclass. Re-assigning it here on every forward pass is redundant and introduces an unnecessary side effect. This can make the code harder to reason about and maintain. It's best practice to initialize such constant attributes once in the constructor and avoid mutating them in forward passes.
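To illustrate the suggestion, here is a minimal sketch (the class and method names below mirror the diff, but the surrounding structure is hypothetical, not the actual vLLM Ascend code): the ratio is derived once in the constructor, and the reshape helper stays a pure function of its inputs with no attribute writes.

```python
# Hypothetical sketch of the suggested fix: initialize num_queries_per_kv
# once in __init__ and keep reshape_qkv_to_3d free of side effects.
# query/key/value are assumed to be torch-like tensors exposing .view().

class AttentionSketch:
    def __init__(self, num_heads: int, num_kv_heads: int, head_size: int):
        self.num_heads = num_heads
        self.num_kv_heads = num_kv_heads
        self.head_size = head_size
        # Constant for the lifetime of the module; never reassigned in forward.
        self.num_queries_per_kv = num_heads // num_kv_heads

    def reshape_qkv_to_3d(self, query, key, value, bsz, q_len, kv_len):
        # Pure reshaping: reads attributes, mutates none of them.
        query = query.view(bsz * q_len, self.num_heads, self.head_size)
        key = key.view(bsz * kv_len, self.num_kv_heads, self.head_size)
        value = value.view(bsz * kv_len, self.num_kv_heads, self.head_size)
        return query, key, value
```

Because the forward path no longer writes to `self`, the method is easier to reason about and safe to call concurrently or under tracing/compilation.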
    data_parallel_size=vllm_config.parallel_config.
    data_parallel_size,
The line break in the data_parallel_size argument is unconventional and harms readability. It appears to be an accidental formatting issue. For better code clarity and maintainability, it's best to keep the attribute access on a single line.
Suggested change:

-    data_parallel_size=vllm_config.parallel_config.
-    data_parallel_size,
+    data_parallel_size=vllm_config.parallel_config.data_parallel_size,
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Signed-off-by: zxwang <1476209578@qq.com>
Force-pushed from a41e87c to ef98a25
This pull request has conflicts, please resolve those before we can evaluate the pull request.
What this PR does / why we need it?
Fix the breakage introduced by vllm-project/vllm#30836 and update the vLLM version to 1223.
Does this PR introduce any user-facing change?
How was this patch tested?