Conversation
Signed-off-by: 李少鹏 <lishaopeng21@huawei.com>
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request addresses a precision issue with the mrope fusion operator on Ascend NPUs. The changes involve removing a workaround that disabled the operator on x86 platforms, indicating the underlying precision problem has been resolved for the specific qwen3vl model configuration. Additionally, an unused import is cleaned up, and a call to .contiguous() is added for the positions tensor before passing it to the npu_mrope operator. This is a good defensive measure to ensure correctness by providing a contiguous tensor as expected by the kernel. The changes are logical and well-contained.
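To illustrate why the added `.contiguous()` call matters: a view such as a transpose or slice can share storage with the original tensor while its strides no longer describe row-major memory, and a fused kernel that reads raw contiguous memory would then see elements in the wrong order. The sketch below demonstrates the general pattern with NumPy for portability (the actual change calls `Tensor.contiguous()` on the torch `positions` tensor before `npu_mrope`); `np.ascontiguousarray` plays the role of `.contiguous()` here.

```python
import numpy as np

# A transposed view shares storage with the original array but is not
# C-contiguous: its strides no longer match a row-major layout.
positions = np.arange(12, dtype=np.int64).reshape(3, 4).T

assert not positions.flags["C_CONTIGUOUS"]

# A kernel that walks raw row-major memory would read elements in the
# wrong order, so we materialize a contiguous copy first (the NumPy
# analogue of torch's Tensor.contiguous()).
positions_c = np.ascontiguousarray(positions)

assert positions_c.flags["C_CONTIGUOUS"]
assert np.array_equal(positions_c, positions)  # same values, new layout
```

Note that `ascontiguousarray` (like `.contiguous()`) is a no-op when the input is already contiguous, so the defensive call is cheap on the common path.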
Signed-off-by: 李少鹏 <lishaopeng21@huawei.com>
Please cherry-pick this PR to branch v0.11.0-dev.
### What this PR does / why we need it?
The Qwen2.5-VL mrope precision problem will be solved once this PR is merged.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Tested on G8600 with the textVQA dataset.

- vLLM version: v0.11.2
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2

---------

Signed-off-by: 李少鹏 <lishaopeng21@huawei.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>