Signed-off-by: 李少鹏 <lishaopeng21@huawei.com>
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.
Code Review
This pull request updates the qwen3vl model to support torch.Tensor as an input type for grid_thw, which was previously only a list. While the change is functionally correct, I've identified a performance issue in the implementation. The conversion from a torch.Tensor to a NumPy array and back to a tensor is inefficient and can be optimized. I've provided a suggestion to refactor this logic for better performance.
```diff
 if isinstance(grid_thw, list):
     grid_thw_list = grid_thw
     grid_thw = np.array(grid_thw, dtype=np.int32)
 else:
     grid_thw_list = grid_thw.tolist()
     grid_thw = grid_thw.numpy()

 pos_embeds = self.fast_pos_embed_interpolate(grid_thw_list)
 hidden_states = hidden_states + pos_embeds
-rotary_pos_emb = self.rot_pos_emb(grid_thw)
+rotary_pos_emb = self.rot_pos_emb(grid_thw_list)
 grid_thw_tensor = torch.tensor(grid_thw,
                                device=self.device,
                                dtype=torch.int32)
```
The current implementation for handling grid_thw when it is a torch.Tensor is inefficient. It converts the tensor to a NumPy array using .numpy() (which can cause a GPU-to-CPU data transfer) and then converts it back to a tensor using torch.tensor(). This can be optimized by handling the list and torch.Tensor cases separately to avoid unnecessary conversions.
```python
if isinstance(grid_thw, list):
    grid_thw_list = grid_thw
    grid_thw_tensor = torch.tensor(grid_thw,
                                   device=self.device,
                                   dtype=torch.int32)
else:
    grid_thw_list = grid_thw.tolist()
    grid_thw_tensor = grid_thw.to(device=self.device, dtype=torch.int32)
pos_embeds = self.fast_pos_embed_interpolate(grid_thw_list)
hidden_states = hidden_states + pos_embeds
rotary_pos_emb = self.rot_pos_emb(grid_thw_list)
```
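The suggested branching can be exercised in isolation. The sketch below is a standalone illustration, not the model code: the helper name `normalize_grid_thw` is hypothetical, and a plain `"cpu"` device stands in for `self.device`. Both input types should yield the same list and the same `int32` tensor, with the tensor path avoiding any NumPy round-trip.

```python
import torch

def normalize_grid_thw(grid_thw, device="cpu"):
    """Return (grid_thw_list, grid_thw_tensor) without a NumPy round-trip.

    Hypothetical helper mirroring the suggested review change: the list
    branch builds a fresh tensor, while the tensor branch reuses the input
    via .to(), which is a no-op when device and dtype already match.
    """
    if isinstance(grid_thw, list):
        grid_thw_list = grid_thw
        grid_thw_tensor = torch.tensor(grid_thw, device=device, dtype=torch.int32)
    else:
        grid_thw_list = grid_thw.tolist()
        grid_thw_tensor = grid_thw.to(device=device, dtype=torch.int32)
    return grid_thw_list, grid_thw_tensor

# Both input forms produce equivalent results.
lst_a, t_a = normalize_grid_thw([[1, 2, 2], [1, 4, 4]])
lst_b, t_b = normalize_grid_thw(torch.tensor([[1, 2, 2], [1, 4, 4]]))
print(lst_a == lst_b, torch.equal(t_a, t_b))
```

Note that when the incoming tensor is already on the target device with `torch.int32` dtype, `.to()` returns the tensor itself, so the else branch costs nothing; the original code always paid for `.numpy()` plus `torch.tensor()`.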
LGTM.
Signed-off-by: 李少鹏 <lishaopeng21@huawei.com>
This pull request has conflicts, please resolve those before we can evaluate the pull request.
We no longer allow changes to model files. Please use another approach or contribute to vLLM directly.
What this PR does / why we need it?
Follow the interface changes from the upstream vLLM repository:
[Core][MM] Add mechanism to configure multimodal fields which should stay on CPU (vllm-project/vllm#28168)
Does this PR introduce any user-facing change?
How was this patch tested?