[CustomOp] Register AscendApplyRotaryEmb CustomOp and remove related patch by shen-shanshan · Pull Request #4667 · vllm-project/vllm-ascend

shen-shanshan · 2025-12-03T08:37:27Z

What this PR does / why we need it?

Following vllm-project/vllm#29873, register AscendApplyRotaryEmb CustomOp and remove related patch.
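For readers unfamiliar with the idea, the sketch below illustrates the general pattern of registering a platform-specific op class under a shared name so that a dispatcher picks it up, instead of monkey-patching model code. It is a toy registry only: `BaseOp`, `register_oot`, `build_op`, and `AscendLikeOp` are hypothetical names invented for illustration and are not the vLLM or vllm-ascend classes.

```python
from typing import Callable, Dict, Type

# Hypothetical stand-in for a custom-op registry; not vLLM's actual API.
_OOT_OPS: Dict[str, Type["BaseOp"]] = {}


class BaseOp:
    """Default implementation, used when no platform override is registered."""

    def forward(self, x: float) -> str:
        return f"native:{x}"


def register_oot(name: str) -> Callable[[Type[BaseOp]], Type[BaseOp]]:
    """Record a platform-specific subclass under the op's name."""

    def decorator(cls: Type[BaseOp]) -> Type[BaseOp]:
        _OOT_OPS[name] = cls
        return cls

    return decorator


def build_op(name: str) -> BaseOp:
    """Instantiate the registered override if present, else the default."""
    return _OOT_OPS.get(name, BaseOp)()


@register_oot("apply_rotary_emb")
class AscendLikeOp(BaseOp):
    """Hypothetical platform override; a real one would call a device kernel."""

    def forward(self, x: float) -> str:
        return f"ascend:{x}"
```

With a registry like this, callers only ask for the op by name, so a separate patch file that rewrites model code is no longer needed for the override to take effect.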

Does this PR introduce any user-facing change?

How was this patch tested?

✅ Test Qwen2.5-VL

Run:

vllm serve /root/.cache/modelscope/hub/models/Qwen/Qwen2.5-VL-7B-Instruct \
--max_model_len 16384

Output:

{"id":"chatcmpl-b02c1ff3415d2462","object":"chat.completion","created":1766129265,"model":"/root/.cache/modelscope/hub/models/Qwen/Qwen2.5-VL-7B-Instruct","choices":[{"index":0,"message":{"role":"assistant","content":"The text in the illustration is \"TONGYI Qwen.\" The word \"TONGYI\" is written in blue, and \"Qwen\" is written in gray. The text appears to be part of a logo or branding design.","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning":null,"reasoning_content":null},"logprobs":null,"finish_reason":"stop","stop_reason":null,"token_ids":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":78,"total_tokens":129,"completion_tokens":51,"prompt_tokens_d
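The request that produced this output is not shown in the PR. As a rough sketch, a chat completion request with one image and one text part could be built as below, in the OpenAI-compatible format that `vllm serve` accepts on `/v1/chat/completions`. The image URL, the question text, and `max_tokens=100` are assumptions for illustration, and `build_chat_request` is a hypothetical helper.

```python
import json


def build_chat_request(model: str, image_url: str, question: str,
                       max_tokens: int = 100) -> dict:
    """Build an OpenAI-style chat completion payload with one image and one text part."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": question},
                ],
            }
        ],
        "max_tokens": max_tokens,
    }


payload = build_chat_request(
    model="/root/.cache/modelscope/hub/models/Qwen/Qwen2.5-VL-7B-Instruct",
    image_url="https://example.com/logo.png",  # placeholder, not the image used in the PR
    question="What is the text in the illustration?",  # assumed prompt
)
body = json.dumps(payload)  # JSON body to POST to the server's /v1/chat/completions
```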

✅ Test Qwen3-VL

Run:

vllm serve /root/.cache/modelscope/hub/models/Qwen/Qwen3-VL-8B-Instruct \
--max_model_len 16384

Output:

{"id":"chatcmpl-a3a7de5a900a9321","object":"chat.completion","created":1766129586,"model":"/root/.cache/modelscope/hub/models/Qwen/Qwen3-VL-8B-Instruct","choices":[{"index":0,"message":{"role":"assistant","content":"The text in the illustration is **“TONGYI Qwen”**.\n\n### How it looks:\n- **“TONGYI”** is written in **uppercase letters** in a **bold, modern sans-serif font**, colored **blue**.\n- **“Qwen”** is written in **lowercase letters** in a **slightly thinner, elegant sans-serif font**, colored **dark gray**.\n- The two lines of text are stacked vertically, with “TONG","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning":null,"reasoning_content":null},"logprobs":null,"finish_reason":"length","stop_reason":null,"token_ids":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":112,"total_tokens":212,"completion_tokens":100,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null}
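When comparing the two outputs, note that the Qwen3-VL response ended with `finish_reason` `length`, so its text is cut off by the token limit rather than by the model stopping. A small sketch for checking this and the token accounting is shown below; `summarize` is a hypothetical helper, and the dictionary reproduces only the relevant fields from the response above.

```python
def summarize(response: dict) -> dict:
    """Report whether generation was cut off and whether usage counts add up."""
    choice = response["choices"][0]
    usage = response["usage"]
    return {
        "finish_reason": choice["finish_reason"],
        # "length" means the completion hit the token limit before stopping naturally.
        "truncated": choice["finish_reason"] == "length",
        "usage_consistent": (
            usage["prompt_tokens"] + usage["completion_tokens"]
            == usage["total_tokens"]
        ),
    }


# Fields copied from the Qwen3-VL response above; all other keys are omitted.
qwen3_vl = {
    "choices": [{"finish_reason": "length"}],
    "usage": {"prompt_tokens": 112, "total_tokens": 212, "completion_tokens": 100},
}
result = summarize(qwen3_vl)
```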

vLLM version: v0.12.0
vLLM main: vllm-project/vllm@ad32e3e

github-actions · 2025-12-03T08:37:36Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

gemini-code-assist

Code Review

This pull request introduces a custom operator AscendApplyRotaryEmb for applying rotary embeddings on Ascend hardware and registers it. The implementation refactors existing logic from patch_qwen2_5_vl.py. However, I've found a critical bug in the new AscendApplyRotaryEmb implementation where incorrect tensor shape manipulation will lead to a runtime error. The logic for preparing cos and sin tensors was copied from an implementation for a different model and is not compatible with the input tensor shapes for Qwen2.5-VL.
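To make the shape concern concrete, here is a minimal pure-Python sketch of a rotate-half style rotary embedding with explicit shape checks. It is not the PR's implementation: the layout (`x` as `[seq_len][num_heads][head_dim]`, `cos` and `sin` as `[seq_len][head_dim // 2]`) and the function name `apply_rotary` are assumptions chosen for illustration.

```python
import math
from typing import List

Vector = List[float]


def apply_rotary(x: List[List[Vector]], cos: List[Vector],
                 sin: List[Vector]) -> List[List[Vector]]:
    """Rotate each head vector: x is [seq_len][num_heads][head_dim],
    cos and sin are [seq_len][head_dim // 2] (assumed layout)."""
    if not (len(x) == len(cos) == len(sin)):
        raise ValueError("x, cos and sin must share the same seq_len")
    out = []
    for heads, cos_t, sin_t in zip(x, cos, sin):
        rotated = []
        for vec in heads:
            half = len(vec) // 2
            if len(vec) != 2 * half or len(cos_t) != half or len(sin_t) != half:
                raise ValueError("cos/sin must have head_dim // 2 entries per token")
            x1, x2 = vec[:half], vec[half:]
            # Pair element i of the first half with element i of the second half.
            first = [a * c - b * s for a, b, c, s in zip(x1, x2, cos_t, sin_t)]
            second = [b * c + a * s for a, b, c, s in zip(x1, x2, cos_t, sin_t)]
            rotated.append(first + second)
        out.append(rotated)
    return out


# One token, one head, head_dim = 2, rotated by 90 degrees.
angle = math.pi / 2
x = [[[1.0, 0.0]]]
rotated = apply_rotary(x, [[math.cos(angle)]], [[math.sin(angle)]])
```

If `cos` and `sin` are prepared for a different layout than the one the input uses, a check like the one above fails immediately instead of surfacing later as a broadcasting error inside a kernel.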

github-actions · 2025-12-04T14:36:51Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

github-actions · 2025-12-22T06:37:18Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

wangxiyuan · 2025-12-22T10:43:05Z

https://github.com/vllm-project/vllm-ascend/blob/main/vllm_ascend/patch/__init__.py should be updated as well. Please also delete the related comment from #5196.

github-actions · 2025-12-22T10:49:14Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: shen-shanshan <467638484@qq.com>

shen-shanshan · 2025-12-22T14:35:53Z

CC @wangxiyuan All CI passed.

Signed-off-by: shen-shanshan <467638484@qq.com>

…patch (vllm-project#4667)

Signed-off-by: shen-shanshan <467638484@qq.com>
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>

github-actions bot added module:ops module:core labels Dec 3, 2025

shen-shanshan mentioned this pull request Dec 3, 2025

[RFC]: Remove VL Modeling Files #4084

Closed

17 tasks

gemini-code-assist bot reviewed Dec 3, 2025

Comment thread vllm_ascend/ops/rotary_embedding.py Outdated

shen-shanshan mentioned this pull request Dec 3, 2025

[CustomOp] Extract ApplyRotaryEmb as CustomOp and unify the dispatch logic vllm-project/vllm#29873

Merged

9 tasks

Sparkheart reviewed Dec 3, 2025

Comment thread vllm_ascend/patch/worker/patch_qwen2_5_vl.py Outdated

github-actions bot added the merge-conflicts label Dec 4, 2025

shen-shanshan force-pushed the vit branch from c94c66e to da79cee Compare December 9, 2025 09:35

shen-shanshan removed the merge-conflicts label Dec 9, 2025

shen-shanshan force-pushed the vit branch from 1c28bc1 to 11331b6 Compare December 19, 2025 07:36

shen-shanshan changed the title ~~[CustomOp] Implement ApplyRotaryEmb CustomOp and register it~~ [CustomOp] Register ApplyRotaryEmb CustomOp and remove related patch Dec 19, 2025

shen-shanshan changed the title ~~[CustomOp] Register ApplyRotaryEmb CustomOp and remove related patch~~ [CustomOp] Register AscendApplyRotaryEmb CustomOp and remove related patch Dec 19, 2025

weijinqian0 approved these changes Dec 19, 2025

github-actions bot added the merge-conflicts label Dec 22, 2025

shen-shanshan force-pushed the vit branch from e26caf1 to 23b2128 Compare December 22, 2025 06:49

shen-shanshan added the ready and ready-for-test labels and removed the merge-conflicts label Dec 22, 2025

github-actions bot added the merge-conflicts label Dec 22, 2025

shen-shanshan added 6 commits December 22, 2025 11:23

register apply_rotary_emb custom op

fbcdf52

Signed-off-by: shen-shanshan <467638484@qq.com>

update

7a63720

Signed-off-by: shen-shanshan <467638484@qq.com>

minor fix

5766e3d

Signed-off-by: shen-shanshan <467638484@qq.com>

update

a046e09

Signed-off-by: shen-shanshan <467638484@qq.com>

remove patch

26ce8f5

Signed-off-by: shen-shanshan <467638484@qq.com>

sync

d921b76

Signed-off-by: shen-shanshan <467638484@qq.com>

fix

ddcc875

Signed-off-by: shen-shanshan <467638484@qq.com>

shen-shanshan force-pushed the vit branch from 23b2128 to ddcc875 Compare December 22, 2025 11:24

shen-shanshan added the ready and ready-for-test labels and removed the ready, ready-for-test, and merge-conflicts labels Dec 22, 2025

ApsarasX mentioned this pull request Dec 22, 2025

[ops]optimize forward native for mrope #5244

Closed

update comment

1baa226

Signed-off-by: shen-shanshan <467638484@qq.com>

wangxiyuan approved these changes Dec 23, 2025

wangxiyuan merged commit 6c47853 into vllm-project:main Dec 23, 2025
14 checks passed

wangxiyuan merged 8 commits into vllm-project:main from shen-shanshan:vit

4 participants
