[XPU] Add deepseek_scaling_rope fused kernel#36612
jikunshang merged 8 commits into vllm-project:main
Conversation
Signed-off-by: yitingw1 <yiting.wang@intel.com>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only a limited set of checks runs automatically. You can ask your reviewers to trigger select CI tests on top of those. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀
Code Review
This pull request introduces a fused kernel for deepseek_scaling_rope on XPU to improve performance. The changes involve registering a new custom PyTorch operation and integrating it into the DeepseekScalingRotaryEmbedding layer. My review has identified a few critical issues related to thread safety, correctness of the operation registration, and potential runtime errors due to an uninitialized cache. I've also pointed out some type hint inconsistencies that could affect static analysis and torch.compile.
Pull request overview
This PR adds a fused XPU kernel for DeepseekScalingRotaryEmbedding, replacing the previous forward_native() fallback with a dedicated forward_xpu() method that calls a custom registered op backed by torch.ops._xpu_C.deepseek_scaling_rope.
Changes:
- Adds `forward_xpu()` method to `DeepseekScalingRotaryEmbedding` that delegates to the new fused XPU op.
- Registers `xpu_ops_deepseek_scaling_rope` as a custom torch op with a fake implementation for tracing.
- Introduces a module-level `_OPS_REGISTERED` guard and `register_ops_once()` to prevent duplicate registration.
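The custom-op registration described above can be sketched with the `torch.library` API. This is a hedged, self-contained illustration, not the PR's implementation: the namespace `_xpu_C_demo` is deliberately different from the real `_xpu_C`, and the eager impl is a shape-preserving stub where the real kernel would be a compiled XPU kernel.

```python
import torch
from torch.library import Library

# Demo namespace so this sketch does not clash with vLLM's real _xpu_C.
_lib = Library("_xpu_C_demo", "DEF")
_lib.define(
    "deepseek_scaling_rope(Tensor positions, Tensor query, Tensor key) "
    "-> (Tensor, Tensor)"
)


def _impl(positions, query, key):
    # Stand-in for the fused XPU kernel: a real implementation would apply
    # the DeepSeek scaling RoPE rotation to query/key.
    return query.clone(), key.clone()


def _fake(positions, query, key):
    # Fake (meta) implementation for tracing/torch.compile: only output
    # shapes and dtypes matter, no computation runs.
    return torch.empty_like(query), torch.empty_like(key)


_lib.impl("deepseek_scaling_rope", _impl, "CompositeExplicitAutograd")
_lib.impl("deepseek_scaling_rope", _fake, "Meta")
```

After registration the op is callable as `torch.ops._xpu_C_demo.deepseek_scaling_rope(...)`, and the `Meta` kernel lets the compiler trace through it without the XPU backend present.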
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| vllm/model_executor/layers/rotary_embedding/deepseek_scaling_rope.py | Adds forward_xpu() method wiring the class to the new fused kernel op |
| vllm/_xpu_ops.py | Implements and registers the xpu_ops_deepseek_scaling_rope custom op with impl and fake functions |
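As a rough illustration of the wiring in the table above, `forward_xpu()` can simply delegate to the registered fused op. The class below is a hypothetical skeleton: the real `DeepseekScalingRotaryEmbedding` carries rotary-embedding state and a full native fallback, and `fused_op` stands in for `torch.ops._xpu_C.deepseek_scaling_rope`.

```python
class DeepseekScalingRotaryEmbeddingSketch:
    """Illustrative skeleton only; not the real vLLM class."""

    def __init__(self, fused_op):
        # fused_op plays the role of torch.ops._xpu_C.deepseek_scaling_rope
        self._fused_op = fused_op

    def forward_xpu(self, positions, query, key):
        # Delegate the q/k rotation to the fused XPU kernel.
        return self._fused_op(positions, query, key)

    def forward_native(self, positions, query, key):
        # Previous pure-PyTorch fallback path (elided in this sketch).
        raise NotImplementedError
```

The point of the design is that the layer's forward path becomes a single op call, which both avoids the multi-kernel native path and gives `torch.compile` one traceable node.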
Please resolve conflicts, thanks.
This pull request has merge conflicts that must be resolved before it can be merged.
…seek_rope Signed-off-by: yitingw1 <yiting.wang@intel.com>
Done.
Purpose
[XPU] Add usage of the fused deepseek_scaling_rope kernel (introduced in an earlier PR) for DeepseekScalingRotaryEmbedding. Previously, it ran with forward_native().
Test Plan
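The plan is a local lm_eval functionality check. A hypothetical invocation might look like the following; the task choice, flags, and `tensor_parallel_size=4` (for the 4x BMG setup) are assumptions, not taken from the PR:

```shell
# Hypothetical command sketch; adjust model path, task, and parallelism.
lm_eval --model vllm \
  --model_args pretrained=deepseek-ai/DeepSeek-V2-Lite-Chat,tensor_parallel_size=4 \
  --tasks gsm8k \
  --batch_size auto
```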
Test Result
Verified lm_eval with 4xBMG for DeepSeek-V2-Lite-Chat functionality locally.
Essential Elements of an Effective PR Description Checklist
- Update `supported_models.md` and `examples` for a new model.