
feat(deepseek_rope): add deepseek_scaling_rope #34

Merged
jikunshang merged 1 commit into vllm-project:main from dbyoung18:dbyoung/deepseek_rope on Sep 12, 2025

Conversation

@dbyoung18
Collaborator

  • Add the SYCL kernel deepseek_scaling_rope to _xpu_C.
  • Accuracy check passes; latency improves from 1 ms to 6.4 µs.
  • Currently XPU-only; upstream still uses a naive torch.ops implementation.
  • May be contributed upstream later, along with a CUDA/Triton version.
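
For readers unfamiliar with the operation, here is a minimal pure-Python sketch of a plain rotary position embedding, the kind of element-wise math a naive implementation performs step by step and a fused kernel does in one pass. It is not the code from this PR: the function names `rope_cos_sin` and `apply_rope_interleaved`, the default `base` of 10000, and the adjacent-pair layout are all assumptions for illustration, and any DeepSeek-specific frequency scaling is omitted.

```python
import math


def rope_cos_sin(position, rotary_dim, base=10000.0):
    """Return (cos, sin) lists, one entry per rotated pair, for one position.

    Hypothetical helper: `base` and the frequency formula follow the common
    RoPE definition, not necessarily what the kernel in this PR uses.
    """
    half = rotary_dim // 2
    # Pair i rotates with frequency base^(-2i / rotary_dim).
    inv_freq = [base ** (-2.0 * i / rotary_dim) for i in range(half)]
    angles = [position * f for f in inv_freq]
    return [math.cos(a) for a in angles], [math.sin(a) for a in angles]


def apply_rope_interleaved(x, cos, sin):
    """Rotate each adjacent pair (x[2i], x[2i+1]) by the i-th angle.

    Assumes the interleaved pair layout; elements past the rotated
    pairs are copied through unchanged.
    """
    out = list(x)
    for i in range(len(cos)):
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = a * cos[i] - b * sin[i]
        out[2 * i + 1] = a * sin[i] + b * cos[i]
    return out


if __name__ == "__main__":
    query = [1.0, 0.0, 0.5, -0.5]
    cos, sin = rope_cos_sin(position=3, rotary_dim=4)
    print(apply_rope_interleaved(query, cos, sin))
```

Each pair is rotated independently, so the vector norm is preserved and position 0 leaves the input unchanged; both properties make handy sanity checks when comparing a fused kernel against a reference.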

Signed-off-by: Double Young <yang5.yang@intel.com>
Comment thread csrc/xpu/ops.h
@jikunshang jikunshang merged commit d30e6f2 into vllm-project:main Sep 12, 2025
3 checks passed
@dbyoung18 dbyoung18 deleted the dbyoung/deepseek_rope branch September 13, 2025 01:17
@dbyoung18 dbyoung18 restored the dbyoung/deepseek_rope branch September 13, 2025 01:18
3 participants