
[sgl-kernel][1/2] Fused qk_norm_rope for Qwen3-MoE #14036

Merged
BBuf merged 1 commit into sgl-project:main from antgroup:fused_qk_norm_rope_kernel on Nov 28, 2025



@yuan-luo
Collaborator

Motivation

This PR adds a CUDA kernel that fuses Q/K normalization with apply_rope_embedding.
See #13998 for more details.
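
For readers unfamiliar with the two steps being fused, here is a minimal pure-Python reference sketch of the unfused math: per-head RMSNorm on Q and K followed by rotary position embedding. It is not the kernel from this PR. The function names, the `eps` and `base` defaults, and the rotate-half (NeoX-style) layout are all assumptions made for illustration.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """RMS-normalize one head vector and scale it elementwise by `weight`."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def rope_neox(x, position, base=10000.0):
    """Apply rotate-half (NeoX-style) RoPE to one head vector.

    Assumed layout: element i is paired with element i + head_dim // 2.
    """
    head_dim = len(x)
    half = head_dim // 2
    out = [0.0] * head_dim
    for i in range(half):
        theta = position * base ** (-2.0 * i / head_dim)
        c, s = math.cos(theta), math.sin(theta)
        out[i] = x[i] * c - x[i + half] * s
        out[i + half] = x[i + half] * c + x[i] * s
    return out


def qk_norm_rope_reference(heads, weight, position, eps=1e-6, base=10000.0):
    """Hypothetical unfused reference: norm each head, then rotate it.

    `heads` is a list of head vectors (Q heads or K heads) for one token.
    """
    return [rope_neox(rms_norm(h, weight, eps), position, base) for h in heads]


if __name__ == "__main__":
    q_heads = [[1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 0.0]]
    q_weight = [1.0, 1.0, 1.0, 1.0]
    print(qk_norm_rope_reference(q_heads, q_weight, position=3))
```

A fused kernel would compute the same result in one pass over each head, avoiding a round trip through memory between the norm and the rotation.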

➜  sglang_dev git:(fused_qk_norm_rope) ✗ python -m pytest ./sgl-kernel/tests/test_fused_qk_norm_rope.py
=============================================================================================================== test session starts ===============================================================================================================
platform linux -- Python 3.12.12, pytest-8.4.2, pluggy-1.6.0
rootdir: /sgl-workspace/workspace/sglang_dev/sgl-kernel
configfile: pyproject.toml
plugins: anyio-4.11.0, typeguard-4.4.4
collected 80 items

sgl-kernel/tests/test_fused_qk_norm_rope.py ................................................................................                                                                                                                [100%]

================================================================================================================ warnings summary =================================================================================================================
<frozen importlib._bootstrap>:488
  <frozen importlib._bootstrap>:488: DeprecationWarning: builtin type SwigPyPacked has no __module__ attribute

<frozen importlib._bootstrap>:488
  <frozen importlib._bootstrap>:488: DeprecationWarning: builtin type SwigPyObject has no __module__ attribute

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
========================================================================================================= 80 passed, 2 warnings in 11.55s =========================================================================================================
sys:1: DeprecationWarning: builtin type swigvarlink has no __module__ attribute

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@Kangyan-Zhou
Collaborator

/tag-and-rerun-ci

@yuan-luo changed the title from "[sgl_kernel][2/2] Fused qk_norm_rope for Qwen3-MoE" to "[sgl-kernel][2/2] Fused qk_norm_rope for Qwen3-MoE" on Nov 27, 2025
@yuan-luo force-pushed the fused_qk_norm_rope_kernel branch from 541a8a6 to 93bd808 on November 27, 2025 15:56
@yuan-luo changed the title from "[sgl-kernel][2/2] Fused qk_norm_rope for Qwen3-MoE" to "[sgl-kernel][1/2] Fused qk_norm_rope for Qwen3-MoE" on Nov 27, 2025
Collaborator

@BBuf left a comment


LGTM

@BBuf merged commit e12c78a into sgl-project:main on Nov 28, 2025
201 of 210 checks passed
harvenstar pushed a commit to harvenstar/sglang that referenced this pull request Dec 4, 2025
Co-authored-by: luoyuan.luo <luoyuan.luo@antgroup.com>
