
modify split_qkv_rmsnorm_rope #282

Merged
iforgetmyname merged 1 commit into sgl-project:main from Liwansi:main_1226 on Dec 26, 2025

Conversation

@Liwansi (Contributor) commented Dec 26, 2025

Make the normalization optional to support Llama models.
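For readers unfamiliar with the operator, here is a hypothetical pure-Python sketch of the semantics this change enables: the fused QKV projection output is split into Q/K/V, and the per-head RMSNorm on Q and K is applied only when norm weights are supplied (Qwen3-style attention) and skipped when they are `None` (Llama-style attention). The names, signature, and the omission of the RoPE step are illustrative only; this is not the actual NPU kernel API.

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    # RMSNorm over the last dimension: x / rms(x) * weight
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * weight

def split_qkv_rmsnorm_rope_ref(qkv, head_dim, num_q_heads, num_kv_heads,
                               q_norm_weight=None, k_norm_weight=None):
    """Reference semantics (sketch): split fused QKV, then optionally
    RMS-normalize Q and K per head. RoPE application is omitted here
    for brevity.

    Passing q_norm_weight / k_norm_weight as None skips normalization,
    which is what Llama-style attention needs; Qwen3-style attention
    passes per-head-dim weights instead.
    """
    q_size = num_q_heads * head_dim
    kv_size = num_kv_heads * head_dim
    q, k, v = np.split(qkv, [q_size, q_size + kv_size], axis=-1)
    q = q.reshape(*q.shape[:-1], num_q_heads, head_dim)
    k = k.reshape(*k.shape[:-1], num_kv_heads, head_dim)
    if q_norm_weight is not None:
        q = rms_norm(q, q_norm_weight)
    if k_norm_weight is not None:
        k = rms_norm(k, k_norm_weight)
    return q, k, v
```

The design point of the PR is visible in the two `if` guards: the same fused operator serves both model families, with the caller deciding per model whether the normalization stage runs.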


@iforgetmyname iforgetmyname merged commit c7fcd82 into sgl-project:main Dec 26, 2025
2 of 4 checks passed
oagniqgnat added a commit to oagniqgnat/sgl-kernel-npu that referenced this pull request Dec 27, 2025
* upstream/main:
  modify split_qkv_rmsnorm_rope (sgl-project#282)
  bump version to 2025.12.25 (sgl-project#281)
  l2 norm const parameter change (sgl-project#276)
  Fix the issue of HCCL buffer tiling verification failure during one round of testing. (sgl-project#280)
@ZhongsJie

This PR may affect the current Qwen3 model support (sgl-project/sglang#12078).
Could you help confirm whether compatibility with that change has been considered in our current implementation?
Alternatively, is there a pending SGLang PR that still needs to be merged? @Liwansi

@Liwansi (Contributor, Author) commented Dec 30, 2025

> This PR may affect the current Qwen3 model support. sgl-project/sglang#12078. Could you help confirm whether compatibility with that change has been considered in our current implementation? Alternatively, is there a pending SGLang PR that still needs to be merged? @Liwansi

Yes, I have considered this. The change introduced in this PR makes the normalization component of the split_qkv_rmsnorm_rope operator optional, thereby enabling support for Llama models. A relevant PR has already been submitted to SGLang and is awaiting merge.

@ZhongsJie

@Liwansi Great! Could you please share the related PR link/address?

zzx-study added a commit to zzx-study/sgl-kernel-npu that referenced this pull request Jan 9, 2026
…pu-old into bugfix

* 'a3_topk-1' of https://github.com/luanyundu/sgl-kernel-npu-old:
  fix dispatch_layout to support topk -1 feature
  optimize gdn gating and fused_qkvzba_split_reshape_cat (sgl-project#306)
  fix layout numTokensPerExpertTensor partial Initialization bug (sgl-project#303)
  Supplement A2 doc, software and hardware compatibility info (sgl-project#294)
  Added an environment variable to control whether to enable the Combine Ant Migration feature. (sgl-project#304)
  Support build with cann 8.5 (sgl-project#283)
  LoRA: Optimization LoRA kernels and refactoring (sgl-project#284)
  fix a2 single combine aclnn params
  Resolving the UB out-of-bounds issue caused by A2 dual-machine mixed operation (sgl-project#288)
  fix notify magic auto-increment bug (sgl-project#291)
  split_qkv_rmsnorm_rope bugfix (sgl-project#290)
  Optimize prepare_lens by removing device transfer (sgl-project#289)
  Fix the performance degradation issue of the single-wheel operation in Ant Moving. (sgl-project#287)
  modify split_qkv_rmsnorm_rope (sgl-project#282)
AndyKong2020 pushed a commit to AndyKong2020/sgl-kernel-npu that referenced this pull request Mar 24, 2026
