+rotemb, +rmsnorm, reshape->opset-25, transpose->opset-24 #27752

Merged
guschmue merged 3 commits into main from gs/wgpu-rms-rot-reshape on Apr 6, 2026

Contributor

@guschmue guschmue commented Mar 18, 2026

For the WebGPU EP:

  • add the ONNX RotaryEmbedding op
  • add the ONNX RMSNormalization op
  • extend Reshape to opset 25
  • extend Transpose to opset 24
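
For readers unfamiliar with the two new ops, here is a rough scalar reference of what they compute, assuming the ONNX operator definitions of RMSNormalization and RotaryEmbedding. It is a pure-Python sketch for a single vector at a single position, not the WebGPU shader code from this PR; the function names `rms_norm` and `rotary_embed` and their parameter layout are illustrative assumptions.

```python
import math


def rms_norm(x, scale, epsilon=1e-5):
    """RMS-normalize one vector over its last axis, then apply a per-element scale."""
    mean_square = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_square + epsilon)
    return [v * inv_rms * s for v, s in zip(x, scale)]


def rotary_embed(x, cos, sin, interleaved=False):
    """Rotate pairs of elements of one head vector by per-pair angles.

    Assumes len(x) is even; cos and sin each hold len(x) // 2 values
    taken from the caches at the token's position.
    """
    half = len(x) // 2
    if interleaved:
        x1, x2 = x[0::2], x[1::2]    # pairs are (x[0], x[1]), (x[2], x[3]), ...
    else:
        x1, x2 = x[:half], x[half:]  # pairs are (x[i], x[i + half])
    real = [c * a - s * b for a, b, c, s in zip(x1, x2, cos, sin)]
    imag = [s * a + c * b for a, b, c, s in zip(x1, x2, cos, sin)]
    if interleaved:
        out = [0.0] * len(x)
        out[0::2], out[1::2] = real, imag
        return out
    return real + imag
```

With `cos = [0.0]` and `sin = [1.0]` (a 90-degree rotation), `rotary_embed([1.0, 0.0], cos, sin)` maps the pair `(1, 0)` to `(0, 1)`. The rotation leaves the vector's length unchanged, which makes a handy sanity check when comparing against a GPU kernel.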

@guschmue guschmue added the ep:WebGPU ort-web webgpu provider label Mar 18, 2026
@guschmue guschmue marked this pull request as ready for review March 19, 2026 01:00
fs-eire previously approved these changes Mar 19, 2026
Comment thread onnxruntime/core/providers/webgpu/llm/rotary_embedding.cc
Comment thread onnxruntime/core/providers/webgpu/llm/rotary_embedding.cc
@guschmue guschmue merged commit 410f5a8 into main Apr 6, 2026
104 of 108 checks passed
@guschmue guschmue deleted the gs/wgpu-rms-rot-reshape branch April 6, 2026 19:48
sanaa-hamel-microsoft pushed a commit that referenced this pull request Apr 21, 2026
for webgpu ep:
+ onnx rotary-embedding op
+ onnx rmsnorm
+ reshape-> opset-25
+ transpose -> opset-24
sanaa-hamel-microsoft added a commit that referenced this pull request Apr 24, 2026
Version bump to 1.25.1.

This cherry-picks the following commits for the release:

| Commit ID | PR Number | Commit Title |
|-----------|-----------|-------------|
| e532c21 | #27842 | linear attention signature |
| 410f5a8 | #27752 | +rotemb, +rmsnorm, reshape->opset-25, transpose->opset-24 |
| 0fedb26 | #27907 | Add LinearAttention and CausalConvState ops for Qwen3.5 |
| 3ac6040 | #27996 | webgpu support for qwen3.5 |
| c36c422 | #27998 | [WebGPU EP] Fuse QMoE 1-token decode path to reduce GPU dispatches |
| 94f32ec | #27289 | [CORE]: Improve filesystem error messages during Linux device discovery |
| dce77a3 | #28118 | Fix lack of auth on python packaging |

---------

Co-authored-by: Akshay Sonawane <111780983+apsonawane@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: eserscor <erscor@microsoft.com>
Co-authored-by: Sanaa Hamel <sanaahamel@microsoft.com>
Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com>
Co-authored-by: Stephan Seitz <sseitz@nvidia.com>
Co-authored-by: Jiajia Qin <jiajiaqin@microsoft.com>
@almassolarenrgi almassolarenrgi left a comment

Mine

Labels

ep:WebGPU ort-web webgpu provider release:1.25.1

3 participants