Fix WebGPU MoE swiglu_limit (default to infinity) #27221

Merged
hanbitmyths merged 1 commit into microsoft:main from xenova:moe-swiglu_limit-fix
Feb 3, 2026
Conversation

xenova commented Feb 1, 2026

Description

According to https://github.com/microsoft/onnxruntime/blob/main/docs/ContribOperators.md#commicrosoftmoe,

swiglu_limit : float
The limit used to clamp in SwiGLU. No clamp when limit is not provided.

However, currently, the default is set to 0, meaning we clamp to 0 if no limit is provided.
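To make the effect of the wrong default concrete, here is a minimal scalar sketch of a clamped SwiGLU. It assumes a GPT-OSS-style formulation (gate clamped from above, linear branch clamped symmetrically, `alpha = 1.702`, `beta = 1.0`). The function name `clamped_swiglu` and its parameters are illustrative only, not the WebGPU kernel's actual code.

```python
import math


def clamped_swiglu(gate, linear, limit=math.inf, alpha=1.702, beta=1.0):
    """Scalar sketch of a clamped SwiGLU (assumed formulation, not ORT's kernel).

    With limit=math.inf both clamps are no-ops, which matches the documented
    "no clamp when limit is not provided" behaviour.
    """
    gate = min(gate, limit)                    # gate is clamped from above only
    linear = max(-limit, min(linear, limit))   # linear is clamped to [-limit, limit]
    sigmoid = 1.0 / (1.0 + math.exp(-alpha * gate))
    return gate * sigmoid * (linear + beta)


# Default of 0 (the bug): the gate can never be positive and the linear
# branch collapses to 0, so the linear input no longer affects the result.
buggy = clamped_swiglu(2.0, 3.0, limit=0.0)

# Default of infinity (the fix): nothing is clamped.
fixed = clamped_swiglu(2.0, 3.0)
```

Under these assumptions, a limit of 0 does not just restrict the range. It removes the linear branch's contribution entirely, which is consistent with the invalid results reported in the linked issue.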

Motivation and Context

Fixes #27220. See there for bug description and reproduction.

Hoping to get this in before 1.24.0 releases. cc @guschmue

tianleiwu commented

/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline

azure-pipelines commented

Azure Pipelines successfully started running 4 pipeline(s).

guschmue enabled auto-merge (squash) February 2, 2026 17:24

guschmue commented Feb 2, 2026

nice catch!

hanbitmyths disabled auto-merge February 3, 2026 19:25
hanbitmyths merged commit 4abba28 into microsoft:main Feb 3, 2026
80 of 97 checks passed
tianleiwu pushed a commit that referenced this pull request Feb 3, 2026

ekzhang commented Feb 3, 2026

Yes this is really nice haha :D

tianleiwu added a commit that referenced this pull request Feb 3, 2026
Cherry-pick round 1 for 1.24.1 release:

#27157: [Fix: Replace pkg_resources with importlib.metadata in machine_info.py](40469f0)
#27124: [Remove x86 from nuget (#27124)](5c98f5c)
#26390: [[MLAS] Fix rotary interleaved NEON kernel](536c6c9)
#27215: [Fix Conv LHS packing padding/uninitialized ptrs V2](62a3890)
#27221: [Fix WebGPU MoE swiglu_limit (default to infinity)](98b6ce9)
#26994: [Fix for https://github.com/microsoft/onnxruntime/issues/25145](https://github.com/microsoft/onnxruntime/commit/bce7b4faca24ae2ae279ab8fa2de637a46e7f45b)

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Hariharan Seshadri <shariharan91@gmail.com>
Co-authored-by: Joshua Lochner <admin@xenova.com>
Co-authored-by: umangb-09 <umangb@nvidia.com>
Development

Successfully merging this pull request may close these issues.

[WebGPU] com.microsoft.QMoE produces invalid results for certain attribute combinations