Fix SkipLayerNorm fusion incorrectly applied when gamma/beta are not 1D#27459
Merged
Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix bug related to issue 27455 in onnxruntime" to "Fix SkipLayerNorm fusion incorrectly applied when gamma/beta are not 1D" on Feb 25, 2026
xadupre approved these changes on Feb 26, 2026
tianleiwu added a commit that referenced this pull request on Feb 27, 2026
…1D (#27459)

### Description

The `SkipLayerNormFusion` optimizer skips fusion when the `LayerNormalization` gamma or beta inputs are not 1D tensors (e.g. shape `[1, 1, hidden_size]`). The `SkipLayerNormalization` kernel strictly requires 1D gamma/beta, so fusing without this check caused a hard runtime error.

- **`skip_layer_norm_fusion.cc`**: After matching the Add+LayerNorm pattern, check that gamma (and beta if present) have exactly 1 dimension before proceeding with fusion. If shape info is unavailable (dynamic), fusion is allowed and runtime validation takes over.
- **`graph_transform_test_layernorm.cc`**: Added the `SkipLayerNormFusion_3DGamma_NoFusion` test, which builds a graph with `Add + LayerNormalization` where gamma/beta are `[1, 1, 4]` and asserts that no `SkipLayerNormalization` node is created.

### Motivation and Context

Models with residual connections followed by `LayerNormalization` where the scale/bias tensors carry extra batch/sequence dimensions (e.g. exported as `[1, 1, hidden_size]` rather than `[hidden_size]`) would trigger fusion and then fail at runtime:

```
Non-zero status code returned while running SkipLayerNormalization node. Status Message: gamma is expected to have 1 dimension, got 3
```

The error only appeared with 3D inputs and disappeared at `ORT_ENABLE_BASIC` optimization level (which disables the fusion), confirming the optimizer as the source of the regression.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>
tianleiwu added a commit that referenced this pull request on Feb 27, 2026
This cherry-picks the following commits for the release:

| Commit ID | PR Number | Commit Title |
|-----------|-----------|-------------|
| decd177 | #27090 | Fix GatherND division by zero when batch dimensions mismatch |
| 55f8234 | #27360 | Fix QMoE CPU Operator |
| df9146f | #27403 | [MLAS] Adding DynamicQGemm function pointers and ukernel interface |
| 0f93853 | #27318 | [js/web] Use embedded WASM module in Blob URL workers when wasmBinary is provided |
| b2a6e69 | #27364 | QMoE CPU Performance Update (Up to 4x on 4-bit) |
| f501e1d | #27413 | Fix refcount bug in map input conversion that caused shutdown segfault |
| b32b205 | #27421 | Fix error where bytes is not assigned for dynamic qgemm pack b size |
| 426b006 | #27397 | Fix DllImportResolver |
| 0982844 | #27412 | MatmulNBits prepacking scales fix |
| 9afb0d2 | #27430 | Fix validation for external data paths for models loaded from bytes |
| 71d2cd0 | #27401 | Enable Python 3.14 CI and Upgrade Dependencies |
| 79e0676 | #27419 | fix: out of bounds access for resize operation |
| 82eb99c | #27459 | Fix SkipLayerNorm fusion incorrectly applied when gamma/beta are not 1D |
| 355278a | #27444 | Fix GatherCopyData Integer Truncation Leading to Heap Out-of-Bounds Read/Write |
| cf96123 | #27411 | [web] fix usage of wasmBinary together with a blob URL for .mjs |
| 1131a86 | #27399 | [web] remove the unhelpful "Unknown CPU vendor" warning. |
| ffbbc4f | #27316 | Build Windows ARM64X binaries as part of packaging pipeline |

---------

Signed-off-by: Jonathan Clohessy <Jonathan.Clohessy@arm.com>
Co-authored-by: patryk-kaiser-ARM <patryk.kaiser@arm.com>
Co-authored-by: don <70039285+0-don@users.noreply.github.com>
Co-authored-by: Jonathan Clohessy <jonathan.clohessy@arm.com>
Co-authored-by: Hariharan Seshadri <shariharan91@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com>
Co-authored-by: Lukas Folle <126877803+lukas-folle-snkeos@users.noreply.github.com>
Co-authored-by: Chi Lo <54722500+chilo-ms@users.noreply.github.com>
Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
Co-authored-by: Chaya <cha182350@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Erik <erscor@microsoft.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
This was referenced Mar 9, 2026
This was referenced Mar 16, 2026
deps(nuget): Bump the microsoft-packages group with 8 updates
Ellerbach/azure-ai-search-simulator#76
Closed
This was referenced Mar 23, 2026
deps(nuget): Bump the microsoft-packages group with 8 updates
Ellerbach/azure-ai-search-simulator#80
Closed
This was referenced Mar 30, 2026
### Description

The `SkipLayerNormFusion` optimizer now skips fusion when the `LayerNormalization` gamma or beta inputs are not 1D tensors (e.g. shape `[1, 1, hidden_size]`). The `SkipLayerNormalization` kernel strictly requires 1D gamma/beta, so fusing without this check caused a hard runtime error.

- **`skip_layer_norm_fusion.cc`**: After matching the Add+LayerNorm pattern, check that gamma (and beta if present) have exactly 1 dimension before proceeding with fusion. If shape info is unavailable (dynamic), fusion is allowed and runtime validation takes over.
- **`graph_transform_test_layernorm.cc`**: Added the `SkipLayerNormFusion_3DGamma_NoFusion` test, which builds a graph with `Add + LayerNormalization` where gamma/beta are `[1, 1, 4]` and asserts that no `SkipLayerNormalization` node is created.

### Motivation and Context

Models with residual connections followed by `LayerNormalization` where the scale/bias tensors carry extra batch/sequence dimensions (e.g. exported as `[1, 1, hidden_size]` rather than `[hidden_size]`) would trigger fusion and then fail at runtime while running the `SkipLayerNormalization` node, with the status message `gamma is expected to have 1 dimension, got 3`.

The error only appeared with 3D inputs and disappeared at the `ORT_ENABLE_BASIC` optimization level (which disables the fusion), confirming the optimizer as the source of the regression.
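For readers who want the gist of the guard without opening the diff, here is a minimal sketch of the rule described above. It is not the code in `skip_layer_norm_fusion.cc` and does not use any onnxruntime types: `Shape`, `IsOneDimOrUnknown`, and `CanFuseSkipLayerNorm` are hypothetical names, and an empty `std::optional` is assumed to stand in for "shape info unavailable".

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for the static shape of a graph input.
// std::nullopt models "shape info unavailable (dynamic)".
using Shape = std::optional<std::vector<int64_t>>;

// True when the shape is known to have exactly 1 dimension, or is unknown.
// Unknown shapes pass so that runtime validation can take over.
bool IsOneDimOrUnknown(const Shape& shape) {
  return !shape.has_value() || shape->size() == 1;
}

// gamma is always present; beta is optional, so nullptr models "no beta input".
// Returns false as soon as a known shape is not 1D, which blocks the fusion.
bool CanFuseSkipLayerNorm(const Shape& gamma, const Shape* beta) {
  if (!IsOneDimOrUnknown(gamma)) return false;
  if (beta != nullptr && !IsOneDimOrUnknown(*beta)) return false;
  return true;
}
```

Under these assumptions, a gamma of `{4}` permits fusion, a gamma or beta of `{1, 1, 4}` blocks it, and unknown shapes permit it, which matches the three cases the description calls out.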