Add SkipLayerNorm fusion with bias Add #27765

Merged
kunal-vaishnavi merged 21 commits into main from copilot/add-skiplayernorm-addbias on Mar 20, 2026


Conversation

@kunal-vaishnavi
Contributor

Description

This pull request introduces a new graph optimization pass to fuse Add + SkipLayerNormalization subgraphs into a single SkipLayerNormalization node that incorporates a bias input. This helps simplify the computation graph, especially for models using bias after MatMul, and extends support for more execution providers. The main changes include the implementation of the new fusion, its integration into the optimizer pipeline, and updates to provider compatibility.

New Bias + SkipLayerNormalization Fusion:

  • Added a new BiasSkipLayerNormFusion class and implementation to detect and fuse subgraphs where a 1D bias is added to a MatMul (optionally through a Cast) before SkipLayerNormalization, replacing them with a single node that absorbs the bias as a fifth input.
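Numerically, the fusion is safe because SkipLayerNormalization with a bias input computes LayerNorm(input + skip + bias), so absorbing the upstream Add leaves results unchanged up to floating-point rounding. A minimal pure-Python sketch of that equivalence (illustrative only; the helper names below are not ORT's implementation):

```python
import math

def layer_norm(x, gamma, beta, eps=1e-5):
    """Plain layer normalization over the hidden dimension."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [g * (v - mean) / math.sqrt(var + eps) + b
            for v, g, b in zip(x, gamma, beta)]

def skip_layer_norm(inp, skip, gamma, beta, bias=None):
    """SkipLayerNormalization semantics: LayerNorm(input + skip [+ bias])."""
    s = [i + k for i, k in zip(inp, skip)]
    if bias is not None:
        s = [v + b for v, b in zip(s, bias)]
    return layer_norm(s, gamma, beta)

x, skip = [1.0, 2.0, 3.0, 4.0], [0.5, -0.5, 0.25, -0.25]
bias, gamma, beta = [0.1, 0.2, 0.3, 0.4], [1.0] * 4, [0.0] * 4

# Unfused: explicit Add of the bias before a 4-input SkipLayerNorm.
unfused = skip_layer_norm([a + b for a, b in zip(x, bias)], skip, gamma, beta)

# Fused: bias passed as the 5th input of SkipLayerNorm.
fused = skip_layer_norm(x, skip, gamma, beta, bias=bias)

assert all(abs(u - f) < 1e-9 for u, f in zip(unfused, fused))
```

The two paths differ only in the order of the elementwise additions, so the fused graph produces the same normalized output while executing one node fewer.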

Integration into Optimization Pipeline:

  • Registered the new BiasSkipLayerNormFusion in the graph transformer utility, ensuring it runs after the standard SkipLayerNorm fusion and covers more execution providers (CPU, ACL, CUDA, DML, JS, WebGPU).

Test and Include Updates:

  • Updated test and implementation files to include the new fusion header where relevant.

Motivation and Context

These changes collectively improve model optimization by reducing node count and improving runtime efficiency for supported providers.

This PR also helps perform this fusion on many models inside the Foundry Local catalog without needing to re-deploy models.

Copilot AI and others added 4 commits March 19, 2026 07:38
Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
…dering bug

Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
Contributor

Copilot AI left a comment


Pull request overview

This PR adds a new Level2 graph optimization pass that fuses an upstream bias Add into an existing com.microsoft::SkipLayerNormalization node by converting it from 4 inputs to 5 inputs (adding the bias input), and wires it into the standard optimizer pipeline for multiple execution providers.

Changes:

  • Introduces BiasSkipLayerNormFusion transformer to rewrite Add(MatMul[, Cast], bias_1D) -> SkipLayerNormalization into a single 5-input SkipLayerNormalization.
  • Registers the new fusion (and broadens SkipLayerNormFusion EP compatibility) in GenerateTransformers.
  • Adds unit tests covering positive and negative fusion cases for the new transformer.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File descriptions:

  • onnxruntime/test/optimizer/graph_transform_test_layernorm.cc: Adds unit tests validating the new bias + SkipLayerNorm fusion behavior.
  • onnxruntime/core/optimizer/graph_transformer_utils.cc: Registers BiasSkipLayerNormFusion in the Level2 transformer pipeline and extends compatible EP sets.
  • onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.h: Declares the new BiasSkipLayerNormFusion transformer.
  • onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc: Implements pattern matching and graph rewrite to absorb a constant 1D bias Add into SkipLayerNormalization.


Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc Outdated
Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@kunal-vaishnavi requested a review from Copilot on March 19, 2026 17:08
Contributor

Copilot AI left a comment


Pull request overview

This PR adds a new graph optimization pass that fuses an upstream bias Add into an existing com.microsoft.SkipLayerNormalization node (moving from 4 inputs to 5 inputs), and wires it into the Level2 transformer pipeline (including more execution providers), with accompanying unit tests.

Changes:

  • Introduces BiasSkipLayerNormFusion transformer to absorb Add(+1D bias) (optionally via Cast) into SkipLayerNormalization.
  • Registers the new transformer in the Level2 optimization pipeline and expands SLN fusion EP compatibility to include JS/WebGPU.
  • Adds targeted unit tests covering positive and negative fusion cases.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File descriptions:

  • onnxruntime/test/optimizer/graph_transform_test_layernorm.cc: Adds unit tests validating Bias+SLN fusion behavior (and non-fusion conditions).
  • onnxruntime/core/optimizer/graph_transformer_utils.cc: Registers the new transformer and updates compatible EP sets for SLN-related fusions.
  • onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.h: Declares the BiasSkipLayerNormFusion graph transformer.
  • onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc: Implements the Add(+bias) → SLN(5 inputs) fusion logic (with optional Cast handling).


Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc Outdated
Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc Outdated
Comment thread onnxruntime/test/optimizer/graph_transform_test_layernorm.cc
Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.h Outdated
kunal-vaishnavi and others added 3 commits March 19, 2026 10:44
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@kunal-vaishnavi requested a review from Copilot on March 19, 2026 17:47
tianleiwu previously approved these changes Mar 19, 2026
Contributor

@tianleiwu left a comment


LGTM

Contributor

Copilot AI left a comment


Pull request overview

Adds a new optimizer pass to fuse Add(+1D bias) + com.microsoft::SkipLayerNormalization(4 inputs) into a single SkipLayerNormalization node with a 5th bias input, and wires it into the Level2 transformer pipeline for additional EPs.

Changes:

  • Introduces BiasSkipLayerNormFusion graph transformer to absorb an upstream bias Add (optionally via Cast) into SkipLayerNormalization.
  • Registers the new fusion pass (and broadens EP allowlist for SkipLayerNorm fusion) in the transformer generation pipeline.
  • Adds unit tests covering positive and negative fusion cases.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File descriptions:

  • onnxruntime/test/optimizer/graph_transform_test_layernorm.cc: Adds unit tests validating BiasSkipLayerNormFusion behavior (fuse + no-fuse scenarios).
  • onnxruntime/core/optimizer/graph_transformer_utils.cc: Registers the new fusion pass and expands the compatible EP list for SkipLayerNorm-related fusions.
  • onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.h: Declares the new BiasSkipLayerNormFusion transformer and documents the before/after pattern.
  • onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc: Implements pattern matching and node rewrite to create the 5-input SkipLayerNormalization and remove obsolete nodes.


Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc
Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc Outdated
Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc Outdated
Comment thread onnxruntime/test/optimizer/graph_transform_test_layernorm.cc
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment


Pull request overview

Adds a new graph optimization to fuse an upstream bias Add (optionally via Cast from MatMul) into an existing 4-input com.microsoft.SkipLayerNormalization, producing a single 5-input SkipLayerNormalization with bias.

Changes:

  • Introduces BiasSkipLayerNormFusion transformer to absorb 1D constant bias adds into SkipLayerNormalization.
  • Registers the new fusion in the Level2 transformer pipeline and expands compatible EP list for SkipLayerNorm fusions (CPU/ACL/CUDA/DML/JS/WebGPU).
  • Adds targeted optimizer tests covering positive and negative fusion scenarios (including Cast and mismatch cases).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File descriptions:

  • onnxruntime/test/optimizer/graph_transform_test_layernorm.cc: Adds unit tests validating bias+SLN fusion behavior and non-fusion guards.
  • onnxruntime/core/optimizer/graph_transformer_utils.cc: Registers the new transformer and updates compatible EP sets.
  • onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.h: Declares the new BiasSkipLayerNormFusion transformer.
  • onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc: Implements pattern matching and graph rewrite to add the bias as the 5th SLN input.


Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc
Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc Outdated
Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment


Pull request overview

Adds a new optimizer pass to fuse an upstream bias Add into an existing 4-input com.microsoft.SkipLayerNormalization, producing a 5-input SLN that directly consumes the bias (including a MatMul→Cast→Add variant), and wires it into the Level2 transformer pipeline with accompanying unit tests.

Changes:

  • Introduced BiasSkipLayerNormFusion graph transformer to absorb 1D constant bias adds into SkipLayerNormalization.
  • Registered the new fusion (and expanded SLN fusion EP allowlist) in the transformer generation pipeline.
  • Added multiple structural tests covering positive and negative fusion scenarios, including Cast and shape-mismatch cases.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File descriptions:

  • onnxruntime/test/optimizer/graph_transform_test_layernorm.cc: Adds unit tests validating Bias+SLN fusion behavior and non-fusion guardrails.
  • onnxruntime/core/optimizer/graph_transformer_utils.cc: Registers the new fusion pass and updates the EP compatibility set used for SLN fusions.
  • onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.h: Declares the new BiasSkipLayerNormFusion transformer.
  • onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc: Implements the fusion logic for Add(+optional Cast)+MatMul feeding SLN.


Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc Outdated
Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc Outdated
Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc Outdated
Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc
Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@kunal-vaishnavi requested a review from Copilot on March 19, 2026 18:50
…etween Path 1 and Path 2

Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
Contributor

Copilot AI left a comment


Pull request overview

This PR adds a new optimizer pass to fuse Add(+1D bias) into an existing 4-input com.microsoft::SkipLayerNormalization, producing a single 5-input SkipLayerNormalization node (with bias as the 5th input), and wires the pass into the Level2 transformer pipeline for additional EPs.

Changes:

  • Added BiasSkipLayerNormFusion graph transformer implementation (supports MatMul→Add or MatMul→Cast→Add patterns).
  • Registered the new fusion (and expanded SkipLayerNormFusion EP set) in GenerateTransformers.
  • Added unit tests covering positive/negative fusion scenarios.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File descriptions:

  • onnxruntime/test/optimizer/graph_transform_test_layernorm.cc: Adds unit tests for BiasSkipLayerNormFusion patterns and non-fusion cases.
  • onnxruntime/core/optimizer/graph_transformer_utils.cc: Registers BiasSkipLayerNormFusion in the Level2 pipeline and expands the compatible EP set.
  • onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.h: Declares the new fusion transformer.
  • onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc: Implements the Add(+bias) → SkipLayerNormalization(5 inputs) rewrite.


Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc
Comment thread onnxruntime/test/optimizer/graph_transform_test_layernorm.cc
Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc
Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc
…s, add downstream-consumer test

Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.



Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc Outdated
Comment thread onnxruntime/test/optimizer/graph_transform_test_layernorm.cc Outdated
Copilot AI and others added 2 commits March 19, 2026 21:18
…ction error

Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
…27: fix test comment

Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.



Comment thread onnxruntime/test/optimizer/graph_transform_test_layernorm.cc
Comment thread onnxruntime/test/optimizer/graph_transform_test_layernorm.cc
Comment thread onnxruntime/test/optimizer/graph_transform_test_layernorm.cc
Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc
Comment thread onnxruntime/test/optimizer/graph_transform_test_layernorm.cc
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.



… bias{1} instead of invalid bias{3}

Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
Contributor

@tianleiwu left a comment


Review

1. Hidden-Size Validation (onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc)

Positive:

  • The fusion keeps input[0]/input[1] stable and snapshots both outputs and outgoing edges before removal, which is exactly the right shape-preserving pattern for SkipLayerNormalization.

Concern:

  • ⚠️ Can fuse an invalid bias when gamma/beta shapes are unknown: the new guard only compares bias_hidden_size against the hidden size derived from gamma/beta. If those inputs are graph inputs without static 1-D shape info, get_sln_hidden_size() returns -1, so a broadcast-valid bias{1} will still be absorbed. That is a behavior change: Add(x, bias{1}) is legal, but the fused 5-input SkipLayerNormalization requires bias.shape[0] == hidden_size and will later fail CheckBias() at runtime. In other words, the optimizer can turn a valid unfused graph into an invalid fused graph.
    int64_t hidden_size = get_sln_hidden_size(sln_node);
    if (hidden_size == -1) {
      hidden_size = GetLastDimValue(*candidate_add->MutableInputDefs()[add_matmul_input_idx]);
    }
    
    // If we still can't prove the bias matches the hidden size, bail out.
    if (hidden_size == -1 || bias_hidden_size == -1 || hidden_size != bias_hidden_size) {
      return false;
    }
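The failure mode behind this guard can be reproduced in a few lines of plain Python (names hypothetical; this mirrors the operator semantics, not ORT's code): ONNX Add happily broadcasts a length-1 bias across the hidden dimension, while the fused kernel's bias check requires the bias length to equal the hidden size, so the fused graph rejects an input the unfused graph accepts.

```python
HIDDEN_SIZE = 4

def add_with_broadcast(x, bias):
    """ONNX Add semantics: a length-1 bias broadcasts across the hidden dim."""
    if len(bias) == 1:
        bias = bias * len(x)
    assert len(bias) == len(x)
    return [a + b for a, b in zip(x, bias)]

def fused_bias_is_valid(bias):
    """Mimics the fused kernel's bias validation: length must equal hidden size."""
    return len(bias) == HIDDEN_SIZE

x = [1.0, 2.0, 3.0, 4.0]
broadcast_bias = [0.5]

add_with_broadcast(x, broadcast_bias)          # legal in the unfused graph
assert not fused_bias_is_valid(broadcast_bias) # rejected by the fused 5-input SLN
assert fused_bias_is_valid([0.1, 0.2, 0.3, 0.4])
```

This is why the guard above must bail out whenever the hidden size cannot be proven equal to the bias length, rather than fusing optimistically.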

2. Regression Coverage (onnxruntime/test/optimizer/graph_transform_test_layernorm.cc, onnxruntime/core/optimizer/graph_transformer_utils.cc)

Positive:

  • The new tests cover the main wiring variants well: bias on either Add input, SkipLayerNormalization input 0 vs 1, the fp16 Cast path, multi-consumer rejection, and downstream edge rewiring.

Concern:

  • ⚠️ The missing negative case is exactly the one that breaks today: every new test uses statically-shaped initializer gamma/beta, so the hidden-size mismatch guard is always informed by compile-time metadata. There is no negative test where gamma/beta are dynamic inputs or otherwise lack shape info, and there is also no provider-tagged coverage for the new JS/WebGPU allowlist. A focused negative test for bias{1} with unknown gamma/beta shapes would have caught the issue above immediately.

Summary of Concerns

  1. [High] bias_skip_layer_norm_fusion.cc: Fusion can absorb a broadcast bias that 5-input SkipLayerNormalization cannot legally accept when gamma/beta shapes are unknown.
  2. [Suggestion] Tests / transformer registration: Missing dynamic-shape and provider-tagged coverage leaves the new guard and JS/WebGPU rollout under-tested.

Verdict

REQUEST CHANGES — the hidden-size validation needs one more safety check before this fusion is safe on partially-shaped graphs.

…rNormFusion

Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
tianleiwu previously approved these changes Mar 19, 2026
@kunal-vaishnavi enabled auto-merge (squash) on March 19, 2026 23:52
Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc Fixed
Contributor

@github-actions bot left a comment


You can commit the suggested changes from lintrunner.

Comment thread onnxruntime/core/optimizer/bias_skip_layer_norm_fusion.cc Outdated
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@kunal-vaishnavi merged commit 405dcd7 into main on Mar 20, 2026
91 checks passed
@kunal-vaishnavi deleted the copilot/add-skiplayernorm-addbias branch on March 20, 2026 05:48