fix(security): add SafeInt overflow protection in Expand and constant folding output size limit by tianleiwu · Pull Request #28055 · microsoft/onnxruntime

tianleiwu · 2026-04-14T00:07:24Z

Description

Harden the constant folding optimizer and the Expand CPU kernel against integer overflow attacks from crafted ONNX models.

Problem: The Expand::Compute() kernel performs cumulative dimension multiplications (input_count *= input_dim, output_count *= output_dim) using raw int64_t arithmetic. When triggered during constant folding at CreateSession() time via a crafted model with extreme shape values, signed integer overflow can produce corrupted values used for buffer offset calculations and memcpy lengths, creating a potential out-of-bounds write. The downstream SafeInt check in the allocator catches overflow only when the total byte count wraps, but carefully chosen dimensions can make the overflowed value appear valid.
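
To make the failure mode concrete, here is a minimal sketch (not the ORT code) of how a product of two individually valid dimensions can wrap to a plausible value. The helper names `WrappedProduct` and `ProductOverflows` are hypothetical. Unsigned arithmetic is used to emulate two's-complement wraparound, because signed overflow itself is undefined behavior in C++.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: emulates what a wrapped int64_t multiplication
// typically yields on two's-complement hardware, without invoking
// undefined behavior (the multiplication is done on uint64_t).
inline int64_t WrappedProduct(int64_t a, int64_t b) {
  return static_cast<int64_t>(static_cast<uint64_t>(a) * static_cast<uint64_t>(b));
}

// Hypothetical helper: true if a * b does not fit in int64_t.
// Assumes both inputs are non-negative, as tensor dimensions are.
inline bool ProductOverflows(int64_t a, int64_t b) {
  if (a == 0 || b == 0) return false;
  return a > std::numeric_limits<int64_t>::max() / b;
}
```

With dimensions 2^31 + 1 and 2^33, the true product is 2^64 + 2^33, and the wrapped value is 2^33: a perfectly ordinary-looking element count that a later byte-size check would have no reason to reject.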

Additionally, the constant folding optimizer has no output size budget — any deterministic node with constant inputs is eligible for constant folding regardless of output size, enabling memory exhaustion attacks at model load time.

Key Changes

1. SafeInt-protected arithmetic in expand.cc

Wraps all dimension accumulation and offset/length calculations with SafeInt<int64_t> or SafeInt<size_t> to catch overflow before it can corrupt buffer arithmetic:

Location	Before	After
Accumulator loop (L97-98)	`input_count *= input_dim`	`SafeInt<int64_t>(input_count) * input_dim`
Accumulator loop (L109)	`last_dim_size *= expand_dim_size[...]`	`SafeInt<int64_t>(last_dim_size) * ...`
copy_byte (L116)	`copy_len * sizeof(T)`	`SafeInt<size_t>(copy_len) * sizeof(T)`
input_offset (L122)	`i * copy_len`	`SafeInt<int64_t>(i) * copy_len`
output_offset (L126)	`output_offset += current_count * ...`	`SafeInt<int64_t>(output_offset) + SafeInt<int64_t>(current_count) * ...`
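
The pattern in the table can be sketched without the SafeInt header. `CheckedMul` and `ElementCount` below are hypothetical stand-ins that throw on overflow, which is the behavior the description relies on from `SafeInt<int64_t>`; this is an illustration of the idea, not the code in expand.cc.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a SafeInt<int64_t> multiplication:
// throws instead of silently wrapping.
inline int64_t CheckedMul(int64_t a, int64_t b) {
  if (a < 0 || b < 0) throw std::invalid_argument("negative dimension");
  if (a != 0 && b > std::numeric_limits<int64_t>::max() / a) {
    throw std::overflow_error("int64 overflow in dimension product");
  }
  return a * b;
}

// Accumulates an element count the way the Expand loops do,
// but with every multiplication checked.
inline int64_t ElementCount(const std::vector<int64_t>& dims) {
  int64_t count = 1;
  for (int64_t d : dims) count = CheckedMul(count, d);
  return count;
}
```

A shape such as [2^32, 2^32] then raises an exception at the accumulation step, before any offset or memcpy length is derived from the count.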

2. Constant folding output size limit in constant_folding.cc

Pre-execution check: EstimateNodeOutputSizeInBytes() uses shape inference results with SafeInt-protected arithmetic to estimate total output bytes. Nodes exceeding the limit are skipped.
Post-execution check: After kernel->Compute(), actual output SizeInBytes() is verified against the limit (catches cases where shape inference couldn't determine output size).
Exception isolation: kernel->Compute() is wrapped in try/catch so that SafeInt overflow exceptions from individual nodes skip the node rather than aborting the entire optimization pass.
Configurable limit: New session option optimization.constant_folding_max_output_size_in_bytes (default: 1 GB, "0" to disable).
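
The pre-execution decision can be sketched as follows. Everything here is hypothetical (`EstimateBytes`, `PreCheck`, `FoldDecision`), and two behaviors are assumptions rather than facts from the PR: a negative dimension stands for "shape unknown", and an overflowing estimate is treated as "too large". A limit of 0 disables the check, as the option description states.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Hypothetical estimator: std::nullopt when any dimension is unknown
// (modeled as a negative value); saturates to INT64_MAX on overflow.
// Assumes elem_size > 0.
inline std::optional<int64_t> EstimateBytes(const std::vector<int64_t>& dims,
                                            int64_t elem_size) {
  constexpr int64_t kMax = std::numeric_limits<int64_t>::max();
  int64_t total = elem_size;
  for (int64_t d : dims) {
    if (d < 0) return std::nullopt;
    if (d != 0 && total > kMax / d) return kMax;  // assumption: overflow means "too large"
    total *= d;
  }
  return total;
}

enum class FoldDecision { kFold, kSkipTooLarge, kCheckAfterCompute };

inline FoldDecision PreCheck(const std::vector<int64_t>& dims, int64_t elem_size,
                             int64_t limit_bytes) {
  if (limit_bytes == 0) return FoldDecision::kFold;  // "0" disables the limit
  const auto estimate = EstimateBytes(dims, elem_size);
  if (!estimate) return FoldDecision::kCheckAfterCompute;  // rely on the post-execution check
  return *estimate > limit_bytes ? FoldDecision::kSkipTooLarge : FoldDecision::kFold;
}
```

A 1024 x 1024 float output (4 MB) is skipped under a 1 MB limit and folded under an 8 MB limit, which matches the scenario the first test below describes.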

3. Session option

New key kOrtSessionOptionsConstantFoldingMaxOutputSizeInBytes in onnxruntime_session_options_config_keys.h.

Motivation and Context

This addresses a security vulnerability where a malicious ONNX model can cause signed integer overflow in the Expand kernel during constant folding at model load time (CreateSession()), potentially leading to out-of-bounds memory writes. The constant folding size limit provides defense-in-depth against memory exhaustion attacks from untrusted models.

Testing

ConstantFoldingOutputSizeLimit — Verifies 4 MB Expand is blocked at 1 MB limit, allowed at 8 MB limit.
ConstantFoldingDefaultLimitBlocksLargeExpand — Verifies 1 GB ConstantOfShape is blocked at 512 MB limit.
ConstantFoldingSmallOutputAllowed — Verifies small Expand (64 bytes) is still folded normally.
ConstantFoldingExpandOverflowDimsSkipped — Verifies Expand with [2^32, 2^32] dimensions (int64 overflow) is gracefully skipped during constant folding.

…size check

The post-execution size check was calling IsTensor() and then Get<Tensor>().SizeInBytes() on OrtValue entries from fetches. For optional outputs that are not produced, IsTensor() returns true (the type is set) but the data pointer is null, causing a SEGFAULT when SizeInBytes() dereferences the null tensor's dtype_ member. Add IsAllocated() check to skip unallocated (optional/None) outputs.
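
The ordering issue described in this commit can be illustrated with a toy model. `FakeValue` and `FakeTensor` are hypothetical stand-ins for OrtValue and Tensor, kept only detailed enough to show why the allocation check has to come before anything touches the tensor.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct FakeTensor {
  std::size_t size_in_bytes;
};

// Hypothetical stand-in for OrtValue: the type can say "tensor"
// while no data is attached (an optional output that was not produced).
struct FakeValue {
  bool is_tensor = false;
  std::shared_ptr<FakeTensor> data;  // null when the output is absent
  bool IsTensor() const { return is_tensor; }
  bool IsAllocated() const { return data != nullptr; }
};

// Post-execution size check: skips unallocated entries first,
// so the tensor is never dereferenced through a null pointer.
inline bool AnyOutputExceeds(const std::vector<FakeValue>& fetches, std::size_t limit) {
  for (const auto& value : fetches) {
    if (!value.IsAllocated() || !value.IsTensor()) continue;
    if (value.data->size_in_bytes > limit) return true;
  }
  return false;
}
```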

Copilot

Pull request overview

Hardens ONNX Runtime’s constant folding optimizer and the CPU Expand kernel against crafted models that could trigger integer overflow or excessive allocations during session initialization.

Changes:

Adds SafeInt-guarded arithmetic in Expand to prevent overflow in element-count and offset/length calculations.
Introduces a per-node constant folding output size budget (configurable via a new session option) with pre-/post-checks.
Adds optimizer tests validating folding behavior under size limits and overflow scenarios.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File	Description
onnxruntime/test/optimizer/graph_transform_test.cc	Adds new tests for constant folding size limits and overflow-skipping behavior.
onnxruntime/core/providers/cpu/tensor/expand.cc	Wraps key multiplications/offset computations with `SafeInt` to prevent overflow.
onnxruntime/core/optimizer/constant_folding.cc	Adds output size estimation/limits and exception isolation around constant folding execution.
include/onnxruntime/core/session/onnxruntime_session_options_config_keys.h	Adds new session config key for constant folding max output size.

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

…e constant folding (#28751)

A 152-byte ONNX model with a `ConstantOfShape` whose shape initializer encodes huge dims causes `InferenceSession::Initialize()` to materialize the full output tensor (287 MB in the PoC, up to PB-scale with fuzz-mutated dims) via the ConstantFolding optimizer. The existing pre-execution size cap in `EstimateNodeOutputSizeInBytes` relies on shape inference having populated `output_def->Shape()`, which is not guaranteed.

### Description

- **`onnxruntime/core/optimizer/constant_folding.cc`**
  - New `EstimateConstantOfShapeOutputSizeInBytes(node, graph)`: looks up the shape input via `Graph::GetConstantInitializer` (it is constant by the time we reach this node), multiplies its int64 values with `SafeInt<int64_t>` (rejects negative dims, lets overflow propagate as an exception caught upstream), and multiplies by the element size derived from the `value` attribute's tensor type (defaulting to float per ONNX spec).
  - `EstimateNodeOutputSizeInBytes` now takes `const Graph&` and dispatches to the new estimator for `ConstantOfShape`, falling back to the generic shape-based path if the initializer can't be resolved.
- **`onnxruntime/test/optimizer/graph_transform_test.cc`**
  - `ConstantFoldingConstantOfShapeUsesInputInitializerForSizeCheck`: shape `[100M]` with int64 `value` ⇒ 800 MB derived size; with `kOrtSessionOptionsConstantFoldingMaxOutputSizeInBytes=256MB` the node must remain unfolded, proving the size check fires from the initializer alone.
  - `ConstantFoldingConstantOfShapeBlockedWhenOutputShapeMissing`: same model, but the `pre_graph_checker` (which runs after `Graph::Resolve()` and before the transformer) calls `ClearShape()` on the ConstantOfShape output NodeArg to simulate the documented attack where shape inference has not propagated the output shape. With the inferred shape stripped, the generic shape-based estimator returns -1, so only the new `EstimateConstantOfShapeOutputSizeInBytes` path can derive the 800 MB size and block folding, isolating regressions in the new estimator from the pre-existing shape-inference path.

### Motivation and Context

The byte cap added in #28055 only triggers when shape inference has propagated the output shape; a crafted model can bypass it and force unbounded allocation during `Initialize()`. Deriving the size from the (necessarily constant) shape input makes the cap effective for the documented attack vector and tightens the same code path used by the configurable `kOrtSessionOptionsConstantFoldingMaxOutputSizeInBytes` setting: no new knob, no behavior change for legitimate models within the existing 1 GB default.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Xavier Dupré <xadupre@microsoft.com>
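
The idea behind the new estimator, deriving the output size from the values of the constant shape input rather than from an inferred output shape, can be sketched as below. `EstimateFromShapeInput` is a hypothetical name; returning -1 for a negative dimension is an assumption of this sketch (the commit only says negative dims are rejected), while throwing on overflow mirrors the stated behavior of letting the overflow propagate as an exception.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical sketch: shape_values are the int64 contents of the shape
// initializer, elem_size is the byte size of the output element type
// (4 for the default float, 8 for int64). Assumes elem_size > 0.
inline int64_t EstimateFromShapeInput(const std::vector<int64_t>& shape_values,
                                      int64_t elem_size) {
  int64_t total = elem_size;
  for (int64_t d : shape_values) {
    if (d < 0) return -1;  // assumption: a negative dim yields "unknown"
    if (d != 0 && total > std::numeric_limits<int64_t>::max() / d) {
      throw std::overflow_error("output size overflows int64");
    }
    total *= d;
  }
  return total;
}
```

For the test model above, a shape of [100,000,000] with an int64 `value` gives 800,000,000 bytes, which exceeds a 256 MB limit (268,435,456 bytes) without consulting the output NodeArg's shape at all.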

tianleiwu added 5 commits April 13, 2026 16:41

Limit output size of constant folding

4b9c93f

Merge main

1ec4d99

Add safe int in Expand and a const folding test

aeff70f

Merge remote-tracking branch 'origin/main' into tlwu/safe_const_folding

9e52c82

tianleiwu requested a review from Copilot April 16, 2026 04:02

Copilot started reviewing on behalf of tianleiwu April 16, 2026 04:03

Copilot AI reviewed Apr 16, 2026

tianleiwu requested a review from skottmckay April 16, 2026 23:20

fix: address constant folding review feedback

de3824d

tianleiwu requested review from Copilot and yuslepukhin April 17, 2026 05:07

Copilot started reviewing on behalf of tianleiwu April 17, 2026 23:21

Copilot AI reviewed Apr 17, 2026

Comment thread onnxruntime/core/optimizer/constant_folding.cc

xadupre reviewed Apr 23, 2026

Comment thread onnxruntime/core/optimizer/constant_folding.cc

tianleiwu requested a review from xadupre April 23, 2026 21:48

xadupre reviewed May 14, 2026

Comment thread onnxruntime/core/optimizer/constant_folding.cc

xadupre approved these changes May 14, 2026

tianleiwu merged commit 04f7440 into main May 14, 2026
100 checks passed

tianleiwu deleted the tlwu/safe_const_folding branch May 14, 2026 21:08

tianleiwu mentioned this pull request May 14, 2026

[Security] Use safe int in Expand CPU op #28485

Closed

Copilot AI mentioned this pull request Jun 2, 2026

Pre-check ConstantOfShape output size against input initializer before constant folding #28751

Merged

BrewTestBot mentioned this pull request Jun 19, 2026

onnxruntime 1.27.0 Homebrew/homebrew-core#288892

Merged

This was referenced Jun 29, 2026

Bump the onnx-runtime group with 1 update HeliosophLLC/DatumV#44

Open

deps(nuget): Bump the nuget-all group with 12 updates msbrettorg/maenifold#88

Open

tianleiwu merged 6 commits into main from tlwu/safe_const_folding

3 participants
