Skip to content

Pre-check ConstantOfShape output size against input initializer before constant folding#28751

Merged
tianleiwu merged 4 commits into
mainfrom
copilot/fix-uncontrolled-memory-allocation
Jun 3, 2026
Merged

Pre-check ConstantOfShape output size against input initializer before constant folding#28751
tianleiwu merged 4 commits into
mainfrom
copilot/fix-uncontrolled-memory-allocation

Conversation

Copilot AI commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

A 152-byte ONNX model with a ConstantOfShape whose shape initializer encodes huge dims causes InferenceSession::Initialize() to materialize the full output tensor (287 MB in the PoC, up to PB-scale with fuzz-mutated dims) via the ConstantFolding optimizer. The existing pre-execution size cap in EstimateNodeOutputSizeInBytes relies on shape inference having populated output_def->Shape(), which is not guaranteed.

Description

  • onnxruntime/core/optimizer/constant_folding.cc

    • New EstimateConstantOfShapeOutputSizeInBytes(node, graph): looks up the shape input via Graph::GetConstantInitializer (it is constant by the time we reach this node), multiplies its int64 values with SafeInt<int64_t> (rejects negative dims, lets overflow propagate as an exception caught upstream), and multiplies by the element size derived from the value attribute's tensor type (defaulting to float per ONNX spec).
    • EstimateNodeOutputSizeInBytes now takes const Graph& and dispatches to the new estimator for ConstantOfShape, falling back to the generic shape-based path if the initializer can't be resolved.
  • onnxruntime/test/optimizer/graph_transform_test.cc

    • ConstantFoldingConstantOfShapeUsesInputInitializerForSizeCheck: shape [100M] with int64 value ⇒ 800 MB derived size; with kOrtSessionOptionsConstantFoldingMaxOutputSizeInBytes=256MB the node must remain unfolded, proving the size check fires from the initializer alone.
    • ConstantFoldingConstantOfShapeBlockedWhenOutputShapeMissing: same model, but the pre_graph_checker (which runs after Graph::Resolve() and before the transformer) calls ClearShape() on the ConstantOfShape output NodeArg to simulate the documented attack where shape inference has not propagated the output shape. With the inferred shape stripped, the generic shape-based estimator returns -1, so only the new EstimateConstantOfShapeOutputSizeInBytes path can derive the 800 MB size and block folding — isolating regressions in the new estimator from the pre-existing shape-inference path.

Motivation and Context

The byte cap added in #28055 only triggers when shape inference has propagated the output shape; a crafted model can bypass it and force unbounded allocation during Initialize(). Deriving the size from the (necessarily constant) shape input makes the cap effective for the documented attack vector and tightens the same code path used by the configurable kOrtSessionOptionsConstantFoldingMaxOutputSizeInBytes setting — no new knob, no behavior change for legitimate models within the existing 1 GB default.

Copilot AI changed the title [WIP] Fix uncontrolled memory allocation in ConstantFolding optimizer Pre-check ConstantOfShape output size against input initializer before constant folding Jun 2, 2026
Copilot AI requested a review from xadupre June 2, 2026 16:21
@xadupre xadupre marked this pull request as ready for review June 2, 2026 16:46

@tianleiwu tianleiwu left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Pre-check ConstantOfShape output size against input initializer

Well-targeted hardening of the constant-folding size cap. The approach is sound and the failure modes are all safe:

Strengths

  • Deriving the byte size from the (necessarily constant) shape input closes the documented bypass where shape inference hadn't propagated output_def->Shape(). Because AllNodeInputsAreConstant has already passed by the time the estimator runs, GetConstantInitializer is guaranteed to resolve the shape input on the real path.
  • All adversarial paths degrade safely: negative dims and a non-INT64 shape tensor return -1 (→ skip folding), SafeInt<int64_t> multiplication throws on overflow and is caught by the existing try/catch at the call site (→ skip folding), and an Initializer construction failure on external data is also covered by that same catch. So PB-scale fuzz-mutated dims can never reach the allocation.
  • Reuses the existing GetElementSizeForConstantFolding helper and correctly defaults the element type to float per the ONNX ConstantOfShape spec when value is absent.
  • No new option, no behavior change for legitimate models within the default cap.

Non-blocking suggestions (see inline):

  1. The new test likely also passes via the pre-existing shape-inference path, so it doesn't strictly isolate the new estimator — worth a note or a stronger assertion.
  2. Minor: redundant SafeInt re-wrap on the final multiplication.

No correctness or security concerns. LGTM with the minor suggestions above.

Comment thread onnxruntime/test/optimizer/graph_transform_test.cc
Comment thread onnxruntime/core/optimizer/constant_folding.cc Outdated
@xadupre xadupre requested a review from tianleiwu June 3, 2026 15:11
@titaiwangms

Copy link
Copy Markdown
Contributor

Review — ConstantOfShape constant-folding size pre-check

Verdict: The code fix is correct and security-sound — the estimate can never under-count the real kernel allocation. The main thing worth fixing is test efficacy: the two new tests pass even with the new estimator removed, so they don't actually prove the feature works.

Verified against the ONNX ConstantOfShape spec and the CPU kernel: the output element type defaults to float32 / is taken from the value attribute exactly as the kernel sizes its allocation (constant_of_shape.cc:88-89), SafeInt overflow on the dim product is caught upstream and skips folding, and the AllNodeInputsAreConstant invariant guarantees the shape input is a resolved constant by the time the estimator runs.

🟠 Major — The new tests don't isolate the new estimator

Both new tests pass even if EstimateConstantOfShapeOutputSizeInBytes is deleted:

  • ...UsesInputInitializerForSizeCheck (shape present): the constant [100M] shape makes ONNX shape inference populate the output NodeArg as [104857600], so the pre-existing generic estimator already computes 100M * 8 = 800 MB > 256 MB and blocks folding — without the new path.
  • ...BlockedWhenOutputShapeMissing (shape cleared): with the output shape stripped, the generic estimator returns -1, and the call site already treats estimated_size < 0 as skip folding (constant_folding.cc:410-415, from fix(security): add SafeInt overflow protection in Expand and constant folding output size limit #28055). Same observable outcome (ConstantOfShape == 1).

So the PR description's claim that the second test makes the new estimator "the only thing that can still block folding" isn't accurate — the pre-existing -1 => skip path also blocks. The second test does catch an under-count regression (estimator returns a positive value below the cap), which is valuable, but a regression where the estimator silently returns -1 (feature quietly lost) would go undetected.

Suggested fix: add a positive isolation test — cleared output shape + an under-cap size (e.g. shape [4], int64 value => 32 bytes, cap 256 MB). Without the new estimator the generic path returns -1 => skip fold (count == 1); with it the estimator returns 32 => folds (count == 0). Asserting count == 0 proves the estimator produced a valid positive size rather than -1 — which neither current test does.

🟡 Minor

  • Initializer materialized before the cap. Initializer shape_data{graph, *shape_init, graph.ModelPath()} reads/copies the shape initializer (including external data) before any size validation. As defense-in-depth on a DoS-hardening path, a crafted large/external shape initializer could shift cost into this read. It's bounded in practice (the shape tensor length is the output rank, is guaranteed constant, and is read anyway during folding), but a cheap precheck on data_type() / element count before constructing Initializer would be reasonable hardening.
  • value present-but-unknown type falls back to float. If value exists but its data_type yields element size 0, the code defaults to sizeof(float). That fallback is spec-correct only when value is absent. This is not reachable for valid graphs (ONNX type inference forces the output type to equal the value type, and unknown types fail kernel creation), so it's optional hardening: consider returning -1 when a present value can't yield a known element size, instead of defaulting to float.
  • Missing default-type test coverage. Both tests use an explicit int64 value. A case with no value attribute (=> float path) plus a cleared output shape would cover the default-type branch of the new estimator.
  • Stale docstring. The comment above EstimateNodeOutputSizeInBytes still says it estimates "using shape inference results," which is now misleading for the ConstantOfShape branch that deliberately avoids shape inference.

🔵 Nits

  • es -> attr_element_size for clarity.
  • The ~15-line build_model lambda and kLargeDim constant are duplicated verbatim across both tests; extracting a shared helper would make it obvious the two tests use the same model.
  • The check_outer_scope and fall-through comments could be tightened (state the single real reason for -1 rather than "e.g.").

✅ Praise

  • The security-relevant direction is provably safe: for every kernel-supported output type GetElementSizeForConstantFolding returns the exact byte width the kernel allocates, and sub-byte types over-count and are kernel-rejected anyway.
  • Correct, idiomatic use of Initializer + DataAsSpan<int64_t>() with a matching data_type() guard, and correct reliance on the AllNodeInputsAreConstant invariant.
  • SafeInt<int64_t> num_elements = 1 from the start means every *= in the loop is overflow-checked, not just the final product.
  • The ClearShape() test design is a good instinct for isolating the new path — it just needs the positive assertion described above to fully land.

Reviewed with a multi-model agent team (readability, code, adversarial, and deep spec reviewers); findings de-duplicated and adjudicated.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the ConstantFolding optimizer against crafted models that use ConstantOfShape with a constant shape initializer encoding extremely large dimensions, by estimating output size directly from the shape input initializer instead of relying on output shape inference being present.

Changes:

  • Add a ConstantOfShape-specific output-size estimator that reads the shape input initializer (Graph::GetConstantInitializer) and computes the byte size using SafeInt-protected arithmetic plus element size derived from the value attribute (defaulting to float per ONNX spec).
  • Update EstimateNodeOutputSizeInBytes to accept const Graph& and dispatch to the new estimator for ConstantOfShape.
  • Add regression tests ensuring folding is blocked even when the output NodeArg shape is cleared to simulate missing shape-inference propagation.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
onnxruntime/core/optimizer/constant_folding.cc Adds initializer-based size estimation for ConstantOfShape and wires it into constant folding’s pre-execution size cap.
onnxruntime/test/optimizer/graph_transform_test.cc Adds tests validating the new estimator blocks folding based on the shape initializer, even when output shape metadata is missing.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread onnxruntime/core/optimizer/constant_folding.cc
@tianleiwu tianleiwu merged commit 5597568 into main Jun 3, 2026
86 checks passed
@tianleiwu tianleiwu deleted the copilot/fix-uncontrolled-memory-allocation branch June 3, 2026 23:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Security] Uncontrolled memory allocation in ConstantFolding optimizer via ConstantOfShape (OOM during Initialize())

5 participants