Pre-check ConstantOfShape output size against input initializer before constant folding #28751
tianleiwu left a comment
Review: Pre-check ConstantOfShape output size against input initializer
Well-targeted hardening of the constant-folding size cap. The approach is sound and the failure modes are all safe:
Strengths
- Deriving the byte size from the (necessarily constant) shape input closes the documented bypass where shape inference hadn't propagated `output_def->Shape()`. Because `AllNodeInputsAreConstant` has already passed by the time the estimator runs, `GetConstantInitializer` is guaranteed to resolve the shape input on the real path.
- All adversarial paths degrade safely: negative dims and a non-INT64 shape tensor return `-1` (→ skip folding), `SafeInt<int64_t>` multiplication throws on overflow and is caught by the existing `try`/`catch` at the call site (→ skip folding), and an `Initializer` construction failure on external data is also covered by that same catch. So PB-scale fuzz-mutated dims can never reach the allocation.
- Reuses the existing `GetElementSizeForConstantFolding` helper and correctly defaults the element type to `float` per the ONNX ConstantOfShape spec when `value` is absent.
- No new option, no behavior change for legitimate models within the default cap.
Non-blocking suggestions (see inline):
- The new test likely also passes via the pre-existing shape-inference path, so it doesn't strictly isolate the new estimator; worth a note or a stronger assertion.
- Minor: redundant `SafeInt` re-wrap on the final multiplication.
No correctness or security concerns. LGTM with the minor suggestions above.
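The safe-degradation paths listed above can be sketched in a few lines. This is a minimal, self-contained illustration, not the PR's code: `CheckedMul`, `EstimateBytes`, and `ShouldFold` are hypothetical names, and a hand-written overflow check stands in for `SafeInt<int64_t>`.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for SafeInt-protected multiplication: throws on overflow.
// Callers below only pass a non-negative `a`.
inline int64_t CheckedMul(int64_t a, int64_t b) {
  if (a != 0 && b > std::numeric_limits<int64_t>::max() / a) {
    throw std::overflow_error("size estimate overflows int64");
  }
  return a * b;
}

// Returns the estimated output size in bytes, or -1 when the shape is unusable.
inline int64_t EstimateBytes(const std::vector<int64_t>& dims, int64_t element_size) {
  int64_t count = 1;
  for (int64_t d : dims) {
    if (d < 0) return -1;  // negative dim: report "unknown"
    count = CheckedMul(count, d);
  }
  return CheckedMul(count, element_size);
}

// Call-site sketch: fold only when the estimate is known and within the cap.
inline bool ShouldFold(const std::vector<int64_t>& dims, int64_t element_size,
                       int64_t cap_bytes) {
  try {
    const int64_t bytes = EstimateBytes(dims, element_size);
    return bytes >= 0 && bytes <= cap_bytes;
  } catch (const std::overflow_error&) {
    return false;  // overflow: skip folding rather than allocate
  }
}
```

In this sketch every failure mode (negative dim, overflow, over-cap estimate) ends in "do not fold", which is the property the review calls out.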
Review: ConstantOfShape constant-folding size pre-check
Verdict: The code fix is correct and security-sound: the estimate can never under-count the real kernel allocation. The main thing worth fixing is test efficacy: the two new tests pass even with the new estimator removed, so they don't actually prove the feature works. Verified against the ONNX ConstantOfShape spec and the CPU kernel: the output element type defaults to float32 / is taken from the […]
🟠 Major: The new tests don't isolate the new estimator
Both new tests pass even if […]
So the PR description's claim that the second test makes the new estimator "the only thing that can still block folding" isn't accurate: the pre-existing […]
Suggested fix: add a positive isolation test with a cleared output shape and an under-cap size (e.g. shape […]
🟡 Minor
🔵 Nits
✅ Praise
Reviewed with a multi-model agent team (readability, code, adversarial, and deep spec reviewers); findings de-duplicated and adjudicated.
Pull request overview
This PR hardens the ConstantFolding optimizer against crafted models that use ConstantOfShape with a constant shape initializer encoding extremely large dimensions, by estimating output size directly from the shape input initializer instead of relying on output shape inference being present.
Changes:
- Add a `ConstantOfShape`-specific output-size estimator that reads the shape input initializer (`Graph::GetConstantInitializer`) and computes the byte size using SafeInt-protected arithmetic plus element size derived from the `value` attribute (defaulting to float per ONNX spec).
- Update `EstimateNodeOutputSizeInBytes` to accept `const Graph&` and dispatch to the new estimator for `ConstantOfShape`.
- Add regression tests ensuring folding is blocked even when the output `NodeArg` shape is cleared to simulate missing shape-inference propagation.
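The element-size step in the first change can be illustrated as follows. This is a hedged sketch, not the PR's helper: `ElemType`, `ElementSize`, and `ConstantOfShapeElementSize` are hypothetical names, the enum covers only a subset of ONNX `TensorProto.DataType` codes, and an empty `std::optional` models an absent `value` attribute.

```cpp
#include <cstdint>
#include <optional>

// Subset of ONNX TensorProto.DataType numeric codes (illustrative, not exhaustive).
enum class ElemType : int32_t {
  kFloat = 1, kUint8 = 2, kInt8 = 3, kInt32 = 6, kInt64 = 7, kBool = 9, kDouble = 11
};

// Hypothetical helper: byte width of one element, or -1 for types not handled here.
inline int64_t ElementSize(ElemType t) {
  switch (t) {
    case ElemType::kUint8:
    case ElemType::kInt8:
    case ElemType::kBool:
      return 1;
    case ElemType::kFloat:
    case ElemType::kInt32:
      return 4;
    case ElemType::kInt64:
    case ElemType::kDouble:
      return 8;
    default:
      return -1;
  }
}

// `value_type` is empty when the `value` attribute is absent; the ONNX spec
// then makes the output a float tensor, so the element size is 4 bytes.
inline int64_t ConstantOfShapeElementSize(std::optional<ElemType> value_type) {
  return ElementSize(value_type.value_or(ElemType::kFloat));
}
```

Getting the default right matters for the cap: assuming 1 byte where the kernel writes 4 would under-count the allocation.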
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| onnxruntime/core/optimizer/constant_folding.cc | Adds initializer-based size estimation for ConstantOfShape and wires it into constant folding’s pre-execution size cap. |
| onnxruntime/test/optimizer/graph_transform_test.cc | Adds tests validating the new estimator blocks folding based on the shape initializer, even when output shape metadata is missing. |
A 152-byte ONNX model with a `ConstantOfShape` whose shape initializer encodes huge dims causes `InferenceSession::Initialize()` to materialize the full output tensor (287 MB in the PoC, up to PB-scale with fuzz-mutated dims) via the ConstantFolding optimizer. The existing pre-execution size cap in `EstimateNodeOutputSizeInBytes` relies on shape inference having populated `output_def->Shape()`, which is not guaranteed.

Description

`onnxruntime/core/optimizer/constant_folding.cc`
- `EstimateConstantOfShapeOutputSizeInBytes(node, graph)`: looks up the shape input via `Graph::GetConstantInitializer` (it is constant by the time we reach this node), multiplies its int64 values with `SafeInt<int64_t>` (rejects negative dims, lets overflow propagate as an exception caught upstream), and multiplies by the element size derived from the `value` attribute's tensor type (defaulting to float per ONNX spec).
- `EstimateNodeOutputSizeInBytes` now takes `const Graph&` and dispatches to the new estimator for `ConstantOfShape`, falling back to the generic shape-based path if the initializer can't be resolved.

`onnxruntime/test/optimizer/graph_transform_test.cc`
- `ConstantFoldingConstantOfShapeUsesInputInitializerForSizeCheck`: shape `[100M]` with int64 `value` ⇒ 800 MB derived size; with `kOrtSessionOptionsConstantFoldingMaxOutputSizeInBytes=256MB` the node must remain unfolded, proving the size check fires from the initializer alone.
- `ConstantFoldingConstantOfShapeBlockedWhenOutputShapeMissing`: same model, but the `pre_graph_checker` (which runs after `Graph::Resolve()` and before the transformer) calls `ClearShape()` on the ConstantOfShape output NodeArg to simulate the documented attack where shape inference has not propagated the output shape. With the inferred shape stripped, the generic shape-based estimator returns -1, so only the new `EstimateConstantOfShapeOutputSizeInBytes` path can derive the 800 MB size and block folding, isolating regressions in the new estimator from the pre-existing shape-inference path.

Motivation and Context

The byte cap added in #28055 only triggers when shape inference has propagated the output shape; a crafted model can bypass it and force unbounded allocation during `Initialize()`. Deriving the size from the (necessarily constant) shape input makes the cap effective for the documented attack vector and tightens the same code path used by the configurable `kOrtSessionOptionsConstantFoldingMaxOutputSizeInBytes` setting: no new knob, no behavior change for legitimate models within the existing 1 GB default.
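
The numbers used by the tests can be checked with compile-time arithmetic. This is a sketch under stated assumptions: the constant names and `ExceedsCap` are hypothetical, "256MB" and "1 GB" are taken as binary units, and the cap is assumed to be exceeded only when the estimate is strictly larger than it.

```cpp
#include <cstdint>

constexpr int64_t kElementCount = 100'000'000;                   // shape [100M]
constexpr int64_t kInt64Size = 8;                                // int64 `value`
constexpr int64_t kEstimatedBytes = kElementCount * kInt64Size;  // derived size

constexpr int64_t kTestCapBytes = 256LL * 1024 * 1024;      // assumed meaning of 256MB
constexpr int64_t kDefaultCapBytes = 1024LL * 1024 * 1024;  // assumed meaning of 1 GB

// Assumed comparison: folding is blocked when the estimate exceeds the cap.
constexpr bool ExceedsCap(int64_t bytes, int64_t cap) { return bytes > cap; }

static_assert(kEstimatedBytes == 800'000'000);
static_assert(ExceedsCap(kEstimatedBytes, kTestCapBytes));
static_assert(!ExceedsCap(kEstimatedBytes, kDefaultCapBytes));
```

Under these assumptions the 800 MB estimate sits below the 1 GB default, which suggests the tests lower the cap to 256 MB so that the size check has something to block.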