Harden OneHot operator input validation and output size computation#28014
Conversation
There was a problem hiding this comment.
Pull request overview
Addresses a reported DoS risk in the OneHot operator by adding shape-size validation and extra allocation guarding to reduce the chance of oversized/overflowing output allocations during execution (CPU/CUDA).
Changes:
- Add an int64 element-count overflow check in
PrepareOutputShape()for OneHot. - Add output allocation null checks in CPU and CUDA OneHot compute paths.
- Add new unit tests intended to verify overflow rejection behavior.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| onnxruntime/core/providers/cpu/tensor/onehot.cc | Adds output element-count overflow validation and an output allocation null guard. |
| onnxruntime/core/providers/cuda/tensor/onehot.cc | Adds an output allocation null guard in the CUDA compute path. |
| onnxruntime/test/providers/cpu/tensor/onehot_op_test.cc | Adds two new failure-mode tests for extremely large depth values. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
0d2a9e2 to
0c704ce
Compare
3fcb761 to
bfd8c8c
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
bfd8c8c to
7dc5aa7
Compare
- Add overflow check in PrepareOutputShape using SafeInt for output size and prefix_dim_size multiplication to prevent unbounded allocation when depth or indices shape would overflow int64 - Guard against division by zero when prefix_dim_size is zero - Add CUDA int32 range validation before fast_divmod to avoid silent truncation in gsl::narrow_cast for suffix_dim_size and depth_val * suffix_dim_size - Check for nullptr from Output() in both CPU and CUDA Compute paths - Add unit tests: depth overflow (two variants), negative depth, depth=1 edge case, scalar-indices rejection (ONNX spec requires rank>=1), and opset 9 coverage
7dc5aa7 to
57d4b03
Compare
- Reject rank-0 indices in PrepareOutputShape (CPU and CUDA plugin shim) per ONNX spec. - Mirror overflow / SafeInt prefix / div-by-zero guards in the CUDA plugin shim PrepareOutputShape. - Add <algorithm> include in CUDA onehot.cc for std::max. - Add core/common/safeint.h include in cuda_kernel_adapter.h for SafeInt. - Loosen ScalarIndicesRejected expected substring to match both ONNX shape-inference and kernel-level errors.
tianleiwu
left a comment
There was a problem hiding this comment.
Review Summary
Well-targeted hardening of the OneHot operator across CPU and CUDA execution providers. The changes close real overflow and null-dereference vectors that could be triggered by adversarial model inputs. The implementation is correct, the test coverage is solid, and the code follows ORT conventions.
Positives:
- Manual overflow check for total output element count produces clear
INVALID_ARGUMENTerrors — good UX for operators accepting untrusted model data. SafeInt<int64_t>for prefix multiplication is proper belt-and-suspenders defense.- CUDA int32 range validation before
gsl::narrow_cast<int>is critical — silent truncation would produce wrongfast_divmodoperands. - Null output check correctly placed before
output->Shape().Size()dereference on both CPU and CUDA paths. - Comprehensive test coverage: overflow, negative depth, minimum depth, scalar rejection, and opset 9 backward compat.
Two minor suggestions below (non-blocking).
tianleiwu
left a comment
There was a problem hiding this comment.
One small build-hygiene nit below. The rest of the hardening looks good on the current head.
Harden OneHot operator input validation and output size computation
Summary
This PR tightens input validation and output-shape computation for the
OneHotoperator on the CPU and CUDA execution providers so that invalid or extreme inputs fail cleanly with anINVALID_ARGUMENTstatus instead of triggering overflow, silent truncation, or division by zero.Changes
Overflow check in
PrepareOutputShapeusingSafeIntprefix_dim_sizemultiplication loop is rewritten withSafeInt<int64_t>. This prevents unbounded allocation attempts when a largedepthvalue (or largeindicesshape) would overflowint64_t.Guard against division by zero when
prefix_dim_sizeis zerosuffix_dim_sizeis now computed as(prefix_dim_size > 0) ? (indices_shape.Size() / prefix_dim_size) : 0, avoiding an integer division by zero when an indices dimension before the axis is zero.CUDA int32 range validation before
fast_divmodintforfast_divmod(which requiresint32operands), the kernel now validates thatsuffix_dim_sizeanddepth_val * suffix_dim_sizefit inint32_t. Previouslygsl::narrow_cast<int>could silently truncate, producing wrong divmod operands.nullptrcheck onOutput()in both CPU and CUDAComputepathsUnit tests in
onnxruntime/test/providers/cpu/tensor/onehot_op_test.cc:DepthTooLarge_OutputSizeOverflow—depth = INT64_MAX,indices = [2,3].DepthTooLarge_OutputSizeOverflow_LargeIndices—depth = INT64_MAX / 500,indices = [1000].NegativeDepth— negative depth is rejected.DepthOne— minimum validdepth = 1edge case.ScalarIndicesRejected— rank-0 indices are rejected per the ONNX spec (indices rank ≥ 1).DefaultAxis_Opset9— opset 9 coverage for the default-axis path.kTensorrtExecutionProviderandkDmlExecutionProviderbecause those EPs fail with their own (different) error messages before our kernel validation runs.Files touched
Testing
Built
onnxruntime_provider_teston Windows (Release) and ranOneHotOpTest.*:Overflow tests produce the expected error, for example: