Harden OneHot operator input validation and output size computation #28014

Merged
GopalakrishnanN merged 5 commits into microsoft:main from GopalakrishnanN:FixDOSAttack on May 11, 2026

GopalakrishnanN commented Apr 8, 2026

Harden OneHot operator input validation and output size computation

Summary

This PR tightens input validation and output-shape computation for the OneHot operator on the CPU and CUDA execution providers so that invalid or extreme inputs fail cleanly with an INVALID_ARGUMENT status instead of triggering overflow, silent truncation, or division by zero.

Changes

  1. Overflow check in PrepareOutputShape using SafeInt

    • The output tensor size computation now uses a checked multiplication over the output dimensions, and the prefix_dim_size multiplication loop is rewritten with SafeInt<int64_t>. This prevents unbounded allocation attempts when a large depth value (or large indices shape) would overflow int64_t.
  2. Guard against division by zero when prefix_dim_size is zero

    • suffix_dim_size is now computed as (prefix_dim_size > 0) ? (indices_shape.Size() / prefix_dim_size) : 0, avoiding an integer division by zero when an indices dimension before the axis is zero.
  3. CUDA int32 range validation before fast_divmod

    • Before narrowing to int for fast_divmod (which requires int32 operands), the kernel now validates that suffix_dim_size and depth_val * suffix_dim_size fit in int32_t. Previously gsl::narrow_cast<int> could silently truncate, producing wrong divmod operands.
  4. nullptr check on Output() in both CPU and CUDA Compute paths

    • If allocation of the output tensor fails (e.g., because the requested shape is too large for the allocator), both kernels now return a descriptive error instead of dereferencing a null pointer.
  5. Unit tests in onnxruntime/test/providers/cpu/tensor/onehot_op_test.cc:

    • DepthTooLarge_OutputSizeOverflow: depth = INT64_MAX, indices = [2,3].
    • DepthTooLarge_OutputSizeOverflow_LargeIndices: depth = INT64_MAX / 500, indices = [1000].
    • NegativeDepth: negative depth is rejected.
    • DepthOne: minimum valid depth = 1 edge case.
    • ScalarIndicesRejected: rank-0 indices are rejected per the ONNX spec (indices rank ≥ 1).
    • DefaultAxis_Opset9: opset 9 coverage for the default-axis path.
    • The three negative-path tests exclude kTensorrtExecutionProvider and kDmlExecutionProvider because those EPs fail with their own (different) error messages before our kernel validation runs.
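
The first two changes above can be illustrated with a standalone sketch. This is not the ORT code: the PR uses SafeInt<int64_t>, while the sketch below swaps in a hand-written checked multiply so that it compiles without ORT headers. `CheckedMul`, `OneHotSizes`, and `ComputeOneHotSizes` are hypothetical names, and the sketch assumes the axis has already been normalized to the range [0, rank].

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Multiplies two non-negative int64 values. Returns nullopt if either value is
// negative or if the product would exceed INT64_MAX.
std::optional<int64_t> CheckedMul(int64_t a, int64_t b) {
  if (a < 0 || b < 0) return std::nullopt;
  if (a != 0 && b > std::numeric_limits<int64_t>::max() / a) return std::nullopt;
  return a * b;
}

// Hypothetical result type for this sketch (not an ORT type).
struct OneHotSizes {
  int64_t prefix_dim_size;  // product of indices dims before the axis
  int64_t suffix_dim_size;  // product of indices dims from the axis onward
  int64_t output_size;      // total element count of the output tensor
};

// Assumption: axis is already normalized to [0, rank]; depth must be >= 1.
std::optional<OneHotSizes> ComputeOneHotSizes(const std::vector<int64_t>& indices_dims,
                                              std::size_t axis, int64_t depth) {
  if (depth < 1 || axis > indices_dims.size()) return std::nullopt;

  // Checked product of the dimensions before the axis.
  int64_t prefix = 1;
  for (std::size_t i = 0; i < axis; ++i) {
    const auto next = CheckedMul(prefix, indices_dims[i]);
    if (!next) return std::nullopt;
    prefix = *next;
  }

  // Continue the checked product over the remaining dimensions.
  int64_t indices_size = prefix;
  for (std::size_t i = axis; i < indices_dims.size(); ++i) {
    const auto next = CheckedMul(indices_size, indices_dims[i]);
    if (!next) return std::nullopt;
    indices_size = *next;
  }

  // Guard the division: a zero-sized dimension before the axis makes prefix 0.
  const int64_t suffix = (prefix > 0) ? indices_size / prefix : 0;

  // The output holds depth entries per index, so check that product as well.
  const auto output_size = CheckedMul(indices_size, depth);
  if (!output_size) return std::nullopt;

  return OneHotSizes{prefix, suffix, *output_size};
}
```

With indices shape [2,3] and depth = INT64_MAX the final multiplication is rejected, which corresponds to the first overflow test listed above. A shape such as [0,3] with the axis after the first dimension yields a prefix of 0 and takes the guarded branch instead of dividing.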

Files touched

onnxruntime/core/providers/cpu/tensor/onehot.cc        | 28 +++++++-
onnxruntime/core/providers/cuda/tensor/onehot.cc       | 13 ++++
onnxruntime/test/providers/cpu/tensor/onehot_op_test.cc | 83 ++++++++++++++++++++++

Testing

Built onnxruntime_provider_test on Windows (Release) and ran OneHotOpTest.*:

[==========] 34 tests from 1 test suite ran.
[  PASSED  ] 34 tests.

Overflow tests produce the expected error, for example:

OneHot: output tensor size would overflow for the given indices shape and depth value (9223372036854775807).

Copilot AI left a comment

Pull request overview

Addresses a reported DoS risk in the OneHot operator by adding shape-size validation and extra allocation guarding to reduce the chance of oversized/overflowing output allocations during execution (CPU/CUDA).

Changes:

  • Add an int64 element-count overflow check in PrepareOutputShape() for OneHot.
  • Add output allocation null checks in CPU and CUDA OneHot compute paths.
  • Add new unit tests intended to verify overflow rejection behavior.
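
The allocation null check mentioned above follows a simple pattern, sketched here with stand-in types. `Status`, `Tensor`, `TryAllocateOutput`, and `ComputeSketch` are hypothetical and are not the ORT types or the PR's code; the error text is also invented for illustration. The `max_elements` parameter only imitates an allocator that cannot satisfy an oversized request.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for this sketch; these are not the ORT types.
struct Status {
  bool ok;
  std::string message;
};

struct Tensor {
  std::vector<float> data;
};

// Returns nullptr instead of allocating when the request is negative or exceeds
// max_elements, imitating an allocator that cannot satisfy the shape.
std::unique_ptr<Tensor> TryAllocateOutput(int64_t num_elements, int64_t max_elements) {
  if (num_elements < 0 || num_elements > max_elements) return nullptr;
  auto tensor = std::make_unique<Tensor>();
  tensor->data.assign(static_cast<std::size_t>(num_elements), 0.0f);
  return tensor;
}

Status ComputeSketch(int64_t num_elements, int64_t max_elements) {
  std::unique_ptr<Tensor> output = TryAllocateOutput(num_elements, max_elements);
  // Check the pointer before any member access; report an error otherwise.
  if (output == nullptr) {
    return {false, "OneHot: failed to allocate output tensor"};  // illustrative text
  }
  // Only past this point is it safe to read the output's size or data.
  if (!output->data.empty()) output->data[0] = 1.0f;
  return {true, ""};
}
```

The point of the pattern is ordering: the guard comes before the first use of the output, so an oversized request surfaces as an error status rather than a crash.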

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File | Description
onnxruntime/core/providers/cpu/tensor/onehot.cc | Adds output element-count overflow validation and an output allocation null guard.
onnxruntime/core/providers/cuda/tensor/onehot.cc | Adds an output allocation null guard in the CUDA compute path.
onnxruntime/test/providers/cpu/tensor/onehot_op_test.cc | Adds two new failure-mode tests for extremely large depth values.

Comment thread onnxruntime/test/providers/cpu/tensor/onehot_op_test.cc
Comment thread onnxruntime/core/providers/cpu/tensor/onehot.cc Outdated
Comment thread onnxruntime/core/providers/cpu/tensor/onehot.cc
Comment thread onnxruntime/core/providers/cuda/tensor/onehot.cc
GopalakrishnanN changed the title from "Fix OneHot depth amplification DoS vulnerability" to "Harden OneHot operator input validation and output size computation" on Apr 16, 2026
GopalakrishnanN force-pushed the FixDOSAttack branch 3 times, most recently from 3fcb761 to bfd8c8c on April 17, 2026 02:58
GopalakrishnanN requested a review from Copilot on April 17, 2026 04:08

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Comment thread onnxruntime/test/providers/cpu/tensor/onehot_op_test.cc Outdated
Comment thread onnxruntime/core/providers/cuda/tensor/onehot.cc Outdated
Comment thread onnxruntime/core/providers/cuda/tensor/onehot.cc Outdated
- Add overflow check in PrepareOutputShape using SafeInt for output size and prefix_dim_size multiplication to prevent unbounded allocation when depth or indices shape would overflow int64

- Guard against division by zero when prefix_dim_size is zero

- Add CUDA int32 range validation before fast_divmod to avoid silent truncation in gsl::narrow_cast for suffix_dim_size and depth_val * suffix_dim_size

- Check for nullptr from Output() in both CPU and CUDA Compute paths

- Add unit tests: depth overflow (two variants), negative depth, depth=1 edge case, scalar-indices rejection (ONNX spec requires rank>=1), and opset 9 coverage
Gopalakrishnan Nallasamy and others added 2 commits May 1, 2026 17:39
- Reject rank-0 indices in PrepareOutputShape (CPU and CUDA plugin shim) per ONNX spec.

- Mirror overflow / SafeInt prefix / div-by-zero guards in the CUDA plugin shim PrepareOutputShape.

- Add <algorithm> include in CUDA onehot.cc for std::max.

- Add core/common/safeint.h include in cuda_kernel_adapter.h for SafeInt.

- Loosen ScalarIndicesRejected expected substring to match both ONNX shape-inference and kernel-level errors.
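
One way to see why an explicit rank check can be useful: the element count of a rank-0 shape is the empty product, which is 1, so a size-based check alone would not flag a scalar. The sketch below is a standalone illustration with hypothetical helper names (`ElementCount`, `HasValidOneHotIndicesRank`), not the code added in this PR.

```cpp
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Element count of a shape. For an empty dims list (rank 0) the product is 1.
// Overflow handling is left out here to keep the sketch short.
int64_t ElementCount(const std::vector<int64_t>& dims) {
  return std::accumulate(dims.begin(), dims.end(), int64_t{1},
                         std::multiplies<int64_t>());
}

// Rank check only: the indices input needs at least one dimension.
bool HasValidOneHotIndicesRank(const std::vector<int64_t>& indices_dims) {
  return !indices_dims.empty();
}
```

A scalar shape passes a "size greater than zero" test yet fails the rank test, which is why the two checks are not interchangeable.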
tianleiwu previously approved these changes May 7, 2026

tianleiwu left a comment

Review Summary

Well-targeted hardening of the OneHot operator across CPU and CUDA execution providers. The changes close real overflow and null-dereference vectors that could be triggered by adversarial model inputs. The implementation is correct, the test coverage is solid, and the code follows ORT conventions.

Positives:

  • Manual overflow check for total output element count produces clear INVALID_ARGUMENT errors — good UX for operators accepting untrusted model data.
  • SafeInt<int64_t> for prefix multiplication is proper belt-and-suspenders defense.
  • CUDA int32 range validation before gsl::narrow_cast<int> is critical — silent truncation would produce wrong fast_divmod operands.
  • Null output check correctly placed before output->Shape().Size() dereference on both CPU and CUDA paths.
  • Comprehensive test coverage: overflow, negative depth, minimum depth, scalar rejection, and opset 9 backward compat.
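
The int32 range validation praised above can be sketched in isolation. An unchecked narrowing of a 64-bit value such as 2^32 + 1 to 32 bits would typically wrap to 1 on two's-complement targets, which is the kind of silently wrong operand the check is meant to prevent. `DivmodOperands` and `NarrowForDivmod` are hypothetical names, and rejecting a zero or negative divisor is an assumption of this sketch rather than something the PR description states.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical container for the two values that would be narrowed to 32 bits.
struct DivmodOperands {
  int32_t suffix;
  int32_t depth_suffix;  // depth * suffix
};

// Returns the narrowed operands only if both fit in int32_t.
// Assumption of this sketch: depth and suffix_dim_size must be >= 1.
std::optional<DivmodOperands> NarrowForDivmod(int64_t depth, int64_t suffix_dim_size) {
  constexpr int64_t kInt32Max = std::numeric_limits<int32_t>::max();
  if (depth < 1 || suffix_dim_size < 1) return std::nullopt;
  if (suffix_dim_size > kInt32Max) return std::nullopt;
  // Equivalent to depth * suffix_dim_size > kInt32Max for positive values,
  // written as a division so the 64-bit product itself cannot overflow.
  if (depth > kInt32Max / suffix_dim_size) return std::nullopt;
  return DivmodOperands{static_cast<int32_t>(suffix_dim_size),
                        static_cast<int32_t>(depth * suffix_dim_size)};
}
```

Performing the range check before the cast keeps the casts themselves trivially safe, so no separate checked-cast helper is needed in this sketch.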

Two minor suggestions below (non-blocking).

Comment thread onnxruntime/core/providers/cuda/tensor/onehot.cc Outdated
Comment thread onnxruntime/core/providers/cpu/tensor/onehot.cc

tianleiwu left a comment

One small build-hygiene nit below. The rest of the hardening looks good on the current head.

Comment thread onnxruntime/core/providers/cuda/plugin/cuda_kernel_adapter.h