Fix GatherCopyData Integer Truncation Leading to Heap Out-of-Bounds Read/Write by chilo-ms · Pull Request #27444 · microsoft/onnxruntime

chilo-ms · 2026-02-24T21:10:51Z

Description

This pull request improves the robustness and correctness of the CPU implementation of the Gather operator in ONNX Runtime. The key changes focus on preventing integer overflow issues in parallel processing and output shape calculations, as well as enhancing test coverage to verify these safeguards.

Enhancements to overflow handling and parallel processing:

Changed the lambda function in GatherCopyData to use ptrdiff_t instead of int64_t for the index, and explicitly cast batch and i variables, ensuring safer arithmetic for large tensor sizes.
Updated the parallel loop in GatherCopyData to iterate using ptrdiff_t indices, preventing potential overflow when processing large tensors.
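
The index-type change described above can be illustrated with a minimal sketch. This is not the ONNX Runtime source: `NarrowTo32` and `SplitIndex` are hypothetical names, and splitting a flat work-item index into a batch number and an offset `i` by dividing by a block count `N` is an assumption made for illustration only.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical illustration; none of these names come from ONNX Runtime.

// Narrowing a wide index to 32 bits keeps only the low 32 bits.
// Conversion to an unsigned type is well-defined: the value is reduced modulo 2^32.
constexpr std::uint32_t NarrowTo32(std::int64_t index) {
  return static_cast<std::uint32_t>(index);
}

struct BatchAndOffset {
  std::int64_t batch;
  std::int64_t i;
};

// Splits a flat index into (batch, i) while keeping the index pointer-sized,
// with explicit casts on the results (assumed layout: index = batch * N + i).
constexpr BatchAndOffset SplitIndex(std::ptrdiff_t index, std::int64_t N) {
  return {static_cast<std::int64_t>(index / N),
          static_cast<std::int64_t>(index % N)};
}
```

With a 32-bit loop variable, an index such as 2^32 + 5 would alias index 5, so the copy would read and write the wrong block. Keeping the index in `ptrdiff_t` avoids that aliasing on 64-bit platforms.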

Testing improvements:

Added a new unit test Gather_overflow_check in gather_op_test.cc to verify that the Gather operator correctly handles very large output shapes without overflowing, specifically testing dimensions that exceed the 32-bit integer limit.

Motivation and Context

github-actions

You can commit the suggested changes from lintrunner.

onnxruntime/test/providers/cpu/tensor/gather_op_test.cc

adrianlizarraga · 2026-02-24T21:28:32Z

Thanks @chilo-ms . There's a linter error but otherwise looks good to me!


This cherry-picks the following commits for the release:

| Commit ID | PR Number | Commit Title |
|-----------|-----------|-------------|
| decd177 | #27090 | Fix GatherND division by zero when batch dimensions mismatch |
| 55f8234 | #27360 | Fix QMoE CPU Operator |
| df9146f | #27403 | [MLAS] Adding DynamicQGemm function pointers and ukernel interface |
| 0f93853 | #27318 | [js/web] Use embedded WASM module in Blob URL workers when wasmBinary is provided |
| b2a6e69 | #27364 | QMoE CPU Performance Update (Up to 4x on 4-bit) |
| f501e1d | #27413 | Fix refcount bug in map input conversion that caused shutdown segfault |
| b32b205 | #27421 | Fix error where bytes is not assigned for dynamic qgemm pack b size |
| 426b006 | #27397 | Fix DllImportResolver |
| 0982844 | #27412 | MatmulNBits prepacking scales fix |
| 9afb0d2 | #27430 | Fix validation for external data paths for models loaded from bytes |
| 71d2cd0 | #27401 | Enable Python 3.14 CI and Upgrade Dependencies |
| 79e0676 | #27419 | fix: out of bounds access for resize operation |
| 82eb99c | #27459 | Fix SkipLayerNorm fusion incorrectly applied when gamma/beta are not 1D |
| 355278a | #27444 | Fix GatherCopyData Integer Truncation Leading to Heap Out-of-Bounds Read/Write |
| cf96123 | #27411 | [web] fix usage of wasmBinary together with a blob URL for .mjs |
| 1131a86 | #27399 | [web] remove the unhelpful "Unknown CPU vendor" warning. |
| ffbbc4f | #27316 | Build Windows ARM64X binaries as part of packaging pipeline |

---------

Signed-off-by: Jonathan Clohessy <Jonathan.Clohessy@arm.com>
Co-authored-by: patryk-kaiser-ARM <patryk.kaiser@arm.com>
Co-authored-by: don <70039285+0-don@users.noreply.github.com>
Co-authored-by: Jonathan Clohessy <jonathan.clohessy@arm.com>
Co-authored-by: Hariharan Seshadri <shariharan91@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com>
Co-authored-by: Lukas Folle <126877803+lukas-folle-snkeos@users.noreply.github.com>
Co-authored-by: Chi Lo <54722500+chilo-ms@users.noreply.github.com>
Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
Co-authored-by: Chaya <cha182350@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Erik <erscor@microsoft.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>

#27483)

### Description

This PR reduces the size of the memory allocation for expected outputs from ~4GiB to ~2GiB in the Gather_overflow_check test. The updated test still verifies the integer overflow fix from PR #27444: the CPU Gather operator must correctly handle output tensors whose element count exceeds INT32_MAX.

Changes:

- Reduced the test dimension from 65537 to 46341 (output shape from 65537×65537 to 46341×46341). The resulting element count is just over INT32_MAX, which is required to exercise the bug fix.
- The peak memory usage is reduced to ~4GiB + overhead.
- Increased the Android emulator memory to 5GiB (from 4GiB) so that it can run the test.

### Motivation

Android CI fails to run the unit test introduced in #27444 because its memory usage exceeds the Android emulator's default memory of 4GiB. This PR lowers the peak memory usage of the unit test and increases the Android emulator's memory by 1GiB.
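
The choice of 46341 can be checked with a short arithmetic sketch. `SquareElementCount` and `kInt32Max` are hypothetical names introduced here for illustration and do not come from the test file; only the dimensions 46341 and 65537 are taken from the description above.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: element count of a square 2-D shape, computed in
// 64-bit arithmetic so the product does not wrap for these sizes.
constexpr std::int64_t SquareElementCount(std::int64_t dim) { return dim * dim; }

constexpr std::int64_t kInt32Max = std::numeric_limits<std::int32_t>::max();

// 46340 x 46340 still fits in a 32-bit signed integer...
static_assert(SquareElementCount(46340) <= kInt32Max, "46340^2 fits in int32");
// ...while 46341 x 46341 is the smallest square shape that does not.
static_assert(SquareElementCount(46341) > kInt32Max, "46341^2 exceeds int32");
// The original test shape was roughly twice as large.
static_assert(SquareElementCount(65537) == 4295098369LL, "65537^2");
```

Under these assumptions, 46341 is the smallest square dimension whose element count crosses INT32_MAX, which is consistent with the goal of keeping the allocation as small as possible while still exercising the fix.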

…rite (#27544)

### Description

This pull request refactors several tensor operation kernels (`GatherND`, `ScatterND`, and `GatherGrad`) to improve type safety and consistency in parallelized code execution. The main change is replacing `int` loop indices with `ptrdiff_t` to avoid overflow.

### Parallelization and Type Safety Improvements

* Updated lambda functions and parallel loop indices in `gather_nd.cc` (`GatherNDBase::PrepareForCompute`, `GatherND::GatherNumber`, and `GatherND::GatherString`) to use `ptrdiff_t` instead of `int64_t`, and replaced index arithmetic with explicit casts to maintain correctness. [[1]](diffhunk://#diff-a456934cd8ef2c51197e04af32ecbef5b531dae83f7f8c2aca46802b7a5e7b7bL96-R100) [[2]](diffhunk://#diff-a456934cd8ef2c51197e04af32ecbef5b531dae83f7f8c2aca46802b7a5e7b7bL121-R121) [[3]](diffhunk://#diff-a456934cd8ef2c51197e04af32ecbef5b531dae83f7f8c2aca46802b7a5e7b7bL192-R216)
* Refactored `scatter_nd.cc` (`ScatterNDDispatchTarget`) to use `ptrdiff_t` for loop indices and index arithmetic in all reduction cases, ensuring consistent type usage in parallel execution.
* Modified `gather_grad.cc` (`GatherGrad::ComputeImpl`) to use `ptrdiff_t` for parallel loop indices, aligning with the changes in other tensor kernels.

### Motivation and Context

The same kind of issue was previously fixed in #27444.
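
The loop pattern this description refers to might look roughly like the following sketch. `ForEachIndex` is a hypothetical stand-in for a thread-pool range callback and is not the ONNX Runtime API; it only shows a `[first, last)` range being walked with a pointer-sized loop variable instead of an `int`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a range callback handed out by a thread pool.
// Visits every index in [first, last) with a ptrdiff_t loop variable, so the
// index is never narrowed to 32 bits, and returns how many indices it visited.
template <typename Fn>
std::int64_t ForEachIndex(std::ptrdiff_t first, std::ptrdiff_t last, Fn&& fn) {
  std::int64_t visited = 0;
  for (std::ptrdiff_t index = first; index < last; ++index) {
    fn(index);  // the callee receives the full-width index
    ++visited;
  }
  return visited;
}
```

A caller would pass a lambda taking `std::ptrdiff_t`, for example `ForEachIndex(first, last, [&](std::ptrdiff_t i) { /* copy element i */ });`. On platforms where `ptrdiff_t` is 64 bits wide, ranges beyond INT32_MAX are then iterated without wrapping.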

chilo-ms added 2 commits February 24, 2026 12:45

update

7632394

add comment

7617bbc

chilo-ms requested a review from adrianlizarraga February 24, 2026 21:11

github-actions bot reviewed Feb 24, 2026

onnxruntime/test/providers/cpu/tensor/gather_op_test.cc (outdated, resolved)

onnxruntime/test/providers/cpu/tensor/gather_op_test.cc (outdated, resolved)

adrianlizarraga requested a review from tianleiwu February 24, 2026 21:21

skip on 32-bit platform

3ca8e89

adrianlizarraga approved these changes Feb 25, 2026

tianleiwu approved these changes Feb 26, 2026

tianleiwu added the release:1.24.3 label Feb 26, 2026

chilo-ms merged commit 355278a into main Feb 26, 2026
89 of 93 checks passed

chilo-ms deleted the chi/fix_oob_read_write branch February 26, 2026 16:46

tianleiwu mentioned this pull request Feb 27, 2026

ORT 1.24.3 release cherry pick round 1 #27476

Merged

adrianlizarraga mentioned this pull request Feb 27, 2026

Reduce allocation size in test Gather_oveflow_check from 4GiB to >2GiB #27483

Merged

tianleiwu removed the release:1.24.3 label Feb 28, 2026

chilo-ms mentioned this pull request Mar 4, 2026

Fix Potential Integer Truncation Leading to Heap Out-of-Bounds Read/Write #27544

Merged

This was referenced Mar 9, 2026

Bump Microsoft.ML.OnnxRuntime.Gpu from 1.23.2 to 1.24.3 yuniko-software/bge-m3-onnx#66

Closed

deps(nuget): Bump the microsoft-packages group with 2 updates Ellerbach/azure-ai-search-simulator#73

Closed

dependabot bot mentioned this pull request Mar 16, 2026

deps(nuget): Bump the microsoft-packages group with 8 updates Ellerbach/azure-ai-search-simulator#76

Closed

tianleiwu mentioned this pull request Mar 16, 2026

ORT 1.24.4 release cherry pick round 1 #27682

Merged

dependabot bot mentioned this pull request Mar 30, 2026

Bump Microsoft.ML.OnnxRuntime from 1.20.1 to 1.24.4 PMeeske/ouroboros-engine#94

Closed

chilo-ms merged 3 commits into main from chi/fix_oob_read_write
