Validate token_id bounds in NGramRepeatBlock to prevent OOB write #28039
Conversation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
This PR hardens the NGramRepeatBlock operator (CPU and CUDA EP) against out-of-bounds writes by validating token_id values derived from input_ids before using them as indices into the scores buffer, addressing a security issue caused by negative or oversized token IDs.
Changes:
- CPU: replace `ORT_ENFORCE(token_id < vocab_size)` with a full `[0, vocab_size)` check and return `INVALID_ARGUMENT` instead of aborting.
- CUDA: add `CUDA_KERNEL_ASSERT` plus a runtime bounds guard to avoid invalid indexing in release builds.
- Tests: add CPU-only regression tests covering negative and oversized `token_id` cases.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `onnxruntime/contrib_ops/cpu/bert/ngram_repeat_block.h` | Adds bounds validation for `token_id` and returns an `INVALID_ARGUMENT` Status on invalid input. |
| `onnxruntime/contrib_ops/cuda/bert/ngram_repeat_block_impl.cu` | Adds debug assertion + runtime guard to prevent OOB write in the CUDA kernel. |
| `onnxruntime/test/contrib_ops/ngram_repeat_block_op_test.cc` | Adds regression tests for invalid `token_id` values on CPU EP. |
tianleiwu left a comment
Clean, well-scoped security fix. The CPU path correctly replaces the crashing ORT_ENFORCE (missing lower-bound check, abort() under ORT_NO_EXCEPTIONS) with a proper INVALID_ARGUMENT Status return using thread-safe atomic error collection. The CUDA path uses the standard ORT dual pattern of CUDA_KERNEL_ASSERT + conditional guard. Regression tests cover both failure modes (negative token_id and token_id ≥ vocab_size).
One minor suggestion on the CUDA side (inline comment below) — non-blocking.
Description
In `NGramRepeatBlock` (CPU and CUDA EP), token values from the `input_ids` tensor are used directly as array indices into the `scores` output buffer without adequate bounds checking. The CPU path only checked `token_id < vocab_size` (missing the lower bound), and the CUDA kernel had no bounds checks at all. A crafted model with negative token IDs can write at attacker-controlled negative offsets, causing heap corruption or SIGSEGV.

Fixes https://portal.microsofticm.com/imp/v5/incidents/details/31000000558069/summary
Changes
- CPU: replace `ORT_ENFORCE(token_id < vocab_size)` with a full `[0, vocab_size)` bounds check returning an `INVALID_ARGUMENT` Status via an atomic error flag (avoids `abort()` under `ORT_NO_EXCEPTIONS`).
- CUDA: add `CUDA_KERNEL_ASSERT` plus a bounds guard that skips the write, for release-build safety.

Tests
Two regression tests: one with a negative `token_id` and one with `token_id >= vocab_size` (CPU only; the CUDA EP is excluded to avoid debug-assert context corruption).