Validate token_id bounds in NGramRepeatBlock to prevent OOB write #28039

Merged
vraspar merged 2 commits into main from vraspar/fix-ngram-repeat-block-oob-write on Apr 16, 2026


@vraspar vraspar commented Apr 10, 2026

Description

In NGramRepeatBlock (CPU and CUDA EP), token values from the input_ids tensor are used directly as array indices into the scores output buffer without adequate bounds checking. The CPU path only checked token_id < vocab_size (missing lower bound), and the CUDA kernel had no bounds checks at all. A crafted model with negative token IDs can write at attacker-controlled negative offsets, causing heap corruption or SIGSEGV.
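
To make the gap concrete, here is a minimal standalone sketch (not the ONNX Runtime source). The helper names `upper_bound_only`, `in_vocab_range`, and `score_offset` are hypothetical, and the row-major `[batch, vocab_size]` layout of the scores buffer is an assumption. It contrasts the old upper-bound-only comparison with a full `[0, vocab_size)` check:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring the old check: only the upper bound is tested,
// so every negative token_id passes.
inline bool upper_bound_only(int64_t token_id, int64_t vocab_size) {
  return token_id < vocab_size;
}

// Hypothetical helper for the full half-open range [0, vocab_size).
inline bool in_vocab_range(int64_t token_id, int64_t vocab_size) {
  return token_id >= 0 && token_id < vocab_size;
}

// Flat offset into a scores buffer, assuming a row-major [batch, vocab_size] layout.
// A negative token_id yields an offset before the start of the row.
inline int64_t score_offset(int64_t batch, int64_t token_id, int64_t vocab_size) {
  assert(vocab_size > 0);
  return batch * vocab_size + token_id;
}
```

With `vocab_size = 10`, a token_id of `-5` passes `upper_bound_only` but fails `in_vocab_range`, and for batch row 0 it maps to offset `-5`, which lies outside the buffer.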

Fixes https://portal.microsofticm.com/imp/v5/incidents/details/31000000558069/summary

Changes

  • ngram_repeat_block.h (CPU): Replace ORT_ENFORCE(token_id < vocab_size) with full [0, vocab_size) bounds check returning INVALID_ARGUMENT Status via atomic error flag (avoids abort() under ORT_NO_EXCEPTIONS)
  • ngram_repeat_block_impl.cu (CUDA): Add CUDA_KERNEL_ASSERT + bounds guard with skip for release safety
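
The CPU-side idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `SketchStatus`, `for_each_row`, and `ban_tokens` are hypothetical names, and `for_each_row` stands in for a thread-pool parallel-for (it runs sequentially here so the sketch has no threading dependency). Because the per-row body returns `void`, it cannot return a status directly, so it records the failure in an atomic flag that is converted to a status after the loop:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for a status code; not the ONNX Runtime Status type.
enum class SketchStatus { kOk, kInvalidArgument };

// Hypothetical stand-in for a thread-pool parallel-for; sequential in this sketch.
template <typename Fn>
void for_each_row(std::size_t rows, Fn&& body) {
  for (std::size_t row = 0; row < rows; ++row) body(row);
}

// Bans one token per batch row by writing -inf into scores, after validating
// each token_id against [0, vocab_size). Invalid rows are skipped and reported.
SketchStatus ban_tokens(const std::vector<int64_t>& banned_token_per_row,
                        int64_t vocab_size, std::vector<float>& scores) {
  assert(vocab_size > 0);
  const std::size_t vocab = static_cast<std::size_t>(vocab_size);
  assert(scores.size() == banned_token_per_row.size() * vocab);

  std::atomic<bool> invalid_token{false};
  for_each_row(banned_token_per_row.size(), [&](std::size_t row) {
    const int64_t token_id = banned_token_per_row[row];
    if (token_id < 0 || token_id >= vocab_size) {
      // Record the error instead of throwing or aborting inside the worker.
      invalid_token.store(true, std::memory_order_relaxed);
      return;  // skip the write for this row
    }
    scores[row * vocab + static_cast<std::size_t>(token_id)] =
        -std::numeric_limits<float>::infinity();
  });

  return invalid_token.load() ? SketchStatus::kInvalidArgument : SketchStatus::kOk;
}
```

The flag only needs to answer "did any worker see a bad token", so a relaxed store is sufficient in this sketch; the caller reads it once all rows have been processed.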

Tests

Two regression tests: a negative token_id and a token_id >= vocab_size. Both run on the CPU EP only; the CUDA EP is excluded to avoid debug assert context corruption.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI left a comment

Pull request overview

This PR hardens the NGramRepeatBlock operator (CPU and CUDA EP) against out-of-bounds writes by validating token_id values derived from input_ids before using them as indices into the scores buffer, addressing a security issue caused by negative or oversized token IDs.

Changes:

  • CPU: replace ORT_ENFORCE(token_id < vocab_size) with a full [0, vocab_size) check and return INVALID_ARGUMENT instead of aborting.
  • CUDA: add CUDA_KERNEL_ASSERT plus a runtime bounds guard to avoid invalid indexing in release builds.
  • Tests: add CPU-only regression tests covering negative and oversized token_id cases.
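
The "assert plus guard" idea on the CUDA side can be illustrated with a host-side C++ analogue. This is a sketch under assumptions, not the kernel code: `SKETCH_DEBUG_ASSERT` and `mask_token_guarded` are hypothetical, with the macro standing in for `CUDA_KERNEL_ASSERT` (active only when `SKETCH_DEBUG` is defined). The assert flags the bug loudly in debug builds, while the guard keeps release builds from performing the out-of-range write:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical debug-only assert: compiled out unless SKETCH_DEBUG is defined.
#ifdef SKETCH_DEBUG
#define SKETCH_DEBUG_ASSERT(cond) assert(cond)
#else
#define SKETCH_DEBUG_ASSERT(cond) ((void)0)
#endif

// Masks one token in a single row of scores. Returns false (and writes nothing)
// when token_id falls outside [0, vocab_size).
inline bool mask_token_guarded(float* row_scores, int64_t vocab_size, int64_t token_id) {
  const bool in_range = token_id >= 0 && token_id < vocab_size;
  SKETCH_DEBUG_ASSERT(in_range);  // debug builds stop here on bad input
  if (!in_range) {
    return false;  // release builds skip the write instead of corrupting memory
  }
  row_scores[token_id] = -std::numeric_limits<float>::infinity();
  return true;
}
```

In this analogue the guard is what provides safety; the assert exists only to surface the problem during development.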

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| onnxruntime/contrib_ops/cpu/bert/ngram_repeat_block.h | Adds bounds validation for token_id and returns an INVALID_ARGUMENT Status on invalid input. |
| onnxruntime/contrib_ops/cuda/bert/ngram_repeat_block_impl.cu | Adds debug assertion + runtime guard to prevent OOB write in the CUDA kernel. |
| onnxruntime/test/contrib_ops/ngram_repeat_block_op_test.cc | Adds regression tests for invalid token_id values on CPU EP. |


Comment thread onnxruntime/contrib_ops/cpu/bert/ngram_repeat_block.h
tianleiwu previously approved these changes Apr 13, 2026

@tianleiwu tianleiwu left a comment

Clean, well-scoped security fix. The CPU path correctly replaces the crashing ORT_ENFORCE (missing lower-bound check, abort() under ORT_NO_EXCEPTIONS) with a proper INVALID_ARGUMENT Status return using thread-safe atomic error collection. The CUDA path uses the standard ORT dual pattern of CUDA_KERNEL_ASSERT + conditional guard. Regression tests cover both failure modes (negative token_id and token_id ≥ vocab_size).

One minor suggestion on the CUDA side (inline comment below) — non-blocking.

Comment thread onnxruntime/contrib_ops/cuda/bert/ngram_repeat_block_impl.cu
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@vraspar vraspar requested a review from tianleiwu April 13, 2026 20:37
@vraspar vraspar enabled auto-merge (squash) April 13, 2026 20:38
@vraspar vraspar merged commit 5192b90 into main Apr 16, 2026
104 of 110 checks passed
@vraspar vraspar deleted the vraspar/fix-ngram-repeat-block-oob-write branch April 16, 2026 19:34