Validate g_idx values in MatMulNBits to prevent OOB read #27582
Pull request overview
Adds input validation for the deprecated g_idx (group index) input to MatMulNBits to prevent out-of-bounds reads when it is used to index into per-block scales/zero_points, and adds regression tests to ensure invalid indices are rejected.
Changes:
- Add range validation for `group_index` values in `matmul_nbits_helper::CheckInputs` (`[0, k_blocks)`).
- Add unit tests that expect `INVALID_ARGUMENT` on negative and out-of-range `g_idx` values.
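The range check described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `ValidateGroupIndex` and its signature are hypothetical, and the real `CheckInputs` reports failures through onnxruntime's own status type rather than a `bool` plus message.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not the onnxruntime API): returns true when every
// group index lies in [0, k_blocks). On failure it returns false and, if
// `error` is non-null, fills it with a message naming the first bad entry.
bool ValidateGroupIndex(const std::vector<int32_t>& g_idx, int64_t k_blocks,
                        std::string* error) {
  for (std::size_t i = 0; i < g_idx.size(); ++i) {
    const int64_t v = g_idx[i];
    if (v < 0 || v >= k_blocks) {
      if (error != nullptr) {
        *error = "g_idx[" + std::to_string(i) + "] = " + std::to_string(v) +
                 " is outside [0, " + std::to_string(k_blocks) + ")";
      }
      return false;
    }
  }
  return true;
}
```

Running the check once on the host, before any kernel reads `scales` or `zero_points`, lets a malformed model be rejected up front with a readable error instead of triggering an out-of-bounds read.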
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| onnxruntime/contrib_ops/cpu/quantization/matmul_nbits_helper.h | Adds g_idx value-range validation in shared input checking used by CPU and CUDA MatMulNBits implementations. |
| onnxruntime/test/contrib_ops/matmul_4bits_test.cc | Adds two negative tests to verify invalid g_idx values are rejected. |
…pe and update test for out-of-range g_idx values
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
…date tests for out-of-range g_idx values
tianleiwu left a comment
Thanks for addressing this OOB read vulnerability — the CPU-side validation logic is well-structured with a clear error message. However, the CUDA EP path still has a gap in release builds.
See inline comments for details.
- Add rid clamping after CUDA_KERNEL_ASSERT in Dequantize4BitsKernelReOrder to prevent OOB memory access in release builds where the assert is a no-op
- Remove unnecessary #ifdef NDEBUG guard around InvalidGIdx tests since CUDA EP is already excluded via OpTester::Run() parameters

Addresses review feedback from tianleiwu.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
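The assert-then-clamp pattern from the commit message can be illustrated with a host-side sketch. This is not the kernel code from the PR: `ClampIndex` and `LookupScale` are hypothetical names, the standard `assert` stands in for `CUDA_KERNEL_ASSERT`, and `scales` is assumed to be non-empty.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: forces idx into [0, count - 1]. Assumes count >= 1.
inline int32_t ClampIndex(int32_t idx, int32_t count) {
  return std::max<int32_t>(0, std::min<int32_t>(idx, count - 1));
}

// Host-side stand-in for a per-block scale lookup. Assumes `scales` is
// non-empty; assert() plays the role of CUDA_KERNEL_ASSERT here.
float LookupScale(const std::vector<float>& scales, int32_t rid) {
  const int32_t k_blocks = static_cast<int32_t>(scales.size());
  // Catches bad indices in debug builds; compiled out when NDEBUG is defined.
  assert(rid >= 0 && rid < k_blocks);
  // Clamp so a bad index cannot reach memory outside `scales` even when the
  // assert above is a no-op.
  rid = ClampIndex(rid, k_blocks);
  return scales[static_cast<std::size_t>(rid)];
}
```

Clamping trades a possibly wrong numeric result for memory safety. It complements host-side validation and does not replace it, since a clamped lookup silently reads a different block's scale.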
Description
In `Dequantize4BitsKernelReOrder` (CPU and CUDA EP), values from the `g_idx` tensor are used directly as array indices into the `scales` and `zero_points` buffers without bounds checking. This PR adds value-range validation and tests for the `g_idx` input tensor in the `MatMulNBits` operator.
Motivation and Context