Fix -Warray-bounds build error in MLAS on clang 17+ #27499
Merged
…y-bounds in MLAS
The HAS_ARRAY_BOUNDS macro is defined in onnxruntime_config.h which is never
included by the MLAS compilation unit. Replace with clang's built-in
__has_warning("-Warray-bounds") so the pragma actually takes effect and
suppresses the false positive array-bounds error on clang 17+.
Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>
Copilot
AI
changed the title
[WIP] Fix build error for array index out of bounds
Fix -Warray-bounds build error in MLAS on clang 17+
Feb 28, 2026
Contributor
@copilot, please fix the build errors in the CI pipeline, like the one in gcc 14:
…ocessor error

__has_warning is a clang-specific built-in that GCC doesn't recognize. GCC has to parse the whole #if expression before it evaluates any of it, so it rejects __has_warning("-Warray-bounds") as a syntax error even with defined(__clang__) && guarding it. Since -Warray-bounds is supported by all clang versions, a simple #ifdef __clang__ is sufficient.

Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>
Contributor
Author
Fixed in |
Contributor
/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline
tianleiwu
approved these changes
Mar 1, 2026
Azure Pipelines successfully started running 4 pipeline(s).
hariharans29
approved these changes
Mar 2, 2026
tianleiwu
added a commit
that referenced
this pull request
Mar 5, 2026
### Description
The `-Warray-bounds` suppression pragma in
`sqnbitgemm_kernel_avx2_int8_blklen32.h` was gated on
`defined(HAS_ARRAY_BOUNDS)`, which is set in `onnxruntime_config.h`.
MLAS never includes that header, so the guard was dead code and the
pragma never fired.
Changed the guard to `#ifdef __clang__`:
```cpp
// Before: HAS_ARRAY_BOUNDS never defined in MLAS TU
#if defined(__clang__) && defined(HAS_ARRAY_BOUNDS)
// After
#ifdef __clang__
```
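For context, a minimal sketch of how a guard like this typically wraps the affected code, assuming the suppression uses clang's usual `push` / `ignored` / `pop` pragma sequence (the exact pragma lines in the header are not shown here). `SumLanes` is a hypothetical helper for illustration, not MLAS code.

```cpp
#include <cassert>
#include <cstddef>

// Only clang sees the pragmas; other compilers skip the guarded lines entirely.
#ifdef __clang__
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Warray-bounds"
#endif

// Hypothetical helper (not from MLAS): sums the first `count` lanes of a
// 4-element accumulator array.
inline float SumLanes(const float (&acc)[4], std::size_t count) {
    assert(count <= 4);  // the caller must stay within the array
    float total = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        total += acc[i];
    }
    return total;
}

// Restore the previous diagnostic state so the suppression stays local.
#ifdef __clang__
#pragma clang diagnostic pop
#endif
```

Keeping the `pop` right after the affected code limits the suppression to that region, so genuine out-of-bounds accesses elsewhere in the translation unit would still be reported.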
Note: `__has_warning("-Warray-bounds")` was considered, but GCC has to
parse the whole `#if` expression before evaluating it, so it rejects the
unknown `__has_warning(...)` call as a syntax error even behind
`defined(__clang__) &&`.
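If a `__has_warning` check were ever wanted, one portable shape is to nest it inside a separate `#if defined(__clang__)` block, because directives in a skipped conditional group are not evaluated. The sketch below is an illustration only and is not what this PR does; `DEMO_SUPPRESSES_ARRAY_BOUNDS` and `SuppressesArrayBounds` are hypothetical names.

```cpp
#include <cassert>  // only needed for the assert-based check in the usage note

// The inner #if is reached only by clang. On GCC or MSVC the outer group is
// skipped, so __has_warning(...) is never parsed as an expression.
#if defined(__clang__)
#  if __has_warning("-Warray-bounds")
#    define DEMO_SUPPRESSES_ARRAY_BOUNDS 1
#  endif
#endif

// Fallback for compilers that are not clang or that lack the warning.
#ifndef DEMO_SUPPRESSES_ARRAY_BOUNDS
#  define DEMO_SUPPRESSES_ARRAY_BOUNDS 0
#endif

// Hypothetical query function so the result can be inspected from code.
constexpr bool SuppressesArrayBounds() {
    return DEMO_SUPPRESSES_ARRAY_BOUNDS != 0;
}
```

A caller could then check `assert(SuppressesArrayBounds() == (DEMO_SUPPRESSES_ARRAY_BOUNDS == 1));`. The plain `#ifdef __clang__` guard chosen in this PR is simpler and avoids the extra nesting.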
### Motivation and Context
Build fails on Intel Mac with Apple Clang 17.0.0
(`-Werror,-Warray-bounds`). Clang raises a false-positive array-bounds
warning on `acc[4..7]` inside an `if constexpr (NCols4 == 8)` branch
that is dead when `NCols4 == 4`.
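The shape of the code involved can be sketched as follows. This is a simplified stand-in under stated assumptions: `float` replaces `__m256`, and `FoldSum` and `ReduceAccumulators` are hypothetical names standing in for `FoldAccumulators` and the real kernel, so the sketch builds without AVX2. It is not claimed to reproduce the diagnostic itself, only to show why the `acc[4..7]` accesses are unreachable when `NCols4 == 4`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for FoldAccumulators: folds four lanes into one value.
inline float FoldSum(float a, float b, float c, float d) {
    return a + b + c + d;
}

template <std::size_t NCols4>
float ReduceAccumulators() {
    static_assert(NCols4 == 4 || NCols4 == 8, "sketch supports 4 or 8 columns");

    float acc[NCols4];
    for (std::size_t i = 0; i < NCols4; ++i) {
        acc[i] = static_cast<float>(i + 1);  // fill with 1, 2, 3, ...
    }

    float total = FoldSum(acc[0], acc[1], acc[2], acc[3]);
    if constexpr (NCols4 == 8) {
        // Discarded for the NCols4 == 4 instantiation, so indices 4..7 are
        // never evaluated against the 4-element array.
        total += FoldSum(acc[4], acc[5], acc[6], acc[7]);
    }
    return total;
}
```

Calling `ReduceAccumulators<4>()` only touches `acc[0..3]`, while `ReduceAccumulators<8>()` also folds the upper four lanes.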
<!-- START COPILOT ORIGINAL PROMPT -->
<details>
<summary>Original prompt</summary>
>
> ----
>
> *This section details on the original issue you should resolve*
>
> <issue_title>[Build] error: array index 4 is past the end of the array
(that has type '__m256[4]') [-Werror,-Warray-bounds]</issue_title>
> <issue_description>### Describe the issue
>
> Unable to build from main branch
(0768f42 as of time writing this issue)
on Intel Mac
>
> ```
> /usr/bin/c++ --version
> Apple clang version 17.0.0 (clang-1700.0.13.5)
> Target: x86_64-apple-darwin24.5.0
> Thread model: posix
> InstalledDir: /Library/Developer/CommandLineTools/usr/bin
> ```
>
>
> ### Urgency
>
> _No response_
>
> ### Target platform
>
> MacOS
>
> ### Build script
>
> ./build.sh --config RelWithDebInfo --build_shared_lib --parallel
--cmake_extra_defines CMAKE_OSX_ARCHITECTURES=x86_64
>
> ### Error / output
>
> [ 18%] Building CXX object CMakeFiles/onnxruntime_mlas.dir/onnxruntime/onnxruntime/core/mlas/lib/sqnbitgemm_kernel_avx2.cpp.o
> In file included from /onnxruntime/onnxruntime/core/mlas/lib/sqnbitgemm_kernel_avx2.cpp:26:
> /onnxruntime/onnxruntime/core/mlas/lib/sqnbitgemm_kernel_avx2_int8_blklen32.h:1677:49: error: array index 4 is past the end of the array (that has type '__m256[4]') [-Werror,-Warray-bounds]
> 1677 | __m128 acc_1 = FoldAccumulators(acc[4], acc[5], acc[6], acc[7]);
> | ^ ~
> /onnxruntime/onnxruntime/core/mlas/lib/sqnbitgemm_kernel_avx2_int8_blklen32.h:1531:13: note: array 'acc' declared here
> 1531 | __m256 acc[NCols4];
> | ^
> /onnxruntime/onnxruntime/core/mlas/lib/sqnbitgemm_kernel_avx2_int8_blklen32.h:1677:57: error: array index 5 is past the end of the array (that has type '__m256[4]') [-Werror,-Warray-bounds]
> 1677 | __m128 acc_1 = FoldAccumulators(acc[4], acc[5], acc[6], acc[7]);
> | ^ ~
> /onnxruntime/onnxruntime/core/mlas/lib/sqnbitgemm_kernel_avx2_int8_blklen32.h:1531:13: note: array 'acc' declared here
> 1531 | __m256 acc[NCols4];
> | ^
> /onnxruntime/onnxruntime/core/mlas/lib/sqnbitgemm_kernel_avx2_int8_blklen32.h:1677:65: error: array index 6 is past the end of the array (that has type '__m256[4]') [-Werror,-Warray-bounds]
> 1677 | __m128 acc_1 = FoldAccumulators(acc[4], acc[5], acc[6], acc[7]);
> | ^ ~
> /onnxruntime/onnxruntime/core/mlas/lib/sqnbitgemm_kernel_avx2_int8_blklen32.h:1531:13: note: array 'acc' declared here
> 1531 | __m256 acc[NCols4];
> | ^
> /onnxruntime/onnxruntime/core/mlas/lib/sqnbitgemm_kernel_avx2_int8_blklen32.h:1677:73: error: array index 7 is past the end of the array (that has type '__m256[4]') [-Werror,-Warray-bounds]
> 1677 | __m128 acc_1 = FoldAccumulators(acc[4], acc[5], acc[6], acc[7]);
> | ^ ~
> /onnxruntime/onnxruntime/core/mlas/lib/sqnbitgemm_kernel_avx2_int8_blklen32.h:1531:13: note: array 'acc' declared here
> 1531 | __m256 acc[NCols4];
> | ^
> 4 errors generated.
>
> ### Visual Studio Version
>
> _No response_
>
> ### GCC / Compiler Version
>
> Apple clang version 17.0.0 (clang-1700.0.13.5)</issue_description>
>
> <agent_instructions>Please investigate the build error. If code need
fix, create a pull requests. Otherwise, suggest ways to avoid the build
errors.</agent_instructions>
>
> ## Comments on the Issue (you are @copilot in this section)
>
> <comments>
> </comments>
>
</details>
<!-- START COPILOT CODING AGENT SUFFIX -->
- Fixes #27497
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>
tianleiwu
added a commit
that referenced
this pull request
Mar 5, 2026
This cherry-picks the following commits for the release:

| Commit ID | PR Number | Commit Title |
|-----------|-----------|-------------|
| d5387d8 | #27192 | Avoid repetitive creation of fp4/fp8 native-custom-op domains for NvTensorRtRtx EP |
| 0b9906a | #27454 | Suppress spurious Array Out of Bounds warnings produced by GCC 14.2 compiler on Linux builds |
| 4a80b0b | #27471 | Fix double-free in TRT EP custom op domain Release functions |
| c7c939f | #27499 | Fix -Warray-bounds build error in MLAS on clang 17+ |
| f99dcca | #27514 | [Build] Fix pybind11 vcpkg configuration |
| ef04b10 | #27518 | [CXX Lora] Prevent heap OOB from maliciously crafted Lora Adapters. |
| 0b2b6d0 | #27288 | [NvTensorRTRTX EP]: Add missing override specifiers to suppress warnings |
| c1d8f5c | #27522 | Add "library_path" metadata entry to OrtEpDevice instances for plugin and provider bridge EPs |
| fdead1c | #27537 | Account for ORT_NO_EXCEPTIONS builds in Lora test |
| 3d1365e | #27521 | increase kMaxValueLength to 8192 |
| df8f4a7 | #27535 | Add OrtEnv.DisableDllImportResolver to prevent fatal error on resolver conflict |
| bdd672a | #27356 | Add/Update telemetry events |
| 2da1a30 | #27543 | Fix RoiAlign heap out-of-bounds read via unchecked batch_indices |
| 5c3f544 | #27466 | DQ→MatMulNBits fusion transformer for NvTensorRtRtx ep |

---------
Co-authored-by: vishalpandya1990 <vipandya@nvidia.com>
Co-authored-by: Vishnudas Thaniel S <vishnudas.thaniel.s@intel.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Dmitri Smirnov <yuslepukhin@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Stephan Seitz <sseitz@nvidia.com>
Co-authored-by: Scott McKay <skottmckay@gmail.com>
Co-authored-by: xieofxie <xieofxie@126.com>
Co-authored-by: hualxie <hualxie@microsoft.com>
Co-authored-by: Darshak Bhatti <47045043+dabhattimsft@users.noreply.github.com>
Co-authored-by: Darshak Bhatti <dabhatti@micorsoft.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: anujj <ajalota@nvidia.com>
Co-authored-by: praneshgo <227579474+praneshgo@users.noreply.github.com>