MatmulNBits prepacking scales fix by hariharans29 · Pull Request #27412 · microsoft/onnxruntime

hariharans29 · 2026-02-21T02:40:27Z

Description

Fix the incorrect scales element count used when pre-packing scales while processing the B input in the PrePack() method of the MatMulNBits operator.

Motivation and Context

Fix potential crash due to incorrect element count

Copilot

Pull request overview

This PR fixes a critical bug in the MatMulNBits operator's PrePack method for the MLFloat16 specialization. The bug occurs when prepacking the B input tensor (weights) and scales need to be converted from MLFloat16 to float32. The code was incorrectly using the B tensor's size instead of the scales tensor's size for buffer allocation and conversion, which could lead to buffer overruns or underruns depending on the relative sizes of B and scales tensors.

Changes:

Fixed incorrect size calculation when prepacking scales for MLFloat16 MatMulNBits operator
Changed from using B tensor size to scales tensor size for scales conversion buffer allocation
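The size mismatch described above can be illustrated with a small standalone sketch. This is not the ONNX Runtime code: `Half`, `HalfToFloat`, `PackedBBytes`, `ScalesElementCount`, and `ConvertScalesToFloat` are hypothetical names introduced here, and the shapes assume the documented MatMulNBits layout (B packed as N x blocks-per-column x blob bytes, one scale per block per column). The point is only that the float32 conversion buffer should be sized from the scales tensor, not from B.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a half-precision value: raw IEEE 754 binary16 bits.
using Half = std::uint16_t;

// Converts binary16 bits to float (handles zero, subnormals, normals, inf/NaN).
inline float HalfToFloat(Half h) {
  const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
  const std::uint32_t exp = (h >> 10) & 0x1Fu;
  std::uint32_t mant = h & 0x3FFu;
  std::uint32_t bits = 0;
  if (exp == 0) {
    if (mant == 0) {
      bits = sign;  // signed zero
    } else {
      int shift = -1;  // normalize the subnormal mantissa
      do {
        mant <<= 1;
        ++shift;
      } while ((mant & 0x400u) == 0);
      bits = sign | (static_cast<std::uint32_t>(112 - shift) << 23) | ((mant & 0x3FFu) << 13);
    }
  } else if (exp == 0x1Fu) {
    bits = sign | 0x7F800000u | (mant << 13);  // inf or NaN
  } else {
    bits = sign | ((exp + 112u) << 23) | (mant << 13);  // rebias 15 -> 127
  }
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}

// Number of quantization blocks along K for one output column.
inline std::size_t BlocksPerCol(std::size_t K, std::size_t block_size) {
  assert(block_size > 0);
  return (K + block_size - 1) / block_size;
}

// Byte size of the packed B tensor (assumed layout: N x blocks x blob bytes).
inline std::size_t PackedBBytes(std::size_t N, std::size_t K, std::size_t block_size,
                                std::size_t bits) {
  return N * BlocksPerCol(K, block_size) * (block_size * bits / 8);
}

// Element count of the scales tensor: one scale per block per column.
inline std::size_t ScalesElementCount(std::size_t N, std::size_t K, std::size_t block_size) {
  return N * BlocksPerCol(K, block_size);
}

// Converts fp16 scales to float32. The output is sized from the scales
// themselves; sizing it from B would read or write the wrong number of elements.
inline std::vector<float> ConvertScalesToFloat(const std::vector<Half>& scales) {
  std::vector<float> out(scales.size());
  for (std::size_t i = 0; i < scales.size(); ++i) {
    out[i] = HalfToFloat(scales[i]);
  }
  return out;
}
```

For example, with N = 2, K = 64, a block size of 32, and 4-bit weights, the packed B tensor occupies 64 bytes while there are only 4 scales, so a conversion loop driven by B's size would run well past the end of the scales buffer.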

Copilot

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated no new comments.

github-actions

You can commit the suggested changes from lintrunner.

onnxruntime/test/contrib_ops/matmul_4bits_test.cc

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

### Description

Fix the incorrect scales element count used when pre-packing scales while processing the B input in the PrePack() method of the MatMulNBits operator.

### Motivation and Context

Fix potential crash due to incorrect element count

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

This cherry-picks the following commits for the release:

| Commit ID | PR Number | Commit Title |
|-----------|-----------|-------------|
| decd177 | #27090 | Fix GatherND division by zero when batch dimensions mismatch |
| 55f8234 | #27360 | Fix QMoE CPU Operator |
| df9146f | #27403 | [MLAS] Adding DynamicQGemm function pointers and ukernel interface |
| 0f93853 | #27318 | [js/web] Use embedded WASM module in Blob URL workers when wasmBinary is provided |
| b2a6e69 | #27364 | QMoE CPU Performance Update (Up to 4x on 4-bit) |
| f501e1d | #27413 | Fix refcount bug in map input conversion that caused shutdown segfault |
| b32b205 | #27421 | Fix error where bytes is not assigned for dynamic qgemm pack b size |
| 426b006 | #27397 | Fix DllImportResolver |
| 0982844 | #27412 | MatmulNBits prepacking scales fix |
| 9afb0d2 | #27430 | Fix validation for external data paths for models loaded from bytes |
| 71d2cd0 | #27401 | Enable Python 3.14 CI and Upgrade Dependencies |
| 79e0676 | #27419 | fix: out of bounds access for resize operation |
| 82eb99c | #27459 | Fix SkipLayerNorm fusion incorrectly applied when gamma/beta are not 1D |
| 355278a | #27444 | Fix GatherCopyData Integer Truncation Leading to Heap Out-of-Bounds Read/Write |
| cf96123 | #27411 | [web] fix usage of wasmBinary together with a blob URL for .mjs |
| 1131a86 | #27399 | [web] remove the unhelpful "Unknown CPU vendor" warning. |
| ffbbc4f | #27316 | Build Windows ARM64X binaries as part of packaging pipeline |

---------

Signed-off-by: Jonathan Clohessy <Jonathan.Clohessy@arm.com>
Co-authored-by: patryk-kaiser-ARM <patryk.kaiser@arm.com>
Co-authored-by: don <70039285+0-don@users.noreply.github.com>
Co-authored-by: Jonathan Clohessy <jonathan.clohessy@arm.com>
Co-authored-by: Hariharan Seshadri <shariharan91@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com>
Co-authored-by: Lukas Folle <126877803+lukas-folle-snkeos@users.noreply.github.com>
Co-authored-by: Chi Lo <54722500+chilo-ms@users.noreply.github.com>
Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
Co-authored-by: Chaya <cha182350@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Erik <erscor@microsoft.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>

Fix incorrect scales element count calculation while pre-packing scal…

e579d41

…es processing the B input

hariharans29 requested a review from Copilot February 21, 2026 02:40

Copilot started reviewing on behalf of hariharans29 February 21, 2026 02:41

Copilot AI reviewed Feb 21, 2026

Add test

34ce775

hariharans29 requested a review from Copilot February 21, 2026 02:56

Copilot started reviewing on behalf of hariharans29 February 21, 2026 02:57

Copilot AI reviewed Feb 21, 2026

guschmue previously approved these changes Feb 23, 2026

tianleiwu previously approved these changes Feb 23, 2026

Add test

9d40b7d

hariharans29 dismissed stale reviews from tianleiwu and guschmue via 9d40b7d February 23, 2026 18:42

hariharans29 added 2 commits February 23, 2026 10:53

Add MatmulNBits unit test and remove shared lib test

51526a8

Add comment

c4e3e49

guschmue previously approved these changes Feb 23, 2026

tianleiwu previously approved these changes Feb 23, 2026

hariharans29 added the release:1.24.3 label Feb 23, 2026

Add test for fp16

6f5da27

hariharans29 dismissed stale reviews from tianleiwu and guschmue via 6f5da27 February 23, 2026 19:36

hariharans29 requested a review from Copilot February 23, 2026 19:39

Copilot started reviewing on behalf of hariharans29 February 23, 2026 19:41

github-actions bot reviewed Feb 23, 2026

onnxruntime/test/contrib_ops/matmul_4bits_test.cc

Copilot AI reviewed Feb 23, 2026

hariharans29 and others added 5 commits February 23, 2026 11:43

Update onnxruntime/test/contrib_ops/matmul_4bits_test.cc

c101131

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Experiment

972a403

More fixes

26114e7

Experiment - 2

8b9f0bb

Update tolerance

2559293

hariharans29 requested review from guschmue and tianleiwu February 24, 2026 00:08

hariharans29 enabled auto-merge (squash) February 24, 2026 00:08

guschmue approved these changes Feb 24, 2026

hariharans29 merged commit 0982844 into main Feb 24, 2026
90 checks passed

hariharans29 deleted the hari/scales_prepack branch February 24, 2026 16:17

tianleiwu mentioned this pull request Feb 27, 2026

ORT 1.24.3 release cherry pick round 1 #27476

Merged

tianleiwu removed the release:1.24.3 label Feb 28, 2026

This was referenced Mar 9, 2026

Bump Microsoft.ML.OnnxRuntime.Gpu from 1.23.2 to 1.24.3 yuniko-software/bge-m3-onnx#66

Closed

deps(nuget): Bump the microsoft-packages group with 2 updates Ellerbach/azure-ai-search-simulator#73

Closed

dependabot bot mentioned this pull request Mar 16, 2026

deps(nuget): Bump the microsoft-packages group with 8 updates Ellerbach/azure-ai-search-simulator#76

Closed

tianleiwu mentioned this pull request Mar 16, 2026

ORT 1.24.4 release cherry pick round 1 #27682

Merged

dependabot bot mentioned this pull request Apr 6, 2026

deps(nuget): Bump Microsoft.AspNetCore.Authentication.JwtBearer and 10 others Ellerbach/azure-ai-search-simulator#92

Open

MatmulNBits prepacking scales fix #27412

hariharans29 merged 11 commits into main from hari/scales_prepack

4 participants
