Add INT8, INT16, and UINT8 type support for CPU TopK operator#27860
tianleiwu merged 4 commits into microsoft:main from
Conversation
The ONNX specification (opset 11+) lists INT8, INT16, and UINT8 as valid input types for the TopK operator, but ONNX Runtime's CPU execution provider only registered `float`, `double`, `int32`, and `int64` kernels. This causes a `NOT_IMPLEMENTED` error when running models that produce TopK nodes with these smaller integer types (e.g., PP-DocLayoutV2 exported via `torch.onnx.export(dynamo=True)`). This commit adds kernel registrations and template specializations for `int8_t`, `int16_t`, and `uint8_t` in opset 11-23 and opset 24, along with unit tests covering largest, smallest, negative values, explicit axis, and opset 24 scenarios.
@microsoft-github-policy-service agree
Add type constraints and dispatch cases for int8_t, int16_t, and uint8_t in the CUDA TopK kernel (opset 1-9, 10, 11-23, 24), along with three new .cu template instantiation files. This is the CUDA counterpart to the CPU support added in microsoft#27860. Fixes microsoft#27859
Hi @tianleiwu this is the CPU counterpart of #27862 — adds int8/int16/uint8 support for the CPU TopK kernel (opset 11+). Would appreciate a review when you get a chance. Thanks!
Pull request overview
This PR extends the CPU execution provider’s TopK operator to support additional integer input types (int8_t, int16_t, uint8_t) that are valid per ONNX opset 11+, eliminating NOT_IMPLEMENTED failures for models that produce TopK nodes with these types.
Changes:
- Add `TopK` template specializations and kernel registrations for `int8_t`, `int16_t`, and `uint8_t` for opset 11–23 and opset 24.
- Register the new typed kernels with the CPU execution provider kernel registry.
- Add unit tests covering the new types across largest/smallest, negative values, explicit axis, and opset 24 cases.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `onnxruntime/core/providers/cpu/math/top_k.cc` | Adds opset 11–23 and opset 24 template specializations and kernel registrations for `int8_t`, `int16_t`, `uint8_t`. |
| `onnxruntime/core/providers/cpu/cpu_execution_provider.cc` | Adds forward declarations and `BuildKernelCreateInfo` entries so the CPU EP registers the new typed TopK kernels. |
| `onnxruntime/test/providers/cpu/math/topk_op_test.cc` | Adds new test coverage for `int8_t`, `int16_t`, and `uint8_t` inputs (including opset 24 cases). |
/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline

Azure Pipelines successfully started running 4 pipeline(s).
Force-pushed d81d5e4 to 3e65d87
/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline, Web CI Pipeline

Azure Pipelines successfully started running 4 pipeline(s).
Add type constraints and dispatch cases for `int8_t`, `int16_t`, and `uint8_t` in the CUDA TopK kernel (opset 1-9, 10, 11-23, 24), along with three new `.cu` template instantiation files. This is the CUDA counterpart to the CPU support added in #27860. Fixes #27859

### Description

Add CUDA kernel type dispatch and template specializations for `int8_t`, `int16_t`, and `uint8_t` types in the CUDA TopK operator.

**Changed files:**
- `onnxruntime/core/providers/cuda/math/topk.cc` — type constraints + dispatch cases for int8/int16/uint8
- `onnxruntime/core/providers/cuda/math/topk_impl_i8.cu` — **new** template instantiation for int8_t
- `onnxruntime/core/providers/cuda/math/topk_impl_u8.cu` — **new** template instantiation for uint8_t
- `onnxruntime/core/providers/cuda/math/topk_impl_i16.cu` — **new** template instantiation for int16_t

### Motivation and Context

This is the CUDA counterpart to #27860 (CPU TopK INT8/INT16/UINT8 support). The [ONNX specification (opset 11+)](https://github.com/onnx/onnx/blob/main/docs/Operators.md#TopK) lists `INT8`, `INT16`, and `UINT8` as valid input types for the TopK operator. After #27860 added CPU support, the CUDA execution provider still lacked kernels for these types, causing models to fall back to CPU or fail when using `CUDAExecutionProvider`.

The existing CUDA TopK implementation uses a split-compilation pattern (one `.cu` file per type) with `ToCudaType<T>` mapping. Since the default template maps integer types to themselves and `NumericLimits<T>` uses `std::numeric_limits<T>`, no algorithmic changes were needed — only:

1. Adding type constraints to kernel registrations (all opset versions)
2. Adding dispatch cases in `ComputeInternal`
3. Creating three new `.cu` files for template instantiation

All 64 TopK tests pass (including 8 tests for the new types, running on both CPU and CUDA providers).

### Test Results

```
[==========] Running 64 tests from 1 test suite.
...
[ RUN      ] TopKOperator.TopK_Int8
[       OK ] TopKOperator.TopK_Int8 (21 ms)
[ RUN      ] TopKOperator.TopK_Int8_Negative
[       OK ] TopKOperator.TopK_Int8_Negative (21 ms)
[ RUN      ] TopKOperator.TopK_Int8_Smallest
[       OK ] TopKOperator.TopK_Int8_Smallest (21 ms)
[ RUN      ] TopKOperator.TopK_Int16
[       OK ] TopKOperator.TopK_Int16 (21 ms)
[ RUN      ] TopKOperator.TopK_Uint8
[       OK ] TopKOperator.TopK_Uint8 (21 ms)
[ RUN      ] TopKOperator.TopK_Int8_ExplicitAxis
[       OK ] TopKOperator.TopK_Int8_ExplicitAxis (21 ms)
[ RUN      ] TopKOperator.TopK_Int8_Opset24
[       OK ] TopKOperator.TopK_Int8_Opset24 (21 ms)
[ RUN      ] TopKOperator.TopK_Uint8_Opset24
[       OK ] TopKOperator.TopK_Uint8_Opset24 (21 ms)
...
[  PASSED  ] 64 tests.
```
### Description

Add kernel registrations and template specializations for `int8_t`, `int16_t`, and `uint8_t` types in the CPU TopK operator (opset 11-23 and opset 24).

Fixes #27859
**Changed files:**
- `onnxruntime/core/providers/cpu/math/top_k.cc` — template specializations + kernel registrations
- `onnxruntime/core/providers/cpu/cpu_execution_provider.cc` — forward declarations + `BuildKernelCreateInfo` entries
- `onnxruntime/test/providers/cpu/math/topk_op_test.cc` — 8 new test cases (largest, smallest, negative values, explicit axis, opset 24)

### Motivation and Context
The ONNX specification (opset 11+) lists `INT8`, `INT16`, and `UINT8` as valid input types for the TopK operator. However, ONNX Runtime's CPU execution provider only had kernels registered for `float`, `double`, `int32`, and `int64`, causing a `NOT_IMPLEMENTED` error.

This issue is encountered when running models like PP-DocLayoutV2 exported via `torch.onnx.export(dynamo=True)`, which produces a `Cast(bool → INT8) → TopK` pattern.

The existing CPU TopK implementation is fully template-based, so no algorithmic changes were needed — only kernel registration and template instantiation for the missing types. All 64 TopK tests pass (including 8 new tests for the added types).