Add INT8, INT16, and UINT8 type support for CPU TopK operator#27860
tianleiwu merged 4 commits into microsoft:main from
Conversation
The ONNX specification (opset 11+) lists INT8, INT16, and UINT8 as valid input types for the TopK operator, but ONNX Runtime's CPU execution provider only registered `float`, `double`, `int32`, and `int64` kernels. This causes a `NOT_IMPLEMENTED` error when running models that produce TopK nodes with these smaller integer types (e.g., PP-DocLayoutV2 exported via `torch.onnx.export(dynamo=True)`). This commit adds kernel registrations and template specializations for `int8_t`, `int16_t`, and `uint8_t` in opset 11-23 and opset 24, along with unit tests covering largest, smallest, negative values, explicit axis, and opset 24 scenarios.
@microsoft-github-policy-service agree
Add type constraints and dispatch cases for int8_t, int16_t, and uint8_t in the CUDA TopK kernel (opset 1-9, 10, 11-23, 24), along with three new .cu template instantiation files. This is the CUDA counterpart to the CPU support added in microsoft#27860. Fixes microsoft#27859
Hi @tianleiwu this is the CPU counterpart of #27862 — adds int8/int16/uint8 support for the CPU TopK kernel (opset 11+). Would appreciate a review when you get a chance. Thanks!
Pull request overview
This PR extends the CPU execution provider’s TopK operator to support additional integer input types (int8_t, int16_t, uint8_t) that are valid per ONNX opset 11+, eliminating NOT_IMPLEMENTED failures for models that produce TopK nodes with these types.
Changes:
- Add `TopK` template specializations and kernel registrations for `int8_t`, `int16_t`, and `uint8_t` for opset 11–23 and opset 24.
- Register the new typed kernels with the CPU execution provider kernel registry.
- Add unit tests covering the new types across largest/smallest, negative values, explicit axis, and opset 24 cases.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `onnxruntime/core/providers/cpu/math/top_k.cc` | Adds opset 11–23 and opset 24 template specializations and kernel registrations for `int8_t`, `int16_t`, `uint8_t`. |
| `onnxruntime/core/providers/cpu/cpu_execution_provider.cc` | Adds forward declarations and `BuildKernelCreateInfo` entries so the CPU EP registers the new typed TopK kernels. |
| `onnxruntime/test/providers/cpu/math/topk_op_test.cc` | Adds new test coverage for `int8_t`, `int16_t`, and `uint8_t` inputs (including opset 24 cases). |
/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline

Azure Pipelines successfully started running 4 pipeline(s).
Force-pushed d81d5e4 to 3e65d87
/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline, Web CI Pipeline

Azure Pipelines successfully started running 4 pipeline(s).
Add type constraints and dispatch cases for `int8_t`, `int16_t`, and `uint8_t` in the CUDA TopK kernel (opset 1-9, 10, 11-23, 24), along with three new `.cu` template instantiation files. This is the CUDA counterpart to the CPU support added in #27860. Fixes #27859

### Description

Add CUDA kernel type dispatch and template specializations for `int8_t`, `int16_t`, and `uint8_t` types in the CUDA TopK operator.

**Changed files:**
- `onnxruntime/core/providers/cuda/math/topk.cc` — type constraints + dispatch cases for int8/int16/uint8
- `onnxruntime/core/providers/cuda/math/topk_impl_i8.cu` — **new** template instantiation for int8_t
- `onnxruntime/core/providers/cuda/math/topk_impl_u8.cu` — **new** template instantiation for uint8_t
- `onnxruntime/core/providers/cuda/math/topk_impl_i16.cu` — **new** template instantiation for int16_t

### Motivation and Context

This is the CUDA counterpart to #27860 (CPU TopK INT8/INT16/UINT8 support). The [ONNX specification (opset 11+)](https://github.com/onnx/onnx/blob/main/docs/Operators.md#TopK) lists `INT8`, `INT16`, and `UINT8` as valid input types for the TopK operator. After #27860 added CPU support, the CUDA execution provider still lacked kernels for these types, causing models to fall back to CPU or fail when using `CUDAExecutionProvider`.

The existing CUDA TopK implementation uses a split-compilation pattern (one `.cu` file per type) with `ToCudaType<T>` mapping. Since the default template maps integer types to themselves and `NumericLimits<T>` uses `std::numeric_limits<T>`, no algorithmic changes were needed — only:

1. Adding type constraints to kernel registrations (all opset versions)
2. Adding dispatch cases in `ComputeInternal`
3. Creating three new `.cu` files for template instantiation

All 64 TopK tests pass (including 8 tests for the new types, running on both CPU and CUDA providers).

### Test Results

```
[==========] Running 64 tests from 1 test suite.
...
[ RUN      ] TopKOperator.TopK_Int8
[       OK ] TopKOperator.TopK_Int8 (21 ms)
[ RUN      ] TopKOperator.TopK_Int8_Negative
[       OK ] TopKOperator.TopK_Int8_Negative (21 ms)
[ RUN      ] TopKOperator.TopK_Int8_Smallest
[       OK ] TopKOperator.TopK_Int8_Smallest (21 ms)
[ RUN      ] TopKOperator.TopK_Int16
[       OK ] TopKOperator.TopK_Int16 (21 ms)
[ RUN      ] TopKOperator.TopK_Uint8
[       OK ] TopKOperator.TopK_Uint8 (21 ms)
[ RUN      ] TopKOperator.TopK_Int8_ExplicitAxis
[       OK ] TopKOperator.TopK_Int8_ExplicitAxis (21 ms)
[ RUN      ] TopKOperator.TopK_Int8_Opset24
[       OK ] TopKOperator.TopK_Int8_Opset24 (21 ms)
[ RUN      ] TopKOperator.TopK_Uint8_Opset24
[       OK ] TopKOperator.TopK_Uint8_Opset24 (21 ms)
...
[  PASSED  ] 64 tests.
```
### Description

Add kernel registrations and template specializations for `int8_t`, `int16_t`, and `uint8_t` types in the CPU TopK operator (opset 11-23 and opset 24).

Fixes #27859
**Changed files:**
- `onnxruntime/core/providers/cpu/math/top_k.cc` — template specializations + kernel registrations
- `onnxruntime/core/providers/cpu/cpu_execution_provider.cc` — forward declarations + `BuildKernelCreateInfo` entries
- `onnxruntime/test/providers/cpu/math/topk_op_test.cc` — 8 new test cases (largest, smallest, negative values, explicit axis, opset 24)

### Motivation and Context
The ONNX specification (opset 11+) lists `INT8`, `INT16`, and `UINT8` as valid input types for the TopK operator. However, ONNX Runtime's CPU execution provider only had kernels registered for `float`, `double`, `int32`, and `int64`, causing a `NOT_IMPLEMENTED` error.

This issue is encountered when running models like PP-DocLayoutV2 exported via `torch.onnx.export(dynamo=True)`, which produces a `Cast(bool → INT8) → TopK` pattern.

The existing CPU TopK implementation is fully template-based, so no algorithmic changes were needed — only kernel registration and template instantiation for the missing types. All 64 TopK tests pass (including 8 new tests for the added types).