Fill CUDA opset gap for ReduceMax and ReduceMin (18 → 20) #27755

Draft
Copilot wants to merge 2 commits into main from copilot/update-onnx-reduce-operators

Copilot AI commented Mar 19, 2026

Description

Extends CUDA ReduceMax and ReduceMin kernel registrations from opset 18 to opset 20.

  • reduction_ops.cc: Added REGISTER_KERNEL_VERSIONED_RANGE_AXES_INPUT_TYPED macro for versioned ranges requiring InputMemoryType(OrtMemTypeCPUInput, 1). Split both operators from 2-way (1–17, 18+) to 3-way (1–17, 18–19, 20+).
  • cuda_execution_provider.cc: Capped opset 18 forward declarations and BuildKernelCreateInfo entries to versioned 18–19. Added opset 20 non-versioned entries for both operators.

Type coverage is unchanged: ReduceMax covers float, double, MLFloat16, int32_t, and int64_t; ReduceMin covers the same types plus int8_t and uint8_t.
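The 3-way split described above can be sketched as a small self-contained model. This is only an illustration: `KernelRange`, `ReduceMaxRangesAfter`, and `CoversWithoutGaps` are hypothetical names invented here, not ONNX Runtime types or macros, and the real registrations go through the kernel macros in reduction_ops.cc.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical stand-in for one kernel registration; not an ONNX Runtime type.
struct KernelRange {
  int begin;  // first opset covered by this registration
  int end;    // last opset covered; INT_MAX marks an open-ended (non-versioned) entry
};

// Ranges after the change as stated in the description: 1-17, 18-19, 20+.
inline std::vector<KernelRange> ReduceMaxRangesAfter() {
  return {{1, 17}, {18, 19}, {20, INT_MAX}};
}

// Returns true when the ranges, in order, tile [1, INT_MAX] with no gap or overlap.
inline bool CoversWithoutGaps(const std::vector<KernelRange>& ranges) {
  int next = 1;
  for (const auto& r : ranges) {
    if (r.begin != next || r.end < r.begin) return false;
    if (r.end == INT_MAX) return true;  // open-ended entry closes the sequence
    next = r.end + 1;
  }
  return false;  // no open-ended entry was found
}

// Small self-check that the described split leaves no hole between ranges.
inline void CheckReduceMaxCoverage() {
  assert(CoversWithoutGaps(ReduceMaxRangesAfter()));
}
```

A check of this shape would catch a split where, for example, the middle range was capped at 18 instead of 19 and left opset 19 uncovered.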

Motivation and Context

ReduceMax and ReduceMin CUDA registrations stopped at opset 18, while the latest ONNX versions of both operators are at opset 20. Models exported with opset 19–20 could fail to find a matching CUDA kernel for these ops.

Follows the same pattern used in #27735 (TopK) and other opset gap PRs tracked in #27729.
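The lookup failure mentioned in the motivation can be illustrated with a toy matcher. The matching rule below is an assumption made for this sketch (an open-ended registration matches only the exact version it starts at, a versioned one matches anything in its closed range); the PR does not state the rule, and `Registration`, `Matches`, and `AnyMatch` are hypothetical names rather than ONNX Runtime APIs.

```cpp
#include <climits>
#include <vector>

// Hypothetical registration record; names are illustrative only.
struct Registration {
  int start;  // first opset version the kernel is registered for
  int end;    // last version covered; INT_MAX means open-ended (non-versioned)
};

// Assumed rule: exact match on the start version, or containment in a closed range.
inline bool Matches(const Registration& r, int since_version) {
  if (r.start == since_version) return true;
  return r.end != INT_MAX && r.start < since_version && since_version <= r.end;
}

// Returns true if any registration accepts an operator at the given version.
inline bool AnyMatch(const std::vector<Registration>& regs, int since_version) {
  for (const auto& r : regs) {
    if (Matches(r, since_version)) return true;
  }
  return false;
}
```

Under this assumed rule, the old registrations (1–17, 18+) would not match an operator at version 20, while the new ones (1–17, 18–19, 20+) would.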

… 18 to opset 20

Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>

github-actions bot left a comment

You can commit the suggested changes from lintrunner.

Comment on lines +39 to +46
#define REGISTER_KERNEL_VERSIONED_RANGE_AXES_INPUT_TYPED(name, T, begin, end) \
ONNX_OPERATOR_VERSIONED_TYPED_KERNEL_EX( \
name, \
kOnnxDomain, \
begin, end, \
T, \
kCudaExecutionProvider, \
(*KernelDefBuilder::Create()).TypeConstraint("T", DataTypeImpl::GetTensorType<T>()).InputMemoryType(OrtMemTypeCPUInput, 1), \
Suggested change
#define REGISTER_KERNEL_VERSIONED_RANGE_AXES_INPUT_TYPED(name, T, begin, end) \
ONNX_OPERATOR_VERSIONED_TYPED_KERNEL_EX( \
name, \
kOnnxDomain, \
begin, end, \
T, \
kCudaExecutionProvider, \
(*KernelDefBuilder::Create()).TypeConstraint("T", DataTypeImpl::GetTensorType<T>()).InputMemoryType(OrtMemTypeCPUInput, 1), \