[CUDA] RoiAlign for opset versions 16 and 22 #27646
Pull request overview
Adds CUDA kernel support for RoiAlign across ONNX opset versions 16 and 22, including additional datatype coverage.
Changes:
- Register CUDA `RoiAlign` kernels for opset ranges 10–15, 16–21, and opset 22.
- Extend CUDA implementation to support `MLFloat16` and `BFloat16` (with accumulation types).
- Add CUDA tests covering Float16 (opset 16/22) and BFloat16 (opset 22), and update kernel documentation.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| onnxruntime/test/providers/cpu/object_detection/roialign_test.cc | Adds CUDA EP tests for Float16/BFloat16 across opset 16 and 22. |
| onnxruntime/core/providers/cuda/object_detection/roialign_impl.cu | Improves numeric handling via accumulation types; enables half/BFloat16 specializations. |
| onnxruntime/core/providers/cuda/object_detection/roialign.cc | Adds versioned kernel registrations for opset 10–15, 16–21, and opset 22; enables MLFloat16/BFloat16 registrations. |
| onnxruntime/core/providers/cuda/cuda_execution_provider.cc | Adds kernel class declarations and registration entries for new RoiAlign variants/opsets. |
| docs/OperatorKernels.md | Updates documented opset/type support matrix for RoiAlign. |
There is also a pre-existing issue in the header that might as well be fixed now: the default coordinate_transformation_mode for opset 16+. At roialign.h:129-131, the attribute is only read inside the `if (GetAttr(...).IsOK())` guard. If the attribute is absent (which is valid per the spec), the if body is skipped and half_pixel_ stays at its default of false (roialign.h:143). But the ONNX spec for opset 16+ says the default for coordinate_transformation_mode is "half_pixel", which should set half_pixel_ = true. So when the attribute is omitted, the kernel silently behaves as "output_half_pixel" instead of the spec-mandated "half_pixel". Note: for opset 10 (which has no coordinate_transformation_mode attribute), the false default is correct, since it matches the original opset 10 behavior. #Resolved
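The defaulting rule described in this comment can be sketched in isolation. This is a minimal, standalone illustration and not the actual roialign.h code: the helper name `ResolveHalfPixel` and its signature are hypothetical, and `std::nullopt` stands in for "the model omitted the attribute".

```cpp
#include <optional>
#include <string_view>

// Hypothetical helper for illustration only; not the actual roialign.h code.
// `mode` is std::nullopt when the model omits coordinate_transformation_mode.
constexpr bool ResolveHalfPixel(int opset, std::optional<std::string_view> mode) {
  if (opset < 16) {
    // Opsets before 16 have no such attribute; false preserves the original behavior.
    return false;
  }
  // Opset 16+: fall back to the spec default "half_pixel" when the attribute is absent.
  return mode.value_or("half_pixel") == "half_pixel";
}

// Compile-time checks of the three cases discussed in the comment.
static_assert(ResolveHalfPixel(16, std::nullopt));          // absent attribute, opset 16+
static_assert(!ResolveHalfPixel(16, "output_half_pixel"));  // explicit legacy mode
static_assert(!ResolveHalfPixel(10, std::nullopt));         // opset 10 has no attribute
```

The point of the sketch is that the default depends on the opset version, so initializing the member to a single constant cannot be right for both opset 10 and opset 16+.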
Pull request overview
Adds CUDA Execution Provider support for RoiAlign across multiple ONNX operator versions, extending datatype coverage to FP16/BF16 where supported by the corresponding opset versions.
Changes:
- Register CUDA `RoiAlign` kernels for opset ranges 10–15 and 16–21, and opset 22 (including MLFloat16/BFloat16 where applicable).
- Update CUDA `RoiAlign` implementation to accumulate using a wider accumulation type for FP16/BF16 paths.
- Add CUDA-focused unit tests for opset 16 and 22 (FP16/BF16) and adjust default `coordinate_transformation_mode` behavior for opset 16+.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| onnxruntime/test/providers/cpu/object_detection/roialign_test.cc | Adds CUDA EP tests for opset 16/22 with FP16/BF16 and additional coordinate mode/sampling coverage. |
| onnxruntime/core/providers/cuda/object_detection/roialign_impl.cu | Uses AccumulationType_t for safer FP16/BF16 accumulation; adds explicit instantiations for half/BFloat16. |
| onnxruntime/core/providers/cuda/object_detection/roialign.cc | Updates CUDA kernel registrations to be versioned for opsets 10–15 and 16–21, plus opset 22 typed registrations. |
| onnxruntime/core/providers/cuda/cuda_execution_provider.cc | Wires new RoiAlign kernel registrations into CUDA EP kernel registry and class declarations. |
| onnxruntime/core/providers/cpu/object_detection/roialign.h | Sets default half_pixel_ behavior for opset 16+ when the attribute is absent, aligning with spec defaulting. |
| docs/OperatorKernels.md | Updates generated kernel support table entries for CUDA RoiAlign across opset ranges and types. |
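The accumulation-type idea noted for roialign_impl.cu can be illustrated with a small type trait. This is a sketch under stated assumptions, not the PR's code: `AccumType`, `Half16`, and `BFloat16Bits` are hypothetical stand-ins (the real kernel uses `AccumulationType_t` with CUDA's half and onnxruntime's BFloat16), and it assumes 16-bit types are widened to float.

```cpp
#include <type_traits>

// Hypothetical stand-ins for 16-bit storage types, used only in this sketch.
struct Half16 { unsigned short bits; };
struct BFloat16Bits { unsigned short bits; };

// Primary template: by default a type accumulates in itself.
template <typename T>
struct AccumType { using type = T; };

// Assumption: 16-bit floating-point types accumulate in float to limit rounding error.
template <>
struct AccumType<Half16> { using type = float; };
template <>
struct AccumType<BFloat16Bits> { using type = float; };

template <typename T>
using AccumType_t = typename AccumType<T>::type;

// float and double are unchanged; the 16-bit stand-ins are widened.
static_assert(std::is_same_v<AccumType_t<float>, float>);
static_assert(std::is_same_v<AccumType_t<double>, double>);
static_assert(std::is_same_v<AccumType_t<Half16>, float>);
static_assert(std::is_same_v<AccumType_t<BFloat16Bits>, float>);
```

A kernel templated on `T` can then declare its running sum as `AccumType_t<T>` and convert back to `T` only when writing the output.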
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
roialign.h:129-131: the attribute is only read inside the `if (GetAttr(...).IsOK())` guard. If the attribute is absent (which is valid per the spec), the if body is skipped and half_pixel_ stays at its default of false (roialign.h:143). But the ONNX spec for opset 16+ says the default for coordinate_transformation_mode is "half_pixel", which should set half_pixel_ = true. So when the attribute is omitted, the kernel silently behaves as "output_half_pixel" instead of the spec-mandated "half_pixel".
Current logic:
Support RoiAlign for opset versions 16 and 22