[EP ABI] Initial support for kernel-based EPs #26206
Pull Request Overview
This draft PR implements support for kernel-based execution providers (EPs) within the ONNX Runtime EP plugin architecture. The changes enable plugin EPs to register custom kernels directly with the ORT runtime, expanding beyond the current node-based computation model.
- Adds comprehensive kernel registration infrastructure for plugin EPs
- Implements memory copy kernels as examples (MemcpyFromHost/MemcpyToHost)
- Extends the EP API with kernel definition and creation functionality
Reviewed Changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| onnxruntime/test/framework/ep_plugin_provider_test.cc | Updates test to pass kernel registry parameter |
| onnxruntime/test/autoep/library/kernels/utils.h | Defines kernel creation utilities and macros |
| onnxruntime/test/autoep/library/kernels/memcpy.h | Declares example Memcpy kernel interface |
| onnxruntime/test/autoep/library/kernels/memcpy.cc | Implements example Memcpy kernel with registration |
| onnxruntime/test/autoep/library/kernels/data_types.h | Declares MLDataTypes singleton for type management |
| onnxruntime/test/autoep/library/kernels/data_types.cc | Implements MLDataTypes for tensor type retrieval |
| onnxruntime/test/autoep/library/ep_kernel_registration.h | Declares kernel registration functions |
| onnxruntime/test/autoep/library/ep_kernel_registration.cc | Implements kernel registration logic |
| onnxruntime/test/autoep/library/ep.h | Adds kernel creation method declarations to EP |
| onnxruntime/test/autoep/library/ep.cc | Implements kernel creation methods in example EP |
| onnxruntime/core/session/utils.h | Declares CopyTensors utility function |
| onnxruntime/core/session/utils.cc | Implements CopyTensors utility function |
| onnxruntime/core/session/provider_policy_context.cc | Updates EP creation to use new factory method |
| onnxruntime/core/session/plugin_ep/ep_plugin_provider_interfaces.h | Extends PluginExecutionProvider with kernel registry support |
| onnxruntime/core/session/plugin_ep/ep_plugin_provider_interfaces.cc | Implements kernel registry initialization in plugin EP |
| onnxruntime/core/session/plugin_ep/ep_kernel_registration.h | Declares kernel registration infrastructure |
| onnxruntime/core/session/plugin_ep/ep_kernel_registration.cc | Implements plugin EP kernel wrapper and registration |
| onnxruntime/core/session/plugin_ep/ep_api.h | Declares new EP API functions for kernel support |
| onnxruntime/core/session/plugin_ep/ep_api.cc | Implements new EP API functions for kernel support |
| onnxruntime/core/session/onnxruntime_c_api.cc | Refactors CopyTensors to use shared utility |
| include/onnxruntime/core/session/onnxruntime_ep_c_api.h | Adds kernel-related types and API declarations |
| include/onnxruntime/core/session/onnxruntime_cxx_inline.h | Implements C++ wrapper methods for kernel APIs |
| include/onnxruntime/core/session/onnxruntime_cxx_api.h | Declares C++ KernelDefBuilder class |
| cmake/onnxruntime_unittests.cmake | Updates build to include kernel source files |
Pull Request Overview
Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.
edgchen1
left a comment
looks good, just some minor comments on the example EP code
…es.cc Co-authored-by: Edward Chen <[email protected]>
…try/ep_kernel_registration.cc Co-authored-by: Edward Chen <[email protected]>
### Description
This PR adds an initial set of C APIs necessary to support kernel
registration for plugin EPs.
### Example use
The example plugin EP implementation now registers `MemcpyFromHost` and
`MemcpyToHost` operator kernels using the new APIs. New utilities in the
example implementation make the process of defining operator kernels
very similar to the existing process used by provider-bridge EPs.
First, the operator kernel class is defined:
```c++
// File: onnxruntime/test/autoep/library/kernels/memcpy.h
struct Memcpy : public OrtKernelImpl {
  static OrtStatus* Create(const OrtKernelInfo* info, void* state, /*out*/ std::unique_ptr<Memcpy>& kernel);

  Memcpy(const OrtKernelInfo* info, void* state);

  static OrtStatus* ORT_API_CALL ComputeImpl(OrtKernelImpl* this_ptr, OrtKernelContext* kernel_ctx) noexcept;
  static void ORT_API_CALL ReleaseImpl(OrtKernelImpl* this_ptr) noexcept;

  OrtStatus* DoCompute(OrtKernelContext* kernel_ctx) noexcept;

 private:
  const OrtKernelInfo* info_;
  void* state_;  // Custom state passed from OrtEp
};
```
Then, a macro defines a function that can be called to register the
operator with the EP's kernel registry:
```c++
// File: onnxruntime/test/autoep/library/kernels/memcpy.cc
ONNX_OPERATOR_KERNEL_EX(
    MemcpyFromHost,
    kOnnxDomain,
    1,
    (Ort::KernelDefBuilder()
         .SetInputMemType(0, OrtMemType::OrtMemTypeCPUInput)
         .AddTypeConstraint("T", MLDataTypes::GetTensorType(ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT))),
    Memcpy)

ONNX_OPERATOR_KERNEL_EX(
    MemcpyToHost,
    kOnnxDomain,
    1,
    (Ort::KernelDefBuilder()
         .SetOutputMemType(0, OrtMemType::OrtMemTypeCPUOutput)
         .AddTypeConstraint("T", MLDataTypes::GetTensorType(ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT))),
    Memcpy)
```
Lastly, the functions defined by the above macro are entered into a
table:
```c++
// File: onnxruntime/test/autoep/library/ep_kernel_registration.cc

// Include kernel files:
#include "kernels/memcpy.h"

// Forward declarations of kernel classes used as template args for BuildKernelCreateInfo
class ONNX_OPERATOR_KERNEL_CLASS_NAME(kOnnxDomain, 1, MemcpyFromHost);
class ONNX_OPERATOR_KERNEL_CLASS_NAME(kOnnxDomain, 1, MemcpyToHost);

// Table of BuildKernelCreateInfo functions for each operator
static const BuildKernelCreateInfoFn build_kernel_create_info_funcs[] = {
    BuildKernelCreateInfo<void>,  // Dummy to avoid table becoming empty.
    BuildKernelCreateInfo<ONNX_OPERATOR_KERNEL_CLASS_NAME(kOnnxDomain, 1, MemcpyFromHost)>,
    BuildKernelCreateInfo<ONNX_OPERATOR_KERNEL_CLASS_NAME(kOnnxDomain, 1, MemcpyToHost)>,
};
```
The [example EP processes the entries in the above
table](https://github.com/microsoft/onnxruntime/blob/adrianl/ep-abi-kernel-based-eps/onnxruntime/test/autoep/library/ep_kernel_registration.cc)
to add information about the supported operator kernels to the EP's
kernel registry (`OrtKernelRegistry`).
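As a rough illustration of how such a table might be processed, the self-contained sketch below walks a table of builder functions and skips the dummy entry. It does not use the ORT headers: `KernelCreateInfo`, the tag types, and `RegisterKernels` are hypothetical stand-ins, and treating an empty `op_type` as the dummy marker is an assumption made for this sketch rather than the example EP's actual convention.

```c++
#include <cassert>  // for assert in usage snippets
#include <string>
#include <vector>

// Hypothetical stand-in for per-operator creation info; not an ORT type.
struct KernelCreateInfo {
  std::string op_type;  // An empty string marks the dummy entry (assumption).
};

using BuildKernelCreateInfoFn = KernelCreateInfo (*)();

template <typename T>
KernelCreateInfo BuildKernelCreateInfo();

// Dummy specialization that keeps the table from becoming empty.
template <>
KernelCreateInfo BuildKernelCreateInfo<void>() { return {}; }

// Hypothetical tag types standing in for the macro-generated kernel class names.
struct MemcpyFromHostTag {};
struct MemcpyToHostTag {};

template <>
KernelCreateInfo BuildKernelCreateInfo<MemcpyFromHostTag>() { return {"MemcpyFromHost"}; }

template <>
KernelCreateInfo BuildKernelCreateInfo<MemcpyToHostTag>() { return {"MemcpyToHost"}; }

static const BuildKernelCreateInfoFn kBuildFuncs[] = {
    BuildKernelCreateInfo<void>,
    BuildKernelCreateInfo<MemcpyFromHostTag>,
    BuildKernelCreateInfo<MemcpyToHostTag>,
};

// Calls each builder and collects the operator types of all non-dummy entries.
std::vector<std::string> RegisterKernels() {
  std::vector<std::string> registered;
  for (BuildKernelCreateInfoFn build : kBuildFuncs) {
    KernelCreateInfo info = build();
    if (info.op_type.empty()) continue;  // Skip the dummy entry.
    registered.push_back(info.op_type);
  }
  return registered;
}
```

In a real EP, the loop body would presumably call `KernelRegistry_AddKernel` for each non-dummy entry instead of collecting names.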
Additionally, during the call to `OrtEp::GetCapability`, an EP can now
look up registered kernel definitions via the new API
`EpGraphSupportInfo_LookUpKernel`. Note that an EP would not normally
look up kernels for `MemcpyFromHost` or `MemcpyToHost` nodes, which are
inserted by ORT. Instead, the API would be used to look up other
registered operator kernels, such as `Conv`.
```c++
static OrtStatus* ORT_API_CALL GetCapabilityImpl(OrtEp* this_ptr, const OrtGraph* graph,
                                                 OrtEpGraphSupportInfo* graph_support_info) noexcept {
  // ...
  for (const OrtNode* node : nodes) {
    const OrtKernelDef* kernel_def = nullptr;
    OrtStatus* status = this_ep->ep_api->EpGraphSupportInfo_LookUpKernel(graph_support_info, node, &kernel_def);
    if (status != nullptr) {
      return status;
    }

    if (kernel_def != nullptr) {  // Take node if this EP has a registered kernel for it.
      if (OrtStatus* st = this_ep->ep_api->EpGraphSupportInfo_AddSingleNode(graph_support_info, node);
          st != nullptr) {
        return st;
      }
    }
  }

  return nullptr;
}
```
### EP implementation details
An EP instance (i.e., `OrtEp`) that needs to register operator kernels
with ONNX Runtime must implement the following
`OrtEp::GetKernelRegistry()` function:
| Function Signature | Description |
|--------------------|-------------|
|**GetKernelRegistry**<br/><br/>**Returns**:`OrtStatus*`<br/><br/>**Parameters:**<br/><ul><li>`OrtEp* this_ptr`: The OrtEp instance.</li><li>`const OrtKernelRegistry** kernel_registry`: Output parameter set to the EP's kernel registry, which must remain valid throughout the lifetime of the EP.</li></ul>| Gets the execution provider's kernel registry, if any.<br/><br/>**Remarks:** A kernel registry contains kernel creation information for operator kernels supported by an EP.<br/><br/>**Note:** Implementation of this function is optional. If set to NULL, ORT assumes the EP compiles nodes. |
If defined by the EP, the `OrtEp::GetKernelRegistry()` function is
[called by ONNX
Runtime](https://github.com/microsoft/onnxruntime/blob/0f7145f3809103c123de2d281a6b310677e6d56c/onnxruntime/core/session/plugin_ep/ep_plugin_provider_interfaces.cc#L146-L147)
after creating an instance of the `OrtEp` in order to retrieve the EP's
kernel registry.
#### APIs used by EP to add entries to kernel registry
An EP's kernel registry (`OrtKernelRegistry`) contains **information**
necessary for the (later) creation of operator kernels supported by an
EP. Conceptually, a kernel registry contains an array of "kernel
creation information" elements, one per operator. Each such element
consists of:
- A kernel **definition** (`OrtKernelDef`), which specifies operator
type, supported versions, type constraints, I/O memory types, etc.
- A function of type `OrtKernelCreateFunc` that ORT calls to create an
instance of the kernel (`OrtKernelImpl`).
- Custom opaque state (provided by the `OrtEp`) that is passed to the
`OrtKernelCreateFunc`.
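The three-part entry described above can be modeled conceptually as follows. This is only a sketch under simplifying assumptions: `KernelDef`, `KernelImpl`, `KernelCreateFunc`, `KernelCreateEntry`, and `KernelRegistry` are hypothetical simplified types, not the opaque ORT types, and the lookup is keyed on operator type alone.

```c++
#include <cassert>  // for assert in usage snippets
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplified types; the real ORT types are opaque handles.
struct KernelDef {
  std::string op_type;
  int since_version;
};

struct KernelImpl {
  int tag;
};

// Returns true on success and hands the new kernel back through `out`.
using KernelCreateFunc = bool (*)(void* state, std::unique_ptr<KernelImpl>& out);

// One "kernel creation information" element: definition, create function, opaque state.
struct KernelCreateEntry {
  KernelDef def;
  KernelCreateFunc create;
  void* state;
};

class KernelRegistry {
 public:
  void AddKernel(KernelDef def, KernelCreateFunc create, void* state) {
    entries_.push_back({std::move(def), create, state});
  }

  // Returns the first entry whose operator type matches, or nullptr if none does.
  const KernelCreateEntry* LookUp(const std::string& op_type) const {
    for (const auto& entry : entries_) {
      if (entry.def.op_type == op_type) return &entry;
    }
    return nullptr;
  }

 private:
  std::vector<KernelCreateEntry> entries_;
};

// Example create function that copies an int out of the opaque state.
bool CreateTagged(void* state, std::unique_ptr<KernelImpl>& out) {
  out = std::make_unique<KernelImpl>(KernelImpl{*static_cast<int*>(state)});
  return true;
}
```

The point of the model is that the registry stores only the recipe for a kernel; the kernel object itself is created later, when the stored function is invoked with the stored state.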
An EP uses the following `OrtEpApi::KernelRegistry_AddKernel()` function
to add an entry for one supported operator.
| Function Signature | Description |
|--------------------|-------------|
|**KernelRegistry_AddKernel**<br/><br/>**Returns**:`OrtStatus*`<br/><br/>**Parameters:**<br/><ul><li>`OrtKernelRegistry* kernel_registry`: The OrtKernelRegistry instance.</li><li>`const OrtKernelDef* kernel_def`: The kernel definition, which includes operator type, version, EP name, type constraints, etc.</li><li>`OrtKernelCreateFunc kernel_create_func`: Function that creates an instance of the operator kernel as an OrtKernelImpl instance.</li><li>`void* kernel_create_func_state`: Custom state passed to the kernel creation function. Can be null.</li></ul>| Adds kernel creation information for a supported operator kernel to the given kernel registry.<br/><br/>**Remarks:** Refer to OrtEp::GetKernelRegistry, which returns an EP's kernel registry to ORT. |
##### Building a kernel definition
An EP uses a kernel definition builder (`OrtKernelDefBuilder`) to create
a kernel definition (`OrtKernelDef`). The following table lists **some**
of the C APIs related to building a kernel definition. The above
`ONNX_OPERATOR_KERNEL_EX` macro [uses these
APIs](https://github.com/microsoft/onnxruntime/blob/adrianl/ep-abi-kernel-based-eps/onnxruntime/test/autoep/library/kernels/utils.h#L42).
| Function Signature | Description |
|--------------------|-------------|
|**KernelDefBuilder_SetOperatorType**<br/><br/>**Returns**:`OrtStatus*`<br/><br/>**Parameters:**<br/><ul><li>`OrtKernelDefBuilder* kernel_def_builder`: The OrtKernelDefBuilder instance.</li><li>`const char* op_type`: A null-terminated string representing the operator type.</li></ul>| Sets the kernel's operator type. |
|**KernelDefBuilder_SetDomain**<br/><br/>**Returns**:`OrtStatus*`<br/><br/>**Parameters:**<br/><ul><li>`OrtKernelDefBuilder* kernel_def_builder`: The OrtKernelDefBuilder instance.</li><li>`const char* domain`: A null-terminated string representing the operator's domain.</li></ul>| Sets the kernel's domain. |
| ... | ... |
|**KernelDefBuilder_Build**<br/><br/>**Returns**:`OrtStatus*`<br/><br/>**Parameters:**<br/><ul><li>`OrtKernelDefBuilder* kernel_def_builder`: The OrtKernelDefBuilder instance.</li><li>`OrtKernelDef** kernel_def_out`: The new OrtKernelDef instance.</li></ul>| Creates an OrtKernelDef instance from the given kernel definition builder. |
##### Defining a kernel implementation
An EP defines a kernel implementation by initializing an instance of
`OrtKernelImpl` (shown below) with function pointers for computation,
release, etc.
```c++
struct OrtKernelImpl {
  uint32_t ort_version_supported;  ///< Must be initialized to ORT_API_VERSION

  /** \brief Computation function called to execute the kernel on an EP.
   *
   * \param[in] this_ptr The OrtKernelImpl instance.
   * \param[in] context The OrtKernelContext instance that provides access to the inputs and outputs.
   *
   * \snippet{doc} snippets.dox OrtStatus Return Value
   *
   * \since Version 1.24.
   */
  ORT_API2_STATUS(Compute, _In_ OrtKernelImpl* this_ptr, _In_ OrtKernelContext* context);

  /** \brief Called by ORT to release the OrtKernelImpl instance and its resources.
   *
   * \param[in] this_ptr The OrtKernelImpl instance.
   *
   * \since Version 1.24.
   */
  ORT_API_T(void, Release, _In_ OrtKernelImpl* this_ptr);
};
```
As shown previously, the example EP creates a `Memcpy` class that
inherits from `OrtKernelImpl` and [implements the above
functions](https://github.com/microsoft/onnxruntime/blob/adrianl/ep-abi-kernel-based-eps/onnxruntime/test/autoep/library/kernels/memcpy.cc).
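The inheritance pattern can be sketched in isolation as shown here. The snippet assumes nothing from the ORT headers: `KernelImplC` is a hypothetical C-style struct modeled loosely on `OrtKernelImpl`, and `Doubler` is an invented kernel used only to show how static trampoline functions recover the derived object from the base pointer.

```c++
#include <cassert>  // for assert in usage snippets
#include <cstdint>

// Hypothetical C-style interface modeled loosely on OrtKernelImpl; not the ORT definition.
struct KernelImplC {
  uint32_t version;
  int (*Compute)(KernelImplC* this_ptr, int* value);  // Returns 0 on success.
  void (*Release)(KernelImplC* this_ptr);
};

// A C++ kernel that derives from the C struct and fills in its function pointers.
struct Doubler : KernelImplC {
  explicit Doubler(int* live_count) : live_count_(live_count) {
    version = 1;
    Compute = ComputeImpl;
    Release = ReleaseImpl;
    ++*live_count_;
  }

  ~Doubler() { --*live_count_; }

  // Static trampolines: cast the base pointer back to the derived type.
  static int ComputeImpl(KernelImplC* this_ptr, int* value) noexcept {
    return static_cast<Doubler*>(this_ptr)->DoCompute(value);
  }

  static void ReleaseImpl(KernelImplC* this_ptr) noexcept {
    delete static_cast<Doubler*>(this_ptr);
  }

  int DoCompute(int* value) noexcept {
    *value *= 2;
    return 0;
  }

 private:
  int* live_count_;  // Tracks live instances so that release can be observed.
};
```

A caller that only knows about `KernelImplC` can invoke `k->Compute(k, &v)` and later `k->Release(k)`, which mirrors how a runtime would drive a kernel through the C struct without seeing the C++ class.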
##### Defining a kernel creation function
An EP must provide a function of type `OrtKernelCreateFunc` that ORT can
later call to create an instance of a kernel (`OrtKernelImpl`). The
signature of the `OrtKernelCreateFunc` is shown below.
```c++
/** \brief Type definition for a function that creates an OrtKernelImpl instance for an operator kernel.
 *
 * \param[in] ctx Unused/reserved for future use.
 * \param[in] kernel_create_func_state Opaque state initially provided by the EP that registered the kernel.
 *                                     Refer to OrtEpApi::KernelRegistry_AddKernel(). May be null.
 * \param[in] info The OrtKernelInfo instance that provides access to the kernel's input and output characteristics.
 * \param[out] kernel_out Output parameter set to the new OrtKernelImpl instance.
 *
 * \snippet{doc} snippets.dox OrtStatus Return Value
 *
 * \since Version 1.24.
 */
typedef OrtStatus*(ORT_API_CALL* OrtKernelCreateFunc)(_In_ OrtKernelCreateContext* ctx,  // unused/reserved as of 1.24
                                                      _In_ void* kernel_create_func_state,
                                                      _In_ const OrtKernelInfo* info,
                                                      _Outptr_result_maybenull_ OrtKernelImpl** kernel_out);
```
The example EP declares kernel creation functions via use of the
previously mentioned `ONNX_OPERATOR_KERNEL_EX`
[macro](https://github.com/microsoft/onnxruntime/blob/adrianl/ep-abi-kernel-based-eps/onnxruntime/test/autoep/library/kernels/utils.h#L56-L64).
If one were to expand the macro call, the kernel creation function for
`MemcpyFromHost` would look similar to the following snippet:
```c++
OrtStatus* ORT_API_CALL CreateMemcpyKernel(OrtKernelCreateContext* /*ctx*/, void* kernel_create_func_state,
                                           const OrtKernelInfo* info, OrtKernelImpl** kernel_out) {
  *kernel_out = nullptr;

  std::unique_ptr<Memcpy> kernel;
  RETURN_IF_ERROR(Memcpy::Create(info, kernel_create_func_state, kernel));

  *kernel_out = kernel.release();
  return nullptr;
}
```
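The ownership hand-off in the snippet above (null the output first, build into a `std::unique_ptr`, release only on success) can be exercised in isolation. The following is a minimal sketch with hypothetical names: `Kernel`, `CreateKernel`, `CreateKernelC`, and `RETURN_IF_ERROR_SKETCH` are invented for illustration, and a `const char*` message stands in for `OrtStatus*`.

```c++
#include <cassert>  // for assert in usage snippets
#include <memory>

// Hypothetical kernel type; not an ORT type.
struct Kernel {
  int scale;
};

// Returns nullptr on success or a static error message on failure
// (a loose stand-in for an OrtStatus* return value).
const char* CreateKernel(const int* config, std::unique_ptr<Kernel>& out) {
  if (config == nullptr) return "missing config";
  out = std::make_unique<Kernel>(Kernel{*config});
  return nullptr;
}

// Hypothetical early-return helper in the spirit of RETURN_IF_ERROR.
#define RETURN_IF_ERROR_SKETCH(expr)      \
  do {                                    \
    if (const char* status_ = (expr)) {   \
      return status_;                     \
    }                                     \
  } while (0)

// C-style creation function: the caller owns *kernel_out only when nullptr is returned.
const char* CreateKernelC(void* state, Kernel** kernel_out) {
  *kernel_out = nullptr;  // Guarantees a defined output on every failure path.

  std::unique_ptr<Kernel> kernel;
  RETURN_IF_ERROR_SKETCH(CreateKernel(static_cast<const int*>(state), kernel));

  *kernel_out = kernel.release();  // Ownership passes to the caller.
  return nullptr;
}
```

Because the `std::unique_ptr` still owns the object until `release()` is called, an early return on error cannot leak a partially constructed kernel.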
---------
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Edward Chen <[email protected]>
…26754)
### Description
- Adds C APIs to support pre-packing of const weights for `OrtKernelImpl` implementations.
- APIs optionally support sharing of pre-packed weight data (for cpu-accessible memory).
- Updates example kernel (Mul) to use new pre-packing API.

Tested by existing unit test: https://github.com/microsoft/onnxruntime/blob/549d7415e26e2b3f86c42f86e135bb746caa37b4/onnxruntime/test/autoep/test_execution.cc#L242-L256
### Motivation and Context
The [previous PR](#26206) added the base APIs that support kernel-based plugin EPs. This PR adds an additional feature that was identified as necessary for the port of WebGPU EP.
---------
Co-authored-by: Edward Chen <[email protected]>