Status | Accepted |
---|---|
Author(s) | James Ring ([email protected]). |
Sponsor | Günhan Gülsoy ([email protected]) |
Updated | 2020-06-02 |
TensorFlow (TF) currently provides a C++ API for implementing kernels and ops. The Voltron project aims to create a modular/plugin-based TF implementation with API and ABI surfaces. Plugins will be able to create and register custom kernel and op implementations.
In order to provide a stable ABI, the Voltron team has chosen to provide C APIs to plugin authors. This document introduces the C API for op and kernel registration. For authors who wish to continue using C++ to interface with TensorFlow, an ABI-stable C++ header-only API is provided.
Presently, there is no ABI-stable API for extending TensorFlow with new kernels and ops. There is no guarantee that a plugin written with one compiler will work with a version of TensorFlow built with another, even on the same operating system and architecture. This makes it difficult to distribute plugins without also distributing the source code and requiring end-users to build the plugin alongside TensorFlow.
An ABI-stable API for extending TensorFlow will simplify the distribution of plugins and allow plugin authors to distribute binary artifacts without necessarily publishing plugin source code.
Plugin authors will be able to publish plugins that users can use more easily. In turn, the TensorFlow community will benefit from an increase in the number and variety of available plugins.
In general, the kernel and op registration C APIs aim to permit the implementation of any kernel or op that is currently possible with the C++ API. Where possible, existing C++ function implementations are reused from within a C wrapper. The purpose of the wrapper is simply to provide ABI stability.
Since plugins will be dynamically loaded (e.g. via dlopen on POSIX), the API avoids relying on static initialization.
The intention is that existing kernels should be able to be ported to the new APIs with a minimum of reimplementation effort. This precludes a from-scratch re-imagining of TensorFlow APIs.
The following diagram describes the components built with the proposed C and C++ APIs.
                +----------------+ <--+
                |                |    |
                | Plugin         |    |
                |                |    |
                +----------------+    |
                |                |    |
                | C++ header API |    | Plugin
                |                |    | my_plugin.so
           +--> +----------------+    |
           |    |                |    |
           |    | C API headers  |    |
           |    |                |    |
           |    +----------------+ <--+
           |    |                |
           |    | C API impl     |
      Core |    |                |
TensorFlow |    +----------------+
  libtf.so |    |                |
           |    | Core C++ APIs  |
           |    |                |
           +--> +----------------+
In this example, there are two shared objects: my_plugin.so and libtensorflow.so (labelled libtf.so in the diagram). my_plugin.so is implemented in terms of the C++ header-only API, which is in turn implemented in terms of the C API headers. The C API implementation is provided by TensorFlow at runtime when it loads the plugin's shared object.
This design addresses the changes to the existing C API that are required to support op and kernel plugins. It also introduces the C++ header-only API, which does not currently exist.
This section introduces changes to the C API that are required to support ops. An alpha version of this API is already checked in at tensorflow/c/ops.h.
In the C++ API, ops are registered at static initialization time using the REGISTER_OP macro. For example:
REGISTER_OP("Bitcast")
    .Input("input: T")
    .Output("output: type")
    .Attr("T: {bfloat16, ...}")
    .Attr("type: {bfloat16, ...}")
    .SetShapeFn([](InferenceContext* ctx) { ... })
    .Doc("A bitcast operator");
The equivalent C API will be a series of functions that operate on TF_OpDefinitionBuilder*, a pointer to an opaque struct (i.e. a struct whose content is not made known to the user). The functions include, but are not limited to:
- TF_OpDefinitionBuilder* TF_NewOpDefinitionBuilder(const char* op_name): constructs and returns a new op registration builder for an op with the given name
- void TF_OpDefinitionBuilderAddAttr(TF_OpDefinitionBuilder* builder, const char* attr): adds the given attribute to the builder (equivalent to Attr above)
- void TF_OpDefinitionBuilderAddInput(TF_OpDefinitionBuilder* builder, const char* input): adds the given input to the builder (equivalent to Input above)
Additional functions are provided for setting other properties of the operation (e.g. TF_OpDefinitionBuilderSetIsCommutative).
Registration is then actually performed using the TF_RegisterOpDefinition function. This function populates a TF_Status indicating whether registration was successful and frees the resources associated with the op definition builder.
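The builder-then-register flow described above can be sketched without the TensorFlow headers. The following is a minimal, self-contained mock and not the TensorFlow implementation: every Demo_ name is hypothetical, the registry is a plain counter, and the validity rule (a non-empty op name) is an assumption made for illustration. It shows how an opaque struct keeps the builder's layout out of the public surface, and how the register call reports a status and then frees the builder.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Public surface: callers only ever see an incomplete (opaque) type. */
typedef struct Demo_OpBuilder Demo_OpBuilder;  /* hypothetical name */
typedef enum { DEMO_OK = 0, DEMO_INVALID_ARGUMENT = 1 } Demo_Code;

/* Private layout: free to change without breaking plugin binaries. */
struct Demo_OpBuilder {
  char* name;
  int num_inputs;
};

static int g_registered_ops = 0;  /* stand-in for the host's op registry */

Demo_OpBuilder* Demo_NewOpBuilder(const char* op_name) {
  Demo_OpBuilder* b = calloc(1, sizeof(*b));
  if (b == NULL) return NULL;
  size_t len = strlen(op_name) + 1;
  b->name = malloc(len);
  if (b->name == NULL) { free(b); return NULL; }
  memcpy(b->name, op_name, len);
  return b;
}

void Demo_OpBuilderAddInput(Demo_OpBuilder* b, const char* input_spec) {
  (void)input_spec;  /* a real builder would parse and store the spec */
  b->num_inputs++;
}

/* Reports the outcome through *status and always frees the builder. */
void Demo_RegisterOp(Demo_OpBuilder* b, Demo_Code* status) {
  int valid = b->name[0] != '\0';  /* assumed rule: name must be non-empty */
  *status = valid ? DEMO_OK : DEMO_INVALID_ARGUMENT;
  if (valid) g_registered_ops++;
  free(b->name);
  free(b);
}

int Demo_NumRegisteredOps(void) { return g_registered_ops; }

/* Usage: build, register, and return the resulting status code. */
Demo_Code Demo_RegisterExample(const char* op_name) {
  Demo_OpBuilder* b = Demo_NewOpBuilder(op_name);
  assert(b != NULL);
  Demo_OpBuilderAddInput(b, "input: T");
  Demo_Code status;
  Demo_RegisterOp(b, &status);
  return status;
}
```

Because the caller never sees the struct definition, fields can be added to the private layout later without recompiling plugins, which is the property the opaque-pointer design is meant to provide.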
The C equivalent of the bitcast op registration example above is shown below:
#include "tensorflow/c/ops.h"

// See the section below on shape inference.
void InferBitcastShape(TF_ShapeInferenceContext* ctx, TF_Status* status);

void InitPlugin() {
  TF_OpDefinitionBuilder* b = TF_NewOpDefinitionBuilder("Bitcast");
  TF_OpDefinitionBuilderAddInput(b, "input: T");
  TF_OpDefinitionBuilderAddOutput(b, "output: type");
  TF_OpDefinitionBuilderAddAttr(b, "T: {bfloat16, ...}");
  TF_OpDefinitionBuilderAddAttr(b, "type: {bfloat16, ...}");
  TF_OpDefinitionBuilderSetShapeInferenceFunction(b, &InferBitcastShape);
  TF_Status* status = TF_NewStatus();
  TF_RegisterOpDefinition(b, status);  // also frees the builder
  if (TF_GetCode(status) != TF_OK) { /* handle errors */ }
  TF_DeleteStatus(status);  // the status object must be released by the caller
}
A significant feature of certain ops is their ability to infer their output shapes. TensorFlow will invoke the registered shape inference function (if one is provided) when it needs to know the op's output shape. The registration function declaration is shown below:
void TF_OpDefinitionBuilderSetShapeInferenceFunction(
    TF_OpDefinitionBuilder* builder,
    void (*shape_inference_func)(TF_ShapeInferenceContext* ctx, TF_Status* status));
A series of functions prefixed with TF_ShapeInferenceContext is provided for the following purposes:
- Examining operator input shapes (TF_ShapeInferenceContextGetInput)
- Creating and deleting shape and dimension handles (TF_{New,Delete}ShapeHandle, TF_{New,Delete}DimensionHandle)
- Manipulating shape and dimension handles (TF_ShapeInferenceContextWithRank, TF_ShapeInferenceContextDim)
In general, C analogues to the C++ methods in tensorflow::shape_inference (see tensorflow/core/framework/shape_inference.h) will be provided.
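The callback arrangement for shape inference can be illustrated with a small mock that does not depend on the TensorFlow headers. Everything here is an assumption made for the sketch: the Mock_ types are hypothetical stand-ins for the context and handles, the "row sum" op is invented, and reporting an unknown shape as rank -1 when no callback is registered is a simplification. It shows a host invoking a registered function pointer, which checks the input rank and derives the output shape from an input dimension.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_RANK 4

typedef enum { MOCK_OK = 0, MOCK_INVALID_ARGUMENT = 1 } Mock_Code;

/* Hypothetical stand-in for a shape: rank -1 means "unknown". */
typedef struct {
  int rank;
  long dims[MOCK_MAX_RANK];
} Mock_Shape;

/* Hypothetical stand-in for the context passed to the callback. */
typedef struct {
  Mock_Shape input;
  Mock_Shape output;
} Mock_ShapeCtx;

typedef void (*Mock_ShapeFn)(Mock_ShapeCtx* ctx, Mock_Code* status);

/* Plugin side: a made-up "row sum" op mapping [rows, cols] to [rows]. */
void RowSumShape(Mock_ShapeCtx* ctx, Mock_Code* status) {
  if (ctx->input.rank != 2) {  /* plays the role of a rank check */
    *status = MOCK_INVALID_ARGUMENT;
    return;
  }
  ctx->output.rank = 1;
  ctx->output.dims[0] = ctx->input.dims[0];  /* keep the row count */
  *status = MOCK_OK;
}

/* Host side: invoke the callback only if the op registered one. */
Mock_Code Mock_InferShape(Mock_ShapeFn fn, const Mock_Shape* in,
                          Mock_Shape* out) {
  Mock_ShapeCtx ctx = { *in, { -1, { 0 } } };  /* output starts unknown */
  Mock_Code status = MOCK_OK;
  if (fn != NULL) fn(&ctx, &status);
  *out = ctx.output;
  return status;
}

/* Usage: returns the inferred output dimension, or -1 if inference fails. */
long RowSumOutputRows(int rank, long rows, long cols) {
  Mock_Shape in = { rank, { rows, cols } };
  Mock_Shape out;
  if (Mock_InferShape(&RowSumShape, &in, &out) != MOCK_OK) return -1;
  assert(out.rank == 1);
  return out.dims[0];
}
```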
This section introduces changes to the C API that are required to support kernels. An alpha version of this API is already checked in at tensorflow/c/kernels.h.
Kernel registration with the C++ API is accomplished with the REGISTER_KERNEL_BUILDER macro. This macro expands to code that relies on static initialization to register the provided kernel with the global kernel registry. See below for an example of registering a kernel with the C++ API:
#include "tensorflow/core/framework/op_kernel.h"

class BitcastOp : public OpKernel {
 public:
  explicit BitcastOp(OpKernelConstruction* context) : OpKernel(context) { … }
  void Compute(OpKernelContext* context) override { … }
};

REGISTER_KERNEL_BUILDER(Name("Bitcast").Device(DEVICE_CPU), BitcastOp);
The equivalent C API provides a series of functions that operate on TF_KernelBuilder, an opaque struct obtained with the TF_NewKernelBuilder call.
The kernel builder is registered with TensorFlow using the TF_RegisterKernelBuilder function. See below for an example of registering the bitcast kernel using the C API:
#include <stdlib.h>  // calloc, free

#include "tensorflow/c/kernels.h"

typedef struct bitcast_kernel { … } bitcast_kernel;

// Bitcast_Create, Bitcast_Compute and Bitcast_Delete actually implement the
// kernel. See the section below for discussion on kernel implementation.
static void* Bitcast_Create(TF_OpKernelConstruction* context) {
  bitcast_kernel* k = (bitcast_kernel*) calloc(1, sizeof(bitcast_kernel));
  /* initialize the fields of k as needed */
  return (void*) k;
}

// Returns void, matching the compute_func prototype of TF_NewKernelBuilder.
static void Bitcast_Compute(void* k, TF_OpKernelContext* context) {
  // k is the pointer returned by Bitcast_Create.
  bitcast_kernel* kernel = (bitcast_kernel*) k;
  /* compute the result */
  TF_SetOutput(context, ...);
}

static void Bitcast_Delete(void* k) { free(k); }

void InitPlugin() {
  TF_KernelBuilder* builder = TF_NewKernelBuilder(
      /*op_name*/"Bitcast", DEVICE_CPU,
      &Bitcast_Create, &Bitcast_Compute, &Bitcast_Delete);
  TF_Status* status = TF_NewStatus();
  TF_RegisterKernelBuilder(/*kernel_name*/"Bitcast", builder, status);
  if (TF_GetCode(status) != TF_OK) { /* handle errors */ }
  TF_DeleteStatus(status);
}
The registration function prototypes are provided below. Kernel authors must provide a compute function. Creation and deletion functions are optional. However, if a creation function allocates memory, a deletion function that frees that memory should also be provided; otherwise the memory will leak.
TF_KernelBuilder* TF_NewKernelBuilder(
    const char* op_name, const char* device_name,
    void* (*create_func)(TF_OpKernelConstruction*),
    void (*compute_func)(void*, TF_OpKernelContext*),
    void (*delete_func)(void*));

void TF_RegisterKernelBuilder(const char* name, TF_KernelBuilder* builder,
                              TF_Status* status);
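To make the roles of the three callbacks concrete, here is a small mock of how a host might drive them. It is a sketch under stated assumptions rather than TensorFlow's behaviour: the Toy_ types are hypothetical stand-ins for the construction and compute contexts, both example kernels are invented, and treating a NULL result from a supplied creation function as a failure is a simplification. It shows a stateful kernel that pairs creation with deletion, and a stateless kernel that supplies only a compute function.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the construction and compute contexts. */
typedef struct { int scale; } Toy_Construction;
typedef struct { int input; int output; } Toy_Context;

/* Callback set with the same shapes as the prototypes above. */
typedef struct {
  void* (*create_func)(Toy_Construction*);    /* optional, may be NULL */
  void (*compute_func)(void*, Toy_Context*);  /* required */
  void (*delete_func)(void*);                 /* optional, may be NULL */
} Toy_KernelBuilder;

static int g_live_kernels = 0;  /* tracks create/delete pairing */

/* Plugin side: a stateful kernel that multiplies by a stored factor. */
typedef struct { int scale; } scale_kernel;

static void* Scale_Create(Toy_Construction* c) {
  scale_kernel* k = calloc(1, sizeof(*k));
  if (k == NULL) return NULL;
  k->scale = c->scale;
  g_live_kernels++;
  return k;
}

static void Scale_Compute(void* k, Toy_Context* ctx) {
  const scale_kernel* kernel = k;  /* pointer returned by Scale_Create */
  ctx->output = ctx->input * kernel->scale;
}

static void Scale_Delete(void* k) {
  free(k);
  g_live_kernels--;
}

/* Plugin side: a stateless kernel that needs no create or delete. */
static void Negate_Compute(void* k, Toy_Context* ctx) {
  (void)k;  /* NULL, because no creation function was supplied */
  ctx->output = -ctx->input;
}

const Toy_KernelBuilder kScaleKernel = { Scale_Create, Scale_Compute,
                                         Scale_Delete };
const Toy_KernelBuilder kNegateKernel = { NULL, Negate_Compute, NULL };

/* Host side: create once, compute, then delete. Returns 0 on success. */
int Toy_Run(const Toy_KernelBuilder* b, Toy_Construction* c,
            Toy_Context* ctx) {
  void* state = NULL;
  if (b->compute_func == NULL) return -1;
  if (b->create_func != NULL) {
    state = b->create_func(c);
    if (state == NULL) return -1;  /* this sketch treats NULL as failure */
  }
  b->compute_func(state, ctx);
  if (b->delete_func != NULL) b->delete_func(state);
  return 0;
}

/* Usage: run one kernel invocation and return its output. */
int Toy_Demo(const Toy_KernelBuilder* b, int scale, int input) {
  Toy_Construction c = { scale };
  Toy_Context ctx = { input, 0 };
  int rc = Toy_Run(b, &c, &ctx);
  assert(rc == 0);
  (void)rc;
  return ctx.output;
}

int Toy_LiveKernels(void) { return g_live_kernels; }
```

The live-kernel counter returning to zero after a run is the property the deletion function exists to guarantee: whatever the creation function allocates is released exactly once.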
The main classes for C++ kernel implementations are OpKernelConstruction (provided by TensorFlow to the kernel constructor) and OpKernelContext (provided to the kernel's Compute method). The analogues in the C API are TF_OpKernelConstruction and TF_OpKernelContext. The aim of the C API is to provide functions for working with these structs that match, as closely as possible, the C++ API.
Kernels must be able to retrieve their inputs and provide outputs. In the C++ API, the tensorflow::OpKernelContext::GetInput and SetOutput family of functions provide this functionality. The equivalent C calls will be TF_GetInput and TF_SetOutput. These functions operate on TF_Tensor, which is already part of the existing TensorFlow C API.
String tensors will be supported in an ABI-stable way. This will require changes to their binary representation, which are described in the tstring design document.
As described above, the main motivation for providing a C API is ABI stability. However, some programmers may find the C API less convenient than the non-ABI-stable C++ API. To address this concern, we plan to provide a header-only C++ API that is implemented in terms of the ABI-stable C API. This API will contain classes such as Tensor, OpKernelContext, and OpKernelConstruction, whose names will be familiar to existing C++ API users.
Ideally, this API will be as close as possible to the existing non-ABI-stable TensorFlow C++ API, so that kernels and ops currently implemented in C++ may be ported to the ABI-stable C++ API with as little implementation churn as possible.