`executorch` is a Rust library for executing PyTorch models. It is a Rust wrapper around the ExecuTorch C++ API. It depends on version 1.0.0 of the C++ library and will advance along with it.
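To use it, add the crate to your `Cargo.toml`. A sketch, with the version as a placeholder (pick the crate release that tracks the C++ library version you build against):

```toml
[dependencies]
# "x.y" is a placeholder, not a real version: choose the executorch crate
# release that matches your ExecuTorch C++ build (1.0.0 here).
executorch = "x.y"
```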
Create a model in Python and export it:

```python
import torch
from torch.export import export
from executorch.exir import to_edge_transform_and_lower

class Add(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x: torch.Tensor, y: torch.Tensor):
        return x + y

model = Add()
exported_program = export(model, (torch.ones(1), torch.ones(1)))
executorch_program = to_edge_transform_and_lower(exported_program).to_executorch()
with open("model.pte", "wb") as file:
    file.write(executorch_program.buffer)
```

Execute the model in Rust:
```rust
use executorch::evalue::{EValue, IntoEValue};
use executorch::module::Module;
use executorch::tensor_ptr;
use ndarray::array;

let mut module = Module::from_file_path("model.pte");

let (tensor1, tensor2) = (tensor_ptr![1.0_f32], tensor_ptr![1.0_f32]);
let inputs = [tensor1.into_evalue(), tensor2.into_evalue()];

let outputs = module.forward(&inputs).unwrap();
let [output]: [EValue; 1] = outputs.try_into().expect("not a single output");
let output = output.as_tensor().into_typed::<f32>();
println!("Output tensor computed: {:?}", output);

assert_eq!(array![2.0], output.as_array());
```

See `examples/hello_world` for a complete example.
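The snippet above expects exactly one output. For a model with several outputs, the returned vector can be walked directly instead of destructured; a minimal sketch using only the calls shown above, assuming every output holds an `f32` tensor:

```rust
// Sketch: iterate over all outputs instead of destructuring a fixed-size
// array. Assumes each output EValue holds an f32 tensor.
let outputs = module.forward(&inputs).unwrap();
for (i, output) in outputs.into_iter().enumerate() {
    let tensor = output.as_tensor().into_typed::<f32>();
    println!("output {i}: {:?}", tensor.as_array());
}
```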
To build the library, you must first build the C++ library. The C++ build is highly configurable: many flags control which modules, kernels, and extensions are compiled. Multiple static libraries are produced, and the Rust library links against them. In the following example we build the C++ library with the flags needed to run the `hello_world` example:
```bash
# Clone the C++ library
cd ${EXECUTORCH_CPP_DIR}
git clone --depth 1 --branch v1.0.0 https://github.com/pytorch/executorch.git .
git submodule sync --recursive
git submodule update --init --recursive

# Install requirements
./install_requirements.sh

# Build C++ library
mkdir cmake-out && cd cmake-out
cmake \
    -DEXECUTORCH_SELECT_OPS_LIST=aten::add.out \
    -DEXECUTORCH_BUILD_EXECUTOR_RUNNER=OFF \
    -DEXECUTORCH_BUILD_EXTENSION_RUNNER_UTIL=OFF \
    -DEXECUTORCH_BUILD_PORTABLE_OPS=ON \
    -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
    -DEXECUTORCH_BUILD_EXTENSION_FLAT_TENSOR=ON \
    -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
    -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
    -DEXECUTORCH_ENABLE_PROGRAM_VERIFICATION=ON \
    -DEXECUTORCH_ENABLE_LOGGING=ON \
    ..
make -j

# Static libraries are in cmake-out/
# core:
#   cmake-out/libexecutorch.a
#   cmake-out/libexecutorch_core.a
# kernels implementations:
#   cmake-out/kernels/portable/libportable_ops_lib.a
#   cmake-out/kernels/portable/libportable_kernels.a
# extension data loader, enabled with EXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON:
#   cmake-out/extension/data_loader/libextension_data_loader.a
# extension module, enabled with EXECUTORCH_BUILD_EXTENSION_MODULE=ON:
#   cmake-out/extension/module/libextension_module_static.a
# extension tensor, enabled with EXECUTORCH_BUILD_EXTENSION_TENSOR=ON:
#   cmake-out/extension/tensor/libextension_tensor.a
# devtools etdump, enabled with EXECUTORCH_BUILD_DEVTOOLS=ON:
#   cmake-out/devtools/libetdump.a

# Run example
# We set EXECUTORCH_RS_EXECUTORCH_LIB_DIR to the path of the C++ build output
cd ${EXECUTORCH_RS_DIR}/examples/hello_world
python export_model.py
EXECUTORCH_RS_EXECUTORCH_LIB_DIR=${EXECUTORCH_CPP_DIR}/cmake-out cargo run
```

The `executorch` crate will always look for the following static libraries:
- `libexecutorch.a`
- `libexecutorch_core.a`
Additional libraries are required if feature flags are enabled (see the next section):

- `libextension_data_loader.a`
- `libextension_module_static.a`
- `libextension_tensor.a`
- `libetdump.a`
The static libraries of the kernels implementations are required only if your model uses them, and they should be linked manually by the binary that uses the `executorch` crate. For example, the `hello_world` example uses a model with a single addition operation, so it compiles the C++ library with `EXECUTORCH_SELECT_OPS_LIST=aten::add.out` and contains the following lines in its `build.rs`:

```rust
println!("cargo::rustc-link-lib=static:+whole-archive=portable_kernels");
println!("cargo::rustc-link-lib=static:+whole-archive=portable_ops_lib");

let libs_dir = std::env::var("EXECUTORCH_RS_EXECUTORCH_LIB_DIR").unwrap();
println!("cargo::rustc-link-search=native={libs_dir}/kernels/portable/");
```

Note that the ops and kernels libraries are linked with `+whole-archive` to ensure that all symbols are included in the binary.
The build (and library) is tested on Ubuntu and macOS, not on Windows.
The crate exposes the following Cargo features:

- `data-loader`: Includes the `FileDataLoader` and `MmapDataLoader` structs. Without this feature the only available data loader is `BufferDataLoader`. The `libextension_data_loader.a` static library is required; compile the C++ `executorch` with `EXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON`.
- `module`: Includes the `Module` struct. The `libextension_module_static.a` static library is required; compile the C++ `executorch` with `EXECUTORCH_BUILD_EXTENSION_MODULE=ON`. Also enables the `std` feature.
- `tensor-ptr`: Includes the `TensorPtr` struct, a smart pointer for tensors that manages the lifetime of the tensor object alongside the lifetimes of the data buffer and additional metadata. The `libextension_tensor.a` static library is required; compile the C++ `executorch` with `EXECUTORCH_BUILD_EXTENSION_TENSOR=ON`. Also enables the `std` feature.
- `etdump`: Includes the `ETDumpGen` struct, an implementation of an `EventTracer`, used for debugging and profiling. The `libetdump.a` static library is required; compile the C++ `executorch` with `EXECUTORCH_BUILD_DEVTOOLS=ON` and `EXECUTORCH_ENABLE_EVENT_TRACER=ON`. In addition, the `flatcc` (or `flatcc_d`) library is required, available at `{CMAKE_DIR}/third-party/flatcc_ep/lib/`, and should be linked by the user (see the sketch after this list).
- `ndarray`: Conversions between `executorch` tensors and `ndarray` arrays. Adds a dependency on the `ndarray` crate. This feature is enabled by default.
- `half`: Adds a dependency on the `half` crate, which provides fully capable `f16` and `bf16` types. Without this feature enabled, both of these types are available only with simple conversions to/from `u16`. Note that this affects input/output tensors only; the internal computations can always operate on such scalars.
- `num-complex`: Adds a dependency on the `num-complex` crate, which provides a fully capable complex number type. Without this feature enabled, complex numbers are available as a simple struct with two public fields and no operations. Note that this affects input/output tensors only; the internal computations can always operate on such scalars.
- `std`: Enables the standard library. This feature is enabled by default, but can be disabled to build `executorch` in a `no_std` environment. See the `examples/no_std` example. Also enables the `alloc` feature. NOTE: `no_std` support is still WIP, see pytorch/executorch#4561.
- `alloc`: Enables allocations. When this feature is disabled, all methods that require allocations are not compiled. This feature is enabled by the `std` feature, which is enabled by default. It is possible to enable this feature without the `std` feature; allocations are then done using the `alloc` crate, which requires a global allocator to be set.
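For the `etdump` feature, the user-linked `flatcc` library mentioned above could be wired up with `build.rs` lines like these (a sketch; `EXECUTORCH_CPP_CMAKE_DIR` is a hypothetical variable standing in for the `{CMAKE_DIR}` placeholder):

```rust
// Sketch: link flatcc for the etdump feature. EXECUTORCH_CPP_CMAKE_DIR is a
// hypothetical env var standing in for the {CMAKE_DIR} placeholder above.
let cmake_dir = std::env::var("EXECUTORCH_CPP_CMAKE_DIR").unwrap();
println!("cargo::rustc-link-search=native={cmake_dir}/third-party/flatcc_ep/lib/");
// Use flatcc_d instead if the C++ library was built in debug mode.
println!("cargo::rustc-link-lib=static=flatcc");
```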
By default, the `std` and `ndarray` features are enabled.
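For example, a consumer crate could select features like this (a sketch; the version is a placeholder as before):

```toml
[dependencies]
# Module API, tensor smart pointers, and file/mmap data loaders,
# on top of the default std and ndarray features:
executorch = { version = "x.y", features = ["module", "tensor-ptr", "data-loader"] }

# Or, for a no_std build: drop the default features and keep allocations.
# executorch = { version = "x.y", default-features = false, features = ["alloc"] }
```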