Coqui Inference Engine is a library for efficiently deploying speech models.
This project is at an early proof-of-concept stage. Collaboration on design and implementation is very welcome. Join our Gitter channel by clicking the badge above!
This project is the successor to the STT "native client", containing the core inference logic for deploying Coqui STT models (and eventually Coqui TTS and other models too).
Coqui Inference Engine aims to be:
- Fast: streaming inference with low latency on small devices (phones, IoT)
- Easy to use: simple, stable, well-documented API
- Available: easy to expose to different programming languages, available in standard package managers
- Extensible: able to handle different model types, architectures, and formats
To build the project, you'll need CMake 3.10 or later.
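A quick way to confirm the CMake requirement before starting is sketched below. The `version_ge` helper is a hypothetical name introduced only for this example, and it assumes a `sort` that supports `-V` (version sort), as GNU coreutils does.

```shell
# Hypothetical helper (not part of this repo): succeeds if version $1 >= version $2.
# Assumes `sort -V` (version sort) is available, e.g. from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required="3.10"
if command -v cmake >/dev/null 2>&1; then
    # The first line of `cmake --version` looks like "cmake version 3.22.1".
    found="$(cmake --version | head -n 1 | awk '{print $3}')"
    if version_ge "$found" "$required"; then
        echo "CMake $found is new enough"
    else
        echo "CMake $found is older than $required" >&2
    fi
else
    echo "cmake not found on PATH" >&2
fi
```

A plain string comparison would rank 3.5 above 3.10, which is why the sketch uses a version-aware sort.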
Currently you'll also have to build onnxruntime yourself and place the built files manually before building the inference engine, following the steps below:
$ # Clone the Coqui Inference Engine repo
$ git clone https://github.com/coqui-ai/inference-engine
$ # Clone onnxruntime repo
$ git clone --recursive https://github.com/microsoft/onnxruntime/
$ cd onnxruntime
$ # Build it
$ ./build.sh --config Debug --parallel
$ # Copy built files for inference engine build
$ cp -R build/*/*/libonnxruntime* ../inference-engine/onnxruntime/lib
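Because the copy step relies on a glob, it can be worth checking that the library files actually landed in place before moving on. The following is a minimal sketch, assuming it is run from the onnxruntime directory as in the steps above; `has_onnxruntime_libs` is a hypothetical helper name, not something provided by either repo.

```shell
# Hypothetical helper: succeeds if DIR contains at least one libonnxruntime* file.
has_onnxruntime_libs() {
    dir="$1"
    for f in "$dir"/libonnxruntime*; do
        # If the glob matched nothing, $f is the literal pattern and does not exist.
        [ -e "$f" ] && return 0
    done
    return 1
}

# Default to the destination used by the cp step; pass another path to override.
lib_dir="${1:-../inference-engine/onnxruntime/lib}"
if has_onnxruntime_libs "$lib_dir"; then
    echo "Found onnxruntime libraries in $lib_dir"
else
    echo "No libonnxruntime* files in $lib_dir; re-check the cp step" >&2
fi
```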
Now, we're ready to build the inference engine:
$ cd ../inference-engine
$ # Create build dir
$ mkdir build
$ cd build
$ # Prepare build
$ cmake -DCMAKE_BUILD_TYPE=Debug ..
$ # Build
$ make -j
You should now be able to run the test client by running ./main.
$ ./main --model ../output_graph.onnx --audio ../test-audio.wav
You can use the experimental-inference-engine-export branch of Coqui STT to export an STT checkpoint in the format expected by the inference engine.
$ git clone --branch experimental-inference-engine-export https://github.com/coqui-ai/STT
$ cd STT
$ python -m pip install -e .
$ cd native_client/ctcdecode
$ make bindings
$ python -m pip install --force-reinstall dist/*.whl
After the steps above, follow the documentation for exporting a model and include the --export_onnx true flag. The export should then produce an output_graph.onnx file that the inference engine can read.