The first step in deploying the trained keyword spotting models on microcontrollers is quantization, which is described here. This directory contains example code and steps for running a quantized DNN model on any Cortex-M board using mbed-cli and the CMSIS-NN library. It also includes an example that integrates the KWS model onto a Cortex-M development board with an on-board microphone to demonstrate keyword spotting on live audio data.
Clone the CMSIS-5 library, which contains the optimized neural network kernels for Cortex-M.
cd Deployment
git clone https://github.com/ARM-software/CMSIS_5.git
Install mbed-cli and its Python dependencies.
pip install mbed-cli
In this example, KWS inference is run on audio data provided through a .h file. First, create a new project. If this is the first project created after installing mbed-cli, install any Python dependencies you are prompted for.
mbed new kws_simple_test --mbedlib
Fetch the required mbed libraries for compilation.
cd kws_simple_test
mbed deploy
Compile the code for the mbed board (for example NUCLEO_F411RE).
mbed compile -m NUCLEO_F411RE -t GCC_ARM --source . \
--source ../Source/KWS --source ../Source/NN --source ../Source/MFCC \
--source ../Source/local_NN --source ../Examples/simple_test \
--source ../CMSIS_5/CMSIS/NN/Include --source ../CMSIS_5/CMSIS/NN/Source \
--source ../CMSIS_5/CMSIS/DSP/Include --source ../CMSIS_5/CMSIS/DSP/Source \
--source ../CMSIS_5/CMSIS/Core/Include \
--profile ../release_O3.json -j 8
Copy the binary (.bin) to the board (make sure the board is detected and mounted). Then open a serial terminal (e.g. PuTTY or minicom) to see the final classification output.
cp ./BUILD/NUCLEO_F411RE/GCC_ARM/kws_simple_test.bin /media/<user>/NODE_F411RE/
sudo minicom
Run KWS inference on live audio on the STM32F746NG discovery kit
This example runs keyword spotting inference on live audio captured using the on-board microphones on the STM32F746NG discovery kit. When performing keyword spotting on live audio data with multiple noise sources, outputs are typically averaged over a specified window to generate smooth predictions. The averaging window length and the detection threshold (which may also be different for each keyword) are two key parameters in determining the overall keyword spotting accuracy and user experience.
mbed new kws_realtime_test --create-only
cd kws_realtime_test
cp ../Examples/realtime_test/mbed_libs/*.lib .
mbed deploy
mbed compile -m DISCO_F746NG -t GCC_ARM \
--source . --source ../Source --source ../Examples/realtime_test \
--source ../CMSIS_5/CMSIS/NN/Include --source ../CMSIS_5/CMSIS/NN/Source \
--source ../CMSIS_5/CMSIS/DSP/Include --source ../CMSIS_5/CMSIS/DSP/Source \
--source ../CMSIS_5/CMSIS/Core/Include \
--profile ../release_O3.json -j 8
cp ./BUILD/DISCO_F746NG/GCC_ARM/kws_realtime_test.bin /media/<user>/DIS_F746NG/
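The averaging window and per-keyword detection threshold described for this example can be illustrated with a small host-side sketch. This is not the code that runs on the board: the class name `SmoothedDetector`, its parameters, and the score values are all hypothetical, and it assumes the model produces one score per class for each inference frame.

```python
from collections import deque


class SmoothedDetector:
    """Average per-frame class scores over a sliding window, then threshold.

    Hypothetical helper for illustration only; not part of the KWS sources.
    """

    def __init__(self, num_classes, window_len, thresholds):
        if len(thresholds) != num_classes:
            raise ValueError("need one threshold per class")
        self.num_classes = num_classes
        self.thresholds = list(thresholds)
        # deque with maxlen drops the oldest frame automatically
        self.window = deque(maxlen=window_len)

    def update(self, scores):
        """Add one frame of scores.

        Returns (class_index, averaged_score) if the best averaged score
        reaches that class's threshold, otherwise None.
        """
        if len(scores) != self.num_classes:
            raise ValueError("unexpected number of scores")
        self.window.append(list(scores))
        n = len(self.window)
        averaged = [
            sum(frame[c] for frame in self.window) / n
            for c in range(self.num_classes)
        ]
        best = max(range(self.num_classes), key=lambda c: averaged[c])
        if averaged[best] >= self.thresholds[best]:
            return best, averaged[best]
        return None


if __name__ == "__main__":
    # Two classes, 3-frame window, made-up thresholds and scores.
    detector = SmoothedDetector(num_classes=2, window_len=3, thresholds=[0.6, 0.6])
    for frame_scores in ([0.9, 0.1], [0.1, 0.9], [0.1, 0.9], [0.1, 0.9]):
        print(frame_scores, "->", detector.update(frame_scores))
```

A longer window gives smoother predictions but delays detection, while a higher threshold reduces false triggers at the cost of missed keywords, so both values usually need tuning for the target environment.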
Build an example on FRDM-K64F using gcc and make
To build this example, clone the CMSIS_5 repository and then run make. This example was created by exporting a simple hello-world example from the mbed online compiler and editing the Makefile to incorporate the source files required for the keyword spotting example.
cd Deployment
# Clone CMSIS_5 repository (if not done already)
git clone https://github.com/ARM-software/CMSIS_5.git
cd Examples/simple_test_k64f_gcc
make -j 8
# copy binary to the device
cp ./BUILD/simple_test_k64f_gcc.bin /media/<user>/DAPLINK/
Note: The examples provided use floating-point operations for MFCC feature extraction, but it should be possible to convert them to fixed-point operations for deployment on microcontrollers that do not have a dedicated floating-point unit.
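As a rough idea of what such a conversion involves, the sketch below applies a window function to 16-bit audio samples using only integer arithmetic in Q15 format (values scaled by 2^15). It is a host-side illustration under stated assumptions, not the repository's MFCC code: the helper names are hypothetical, and the Hann window is used purely as an example of one floating-point step that could be precomputed and applied in fixed point.

```python
import math

Q15_ONE = 1 << 15   # Q15 scale factor: 1.0 maps to 32768
Q15_MAX = 32767
Q15_MIN = -32768


def saturate_q15(value):
    """Clamp an integer to the signed 16-bit range."""
    return max(Q15_MIN, min(Q15_MAX, value))


def float_to_q15(x):
    """Quantize a float in [-1, 1) to a saturated Q15 integer."""
    return saturate_q15(int(round(x * Q15_ONE)))


def q15_mul(a, b):
    """Multiply two Q15 values with rounding and saturation."""
    # The product of two Q15 numbers is Q30; add half an LSB, shift back to Q15.
    return saturate_q15((a * b + (1 << 14)) >> 15)


def hann_window_q15(length):
    """Precompute window coefficients once (floats are only used here)."""
    return [
        float_to_q15(0.5 - 0.5 * math.cos(2.0 * math.pi * n / length))
        for n in range(length)
    ]


def apply_window_q15(samples, window):
    """Apply the window to 16-bit samples using integer operations only."""
    return [q15_mul(s, w) for s, w in zip(samples, window)]


if __name__ == "__main__":
    window = hann_window_q15(4)
    print(window)
    print(apply_window_q15([32767, 32767, -32768, 1000], window))
```

On a target without a floating-point unit, the coefficient table would typically be generated offline and stored as constants, so that only the integer multiply runs on the device.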