Add doc about how to convert piper models to sherpa-onnx (#516)
1 parent 10444b9, commit 3d04d06
Showing 5 changed files with 198 additions and 1 deletion.
Binary file not shown.
@@ -0,0 +1,68 @@
#!/usr/bin/env python3

import json
from typing import Any, Dict

import onnx


def add_meta_data(filename: str, meta_data: Dict[str, Any]):
    """Add meta data to an ONNX model. It is changed in-place.

    Args:
      filename:
        Filename of the ONNX model to be changed.
      meta_data:
        Key-value pairs.
    """
    model = onnx.load(filename)
    for key, value in meta_data.items():
        meta = model.metadata_props.add()
        meta.key = key
        meta.value = str(value)

    onnx.save(model, filename)


def load_config(model):
    with open(f"{model}.json", "r", encoding="utf-8") as file:
        config = json.load(file)
    return config


def generate_tokens(config):
    id_map = config["phoneme_id_map"]
    with open("tokens.txt", "w", encoding="utf-8") as f:
        for s, i in id_map.items():
            f.write(f"{s} {i[0]}\n")
    print("Generated tokens.txt")


def main():
    # Caution: Please change the filename
    filename = "en_US-amy-low.onnx"

    # The rest of the file should not be changed.
    # You only need to change the above filename = "xxx.onnx" in this file

    config = load_config(filename)

    print("generate tokens")
    generate_tokens(config)

    print("add model metadata")
    meta_data = {
        "model_type": "vits",
        "comment": "piper",  # must be piper for models from piper
        "language": config["language"]["name_english"],
        "voice": config["espeak"]["voice"],  # e.g., en-us
        "has_espeak": 1,
        "n_speakers": config["num_speakers"],
        "sample_rate": config["audio"]["sample_rate"],
    }
    print(meta_data)
    add_meta_data(filename, meta_data)


if __name__ == "__main__":
    main()
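
To make the layout of ``tokens.txt`` concrete, the following minimal sketch repeats the loop from ``generate_tokens`` on a tiny, made-up ``phoneme_id_map``. The symbols and IDs in the sample are illustrative assumptions, not values taken from a real piper config, and ``format_tokens`` is a hypothetical helper that is not part of the script above.

```python
from typing import Dict, List


def format_tokens(id_map: Dict[str, List[int]]) -> str:
    """Return the text that the generate_tokens() loop would write.

    Each entry maps a symbol to a list of IDs. Only the first ID of each
    list is used, written as "<symbol> <id>" on its own line.
    """
    return "".join(f"{symbol} {ids[0]}\n" for symbol, ids in id_map.items())


if __name__ == "__main__":
    # Made-up sample; a real map comes from the model's .onnx.json file.
    sample = {"_": [0], "^": [1], " ": [3], "a": [14]}
    print(format_tokens(sample), end="")
```

Note that a symbol which is itself a space produces a line starting with two spaces: the symbol, then the separator, then the ID.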
@@ -12,4 +12,5 @@ to install `sherpa-onnx`_ before you continue.

   ./hf-space.rst
   ./pretrained_models/index
   ./piper
   ./faq
@@ -0,0 +1,128 @@
Piper
=====

In this section, we describe how to convert `piper`_ pre-trained models
from `<https://huggingface.co/rhasspy/piper-voices>`_ for use with `sherpa-onnx`_.

.. hint::

   You can find ``all`` of the converted models from `piper`_ at the following address:

   `<https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models>`_

   If you want to convert your own pre-trained `piper`_ models or if you want to
   learn how the conversion works, please read on.

   Otherwise, you only need to download the converted models from the above link.

Note that there are pre-trained models for over 30 languages from `piper`_. All models
are converted in the same way, so we use an American English model in this
section as an example.

Install dependencies
--------------------

.. code-block:: bash

   pip install onnx onnxruntime

.. hint::

   We suggest that you always use the latest version of onnxruntime.

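
As a quick sanity check after installation, the sketch below prints the installed versions of the two packages using only the Python standard library. ``installed_version`` is a hypothetical helper introduced for illustration; it assumes Python 3.8 or later, where ``importlib.metadata`` is available.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a pip package, or None if missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("onnx", "onnxruntime"):
        version = installed_version(name)
        print(f"{name}: {version if version else 'not installed'}")
```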
Find the pre-trained model from piper
-------------------------------------

All American English models from `piper`_ can be found at
`<https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US>`_.

We use `<https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/amy/low>`_ as
an example in this section.

Download the pre-trained model
------------------------------

We need to download two files for each model: the ONNX model itself and its JSON config file.

.. code-block:: bash

   wget https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/amy/low/en_US-amy-low.onnx
   wget https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/amy/low/en_US-amy-low.onnx.json

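
The conversion script ``piper.py`` reads a handful of fields from the JSON config. If you are converting your own model, it may help to confirm that those fields are present before running the conversion. The following is a minimal sketch of such a check; ``REQUIRED_KEYS``, ``missing_keys`` and ``check_config_file`` are hypothetical names introduced here, and the key list simply mirrors what the script accesses.

```python
import json
from typing import Any, Dict, List

# Key paths that the conversion script reads from the config
REQUIRED_KEYS = [
    ("phoneme_id_map",),
    ("language", "name_english"),
    ("espeak", "voice"),
    ("num_speakers",),
    ("audio", "sample_rate"),
]


def missing_keys(config: Dict[str, Any]) -> List[str]:
    """Return the dotted key paths that are absent from the config."""
    missing = []
    for path in REQUIRED_KEYS:
        node: Any = config
        for key in path:
            if not isinstance(node, dict) or key not in node:
                missing.append(".".join(path))
                break
            node = node[key]
    return missing


def check_config_file(path: str) -> List[str]:
    """Load a JSON config from disk and report its missing key paths."""
    with open(path, "r", encoding="utf-8") as f:
        return missing_keys(json.load(f))
```

For the example model, ``check_config_file("en_US-amy-low.onnx.json")`` should return an empty list if the config has every field the script needs.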
Add meta data to the onnx model
-------------------------------

Please use the following code to add meta data to the downloaded onnx model.

.. literalinclude:: ./code/piper.py
   :language: python

After running the above script, your ``en_US-amy-low.onnx`` is updated in place with
meta data, and a new file ``tokens.txt`` is generated in the current directory.

From now on, you don't need the config json file ``en_US-amy-low.onnx.json`` any longer.

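
To confirm that the meta data was actually written, you can read it back from the model file. The sketch below is one possible way to do that, assuming the ``onnx`` package installed earlier; ``props_to_dict`` and ``read_meta_data`` are hypothetical helper names, not part of `sherpa-onnx`_ or `piper`_.

```python
from typing import Any, Dict, Iterable


def props_to_dict(props: Iterable[Any]) -> Dict[str, str]:
    """Collect objects exposing .key and .value into a plain dict."""
    return {p.key: p.value for p in props}


def read_meta_data(filename: str) -> Dict[str, str]:
    """Load an ONNX model and return its metadata as a dict."""
    import onnx  # imported here so props_to_dict() works without onnx

    model = onnx.load(filename)
    return props_to_dict(model.metadata_props)
```

Calling ``read_meta_data("en_US-amy-low.onnx")`` should return the keys set by the conversion script, such as ``model_type`` and ``sample_rate``, with every value stored as a string. Because the script appends entries on each run, running it more than once on the same file leaves duplicate entries in the model; the dict built here keeps only the last value for each key.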
Download espeak-ng-data
-----------------------

.. code-block:: bash

   wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/espeak-ng-data.tar.bz2
   tar xf espeak-ng-data.tar.bz2

Note that ``espeak-ng-data.tar.bz2`` is shared by all models from `piper`_, no matter
which language you are using for your model.

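
If ``tar`` is not available on your system, the archive can also be unpacked from Python. The following is a minimal sketch using only the standard library; ``extract_archive`` and ``has_data_dir`` are hypothetical helpers, and the sketch assumes the archive has already been downloaded into the current directory.

```python
import os
import tarfile


def extract_archive(archive: str, dest: str = ".") -> None:
    """Extract a .tar.bz2 archive into dest, similar to `tar xf`."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def has_data_dir(path: str = "espeak-ng-data") -> bool:
    """Return True if path is an existing, non-empty directory."""
    return os.path.isdir(path) and len(os.listdir(path)) > 0
```

After ``extract_archive("espeak-ng-data.tar.bz2")``, ``has_data_dir()`` gives a quick confirmation that the ``espeak-ng-data`` directory used in the next step exists.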
Test your converted model
-------------------------

To have a quick test of your converted model, you can use

.. code-block:: bash

   pip install sherpa-onnx

to install `sherpa-onnx`_ and then use the following commands to test your model:

.. code-block:: bash

   # The command "pip install sherpa-onnx" will install several binaries,
   # including the following one

   which sherpa-onnx-offline-tts

   sherpa-onnx-offline-tts \
     --vits-model=./en_US-amy-low.onnx \
     --vits-tokens=./tokens.txt \
     --vits-data-dir=./espeak-ng-data \
     --output-filename=./test.wav \
     "How are you doing? This is a text-to-speech application using next generation Kaldi."

The above command should generate a wave file ``test.wav``.

.. raw:: html

   <table>
     <tr>
       <th>Wave filename</th>
       <th>Content</th>
       <th>Text</th>
     </tr>
     <tr>
       <td>test.wav</td>
       <td>
         <audio title="Generated ./test.wav" controls="controls">
           <source src="/sherpa/_static/piper/test.wav" type="audio/wav">
           Your browser does not support the <code>audio</code> element.
         </audio>
       </td>
       <td>
         How are you doing? This is a text-to-speech application using next generation Kaldi.
       </td>
     </tr>
   </table>

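
If you cannot play the file directly, you can still inspect it. The sketch below reads the header of ``test.wav`` with Python's standard ``wave`` module, assuming the file is PCM-encoded (the only encoding that module reads). ``wav_info`` is a hypothetical helper introduced for illustration.

```python
import wave
from typing import Tuple


def wav_info(path: str) -> Tuple[int, int, float]:
    """Return (sample_rate, num_channels, duration_in_seconds) of a WAV file."""
    with wave.open(path, "rb") as w:
        sample_rate = w.getframerate()
        num_channels = w.getnchannels()
        duration = w.getnframes() / float(sample_rate)
    return sample_rate, num_channels, duration
```

The sample rate returned by ``wav_info("test.wav")`` can be compared with the ``sample_rate`` value that the conversion script stored in the model's meta data.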
Congratulations! You have successfully converted a model from `piper`_ and run it with `sherpa-onnx`_.