Python wrapper for the excellent phonetisaurus grapheme to phoneme tool (license).
Includes pre-built binaries for:
x86_64
- desktop/laptop/server (64-bit)armv6l
- Raspberry Pi 0/1armv7l
- Raspberry Pi 2/3/4 (32-bit)aarch64
- Raspberry Pi 3/4 (64-bit)
- Python 3.7+
- Linux
- Tested with Debian Buster
For x86_64
systems:
$ pip install phonetisaurus
For Raspberry Pi, see Releases for compatible wheels:
- Raspberry Pi 0/1
phonetisaurus-<VERSION>-py3-none-linux_armv6l.whl
- Raspberry Pi 2/3/4 (32-bit)
phonetisaurus-<VERSION>-py3-none-linux_armv7l.whl
- Raspberry Pi 3/4 (64-bit)
phonetisaurus-<VERSION>-py3-none-linux_aarch64.whl
Assuming you have a lexicon formatted like the CMU pronouncing dictionary:
word1 phoneme1 phoneme2 ...
word2 phoneme1 phoneme2 phoneme3 ...
saved to lexicon.dict
run:
$ phonetisaurus train --model /path/to/write/g2p.fst /path/to/lexicon.dict
You may supply more than one lexicon.
See phonetisaurus train --help
for more options.
$ phonetisaurus predict --model /path/to/g2p.fst word1 word2 ...
If no words are provided on the command line, they will be read line-by-line from standard in.
You may optionally supply one or more --lexicon /path/to/lexicon.dict
arguments to avoid guessing pronunciations for known words.
See phonetisaurus predict --help
for more options.