Add MatchaTTS for the Chinese dataset Baker #1849

Merged: 29 commits, Dec 31, 2024


csukuangfj
Collaborator

Usage

cd egs/baker_zh/TTS

./prepare.sh

python3 ./matcha/train.py \
  --exp-dir ./matcha/exp-1/ \
  --num-workers 4 \
  --world-size 1 \
  --num-epochs 2000 \
  --max-duration 1200 \
  --bucketing-sampler 1 \
  --start-epoch 1

You can tune --num-epochs and --max-duration.
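
As a rough guide to what `--max-duration` controls: in icefall recipes this value is typically handed to a lhotse sampler as the maximum total audio duration, in seconds, per batch. The sketch below is a simplified greedy illustration of that idea, not lhotse's actual bucketing algorithm. The function name `batch_by_duration` and the sample durations are hypothetical.

```python
from typing import List


def batch_by_duration(durations: List[float], max_duration: float) -> List[List[float]]:
    """Greedily group utterance durations (in seconds) so that each
    batch sums to at most max_duration."""
    batches: List[List[float]] = []
    current: List[float] = []
    total = 0.0
    for d in durations:
        # Close the current batch if this utterance would exceed the budget.
        if current and total + d > max_duration:
            batches.append(current)
            current, total = [], 0.0
        current.append(d)
        total += d
    if current:
        batches.append(current)
    return batches


# Hypothetical utterance lengths in seconds.
durations = [4.0, 6.0, 5.0, 3.0, 7.0]
batches = batch_by_duration(durations, max_duration=10.0)
print(batches)
```

A smaller `--max-duration` generally lowers per-batch memory use at the cost of more steps per epoch.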

I will upload the model after it finishes training.


Its data/tokens.txt looks like this:

(py38) kuangfangjun:TTS$ wc -l data/tokens.txt
2069 data/tokens.txt
(py38) kuangfangjun:TTS$ head -n 20 data/tokens.txt
  0
_ 1
, 2
. 3
! 4
? 5
: 6
" 7
' 8
a 9
a1 10
a2 11
a3 12
a4 13
ai 14
ai1 15
ai2 16
ai3 17
ai4 18
an 19
(py38) kuangfangjun:TTS$ tail -n20 ./data/tokens.txt
zuan 2049
zuan1 2050
zuan2 2051
zuan3 2052
zuan4 2053
zui 2054
zui1 2055
zui2 2056
zui3 2057
zui4 2058
zun 2059
zun1 2060
zun2 2061
zun3 2062
zun4 2063
zuo 2064
zuo1 2065
zuo2 2066
zuo3 2067
zuo4 2068

tokens.txt
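
For readers who want to work with this file, here is a minimal sketch of loading it into a token-to-id map. It assumes each line is `<token> <id>` separated by a single space, and that the first line holds a space character as the token with id 0, as the listing above suggests. The helpers `read_tokens` and `split_tone` are hypothetical and are not taken from the recipe.

```python
import io
from typing import Dict, TextIO, Tuple


def read_tokens(f: TextIO) -> Dict[str, int]:
    """Parse '<token> <id>' lines into a dict."""
    token2id: Dict[str, int] = {}
    for line in f:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split on the LAST space only, so that a token which is itself
        # a space (assumed to be the first line) is preserved.
        token, idx = line.rsplit(" ", 1)
        token2id[token] = int(idx)
    return token2id


def split_tone(token: str) -> Tuple[str, int]:
    """Split 'zuo4' into ('zuo', 4). Tokens without a trailing digit
    are returned with 0 as a placeholder tone."""
    if token and token[-1] in "0123456789":
        return token[:-1], int(token[-1])
    return token, 0


# A few lines copied from the listing above.
sample = io.StringIO("  0\n_ 1\n, 2\na 9\na1 10\nzuo4 2068\n")
token2id = read_tokens(sample)
print(token2id["zuo4"], split_tone("zuo4"))
```

With the real file, `read_tokens(open("data/tokens.txt", encoding="utf-8"))` would be the equivalent call.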


csukuangfj commented Dec 27, 2024

Pre-trained checkpoints and logs can be found at
https://huggingface.co/csukuangfj/icefall-tts-baker-matcha-zh-2024-12-27


python3 ./matcha/infer.py \
  --epoch 2000 \
  --exp-dir ./matcha/exp-1 \
  --vocoder ./generator_v2 \
  --tokens ./data/tokens.txt \
  --cmvn ./data/fbank/cmvn.json \
  --input-text "当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔。" \
  --output-wav ./generated.wav

The command above generates the following wav file:

generated.mov
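
On the `--cmvn` argument: the acoustic model is usually trained on mel features normalized with a global mean and standard deviation, so its output has to be mapped back before it is passed to the vocoder. The sketch below illustrates that step only. It assumes the JSON file stores two scalars under the keys `fbank_mean` and `fbank_std`; the values shown are made up, and `denormalize` here is a stand-in, not the recipe's own function.

```python
import json
from typing import List


def denormalize(mel: List[float], mean: float, std: float) -> List[float]:
    """Undo (x - mean) / std normalization on a flat list of mel values."""
    return [x * std + mean for x in mel]


# Made-up statistics; the key names are an assumption about cmvn.json.
stats = json.loads('{"fbank_mean": -1.5, "fbank_std": 2.0}')

normalized = [0.0, 1.0, -0.5]
restored = denormalize(normalized, stats["fbank_mean"], stats["fbank_std"])
print(restored)
```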

python3 ./matcha/onnx_pretrained.py \
  --acoustic-model ./model-steps-4.onnx \
  --vocoder ./hifigan_v2.onnx \
  --tokens ./data/tokens.txt \
  --lexicon ./lexicon.txt \
  --input-text "在一个阳光明媚的夏天,小马、小羊和小狗它们一块儿在广阔的草地上,嬉戏玩耍,这时小猴来了,还带着它心爱的足球活蹦乱跳地跑前、跑后教小马、小羊、小狗踢足球。" \
  --output-wav ./1.wav
1.mov
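
Unlike `infer.py`, the ONNX command also takes `--lexicon`. A plausible reading is that the script looks up each segmented word in the lexicon to get its pinyin tokens and then maps those tokens to ids via `tokens.txt`. The sketch below shows that lookup under two assumptions: each lexicon line has the form `<word> <token1> <token2> ...`, and the input has already been segmented into words. The function names, sample entries, and token ids are hypothetical.

```python
import io
from typing import Dict, List, TextIO


def read_lexicon(f: TextIO) -> Dict[str, List[str]]:
    """Parse '<word> <token1> <token2> ...' lines into a dict."""
    lexicon: Dict[str, List[str]] = {}
    for line in f:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        lexicon[fields[0]] = fields[1:]
    return lexicon


def words_to_ids(
    words: List[str],
    lexicon: Dict[str, List[str]],
    token2id: Dict[str, int],
) -> List[int]:
    """Map words to token ids; words or tokens not found are skipped."""
    ids: List[int] = []
    for word in words:
        for token in lexicon.get(word, []):
            if token in token2id:
                ids.append(token2id[token])
    return ids


# Hypothetical lexicon entries and token ids, for illustration only.
lexicon = read_lexicon(io.StringIO("足球 zu2 qiu2\n草地 cao3 di4\n"))
token2id = {"zu2": 1, "qiu2": 2, "cao3": 3, "di4": 4}

ids = words_to_ids(["草地", "足球", "嬉戏"], lexicon, token2id)
print(ids)
```

Skipping out-of-vocabulary words silently is a choice made for brevity here; a real pipeline would more likely log them or fall back to per-character lookup.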

@csukuangfj csukuangfj requested a review from JinZr December 30, 2024 10:13
@csukuangfj csukuangfj merged commit bfffda5 into k2-fsa:master Dec 31, 2024
9 checks passed
@csukuangfj csukuangfj deleted the baker-matcha branch December 31, 2024 09:17