[TTS]Cantonese FastSpeech2 Training, test=tts #2907

WongLaw · 2023-02-10T12:24:34Z

Cantonese FastSpeech2 Training, test=tts

examples/canton/tts3/local/preprocess.sh

yt605155624 · 2023-02-13T04:38:04Z

examples/canton/tts3/README.md

+
+### Training details can refer to the script of examples/aishell3/tts3.
+
+## Pretrained Model(Waiting========)


The model has finished training by now, right? I think the pretrained model can be uploaded.

yt605155624 · 2023-02-13T04:38:30Z

examples/canton/tts3/README.md

+└── speech_stats.npy        # statistics used to normalize spectrogram when training fastspeech2
+```
+You can use the following scripts to synthesize for `${BIN_DIR}/../sentences.txt` using pretrained fastspeech2 and parallel wavegan models.
+```bash


Update this section once the text frontend has been written.

examples/canton/tts3/path.sh

yt605155624 · 2023-02-13T04:40:57Z

paddlespeech/t2s/exps/fastspeech2/preprocess.py

+            str(fp), sr=config.fs,
+            mono=False) if "canton" in str(fp) else librosa.load(
+                str(fp), sr=config.fs)
+        if len(wav.shape) == 2 and "canton" in str(fp):


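The MFA step requires the dataset to be placed directly under datasets/, but this code requires it under datasets/canton_all. Please make the two consistent; if it is hard to change here, change the MFA side instead.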
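For context on the hunk above: with `mono=False`, `librosa.load` returns an array of shape `(n_channels, n_samples)` for multi-channel files, which is why the diff checks `len(wav.shape) == 2`. How the PR reduces the channels after that check is not shown in this hunk. A minimal sketch, assuming the channels are simply averaged (the helper name `to_mono` is hypothetical and not part of PaddleSpeech):

```python
import numpy as np


def to_mono(wav):
    """Collapse a (n_channels, n_samples) array to 1-D by averaging channels.

    Hypothetical helper for illustration only; the PR may instead pick a
    single channel. 1-D input is returned unchanged.
    """
    wav = np.asarray(wav, dtype=np.float64)
    if wav.ndim == 2:
        # Axis 0 is the channel axis when the file was loaded with mono=False.
        return wav.mean(axis=0)
    return wav


stereo = np.array([[1.0, 3.0], [3.0, 5.0]])  # two channels, two samples
mono = to_mono(stereo)
```

Averaging keeps content from every channel; selecting one channel would be the other obvious choice when the channels are near-duplicates.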

examples/canton/tts3/README.md

examples/canton/tts3/local/train.sh

examples/canton/tts3/local/synthesize.sh

examples/canton/tts3/conf/default.yaml

examples/canton/tts3/local/preprocess.sh

yt605155624 · 2023-02-13T12:07:24Z

PaddleSpeech/paddlespeech/t2s/exps/fastspeech2/preprocess.py

Line 104 in 66a9cf8

f0 = pitch_extractor.get_pitch(wav, duration=np.array(durations))

When f0 is all zeros, return None so that the utterance is filtered out.
PaddleSpeech/paddlespeech/t2s/datasets/get_feats.py

Line 105 in 66a9cf8

print("All frames seems to be unvoiced.")

Change the print message here to: All frames seems to be unvoiced, this utt will be removed.
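The two requests above can be sketched roughly as follows. This is an illustration under assumptions, not the code that landed in the PR: `check_f0` and `filter_unvoiced` are hypothetical names, and `records` is assumed to be a list of `(utt_id, f0)` pairs rather than the real preprocessing records.

```python
import numpy as np


def check_f0(f0):
    """Return the f0 contour, or None when every frame is unvoiced (f0 == 0)."""
    f0 = np.asarray(f0, dtype=np.float64)
    if f0.size == 0 or not np.any(f0 != 0):
        # Message text taken from the review comment.
        print("All frames seems to be unvoiced, this utt will be removed.")
        return None
    return f0


def filter_unvoiced(records):
    """Drop every (utt_id, f0) pair whose contour has no voiced frame."""
    kept = []
    for utt_id, f0 in records:
        checked = check_f0(f0)
        if checked is not None:
            kept.append((utt_id, checked))
    return kept


# Hypothetical utterance ids and pitch values, for illustration only.
records = [("utt_a", [0.0, 0.0, 0.0]), ("utt_b", [0.0, 120.0, 118.5])]
kept = filter_unvoiced(records)
```

Returning None from the per-utterance step lets the caller drop the record in one place instead of special-casing unvoiced audio downstream.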

yt605155624 · 2023-02-13T13:35:16Z

examples/canton/tts3/conf/default.yaml

+
+# Only used for feats_type != raw
+
+fmin: 80           # Minimum frequency of Mel basis.


Should fmin also be changed to 110? Otherwise I think the features will include noise.

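Judging by the current synthesis quality, this is not a big problem. Changing it would make the value mismatch the pretrained vocoder's parameters and might hurt synthesis quality, so let's revisit this later.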
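The concern in the reply is that the acoustic model and the pretrained vocoder must agree on their mel feature parameters. A minimal sketch of such a consistency check, assuming both configs are available as plain dicts (the function name `mismatched_keys` is hypothetical, and the vocoder value of 80 is assumed from the current default.yaml):

```python
def mismatched_keys(acoustic_conf, vocoder_conf, keys=("fmin",)):
    """Return {key: (acoustic_value, vocoder_value)} for every differing key."""
    return {
        k: (acoustic_conf.get(k), vocoder_conf.get(k))
        for k in keys
        if acoustic_conf.get(k) != vocoder_conf.get(k)
    }


acoustic = {"fmin": 110}  # value proposed in the review comment
vocoder = {"fmin": 80}    # assumed to match the pretrained vocoder
diff = mismatched_keys(acoustic, vocoder)
print(diff)  # {'fmin': (110, 80)}
```

An empty result means the checked keys agree; other feature parameters can be added to `keys` as needed.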

yt605155624 · 2023-02-14T05:12:18Z

PaddleSpeech/paddlespeech/t2s/exps/fastspeech2/preprocess.py

Line 104 in 66a9cf8

f0 = pitch_extractor.get_pitch(wav, duration=np.array(durations))

When f0 is all zeros, return None so that the utterance is filtered out.

PaddleSpeech/paddlespeech/t2s/datasets/get_feats.py

Line 105 in 66a9cf8

print("All frames seems to be unvoiced.")

Change the print message here to: All frames seems to be unvoiced, this utt will be removed.

resolved

yt605155624 · 2023-02-14T06:06:54Z

examples/canton/tts3/run.sh

+    CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
+fi
+
+if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then


Once the text frontend is added, these later stages can be removed.

yt605155624

LGTM

WongLaw added 18 commits October 20, 2022 11:06

Merge branch 'PaddlePaddle:develop' into develop

f90e242

Merge branch 'PaddlePaddle:develop' into develop

b2597bc

Merge branch 'PaddlePaddle:develop' into develop

0ab03d8

Merge branch 'PaddlePaddle:develop' into develop

a20ca46

Merge branch 'PaddlePaddle:develop' into develop

26a6fb5

Add rhythm tags for MFA, test=tts

0cee810

Add rhythm tags for MFA, test=tts

d7e2931

Merge branch 'PaddlePaddle:develop' into develop

832ff0e

Revised Rhythm label for MFA, test=tts

7939884

Merge branch 'PaddlePaddle:develop' into develop

05447ea

Merge branch 'PaddlePaddle:develop' into develop

f28d0a1

Merge branch 'PaddlePaddle:develop' into develop

a6adcc4

Merge branch 'PaddlePaddle:develop' into develop

696ee80

Merge branch 'PaddlePaddle:develop' into develop

5525468

Merge branch 'PaddlePaddle:develop' into develop

2430e13

Merge branch 'PaddlePaddle:develop' into develop

0ab7fb0

Merge branch 'PaddlePaddle:develop' into develop

859e8d2

Cantonese FastSpeech2 Training, test=tts

3131c9b

WongLaw added the T2S label Feb 10, 2023

WongLaw added this to the r1.4.0 milestone Feb 10, 2023

WongLaw requested a review from yt605155624 February 10, 2023 12:24

WongLaw self-assigned this Feb 10, 2023

mergify bot added Example README labels Feb 10, 2023

Cantonese FastSpeech2 Training, test=tts

4f144fa

yt605155624 changed the title ~~Cantonese FastSpeech2 Training, test=tts~~ [TTS]Cantonese FastSpeech2 Training, test=tts Feb 13, 2023