[TTS]Cantonese FastSpeech2 Training, test=tts #2907
### Training details can refer to the script of examples/aishell3/tts3.
## Pretrained Model(Waiting========)
The model should have finished training by now, right? I think the pretrained model can be uploaded.
└── speech_stats.npy # statistics used to normalize spectrogram when training fastspeech2
```
You can use the following scripts to synthesize for `${BIN_DIR}/../sentences.txt` using pretrained fastspeech2 and parallel wavegan models.
```bash
This part will be updated once the text frontend is finished.
str(fp), sr=config.fs,
mono=False) if "canton" in str(fp) else librosa.load(
str(fp), sr=config.fs)
if len(wav.shape) == 2 and "canton" in str(fp):
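The MFA step requires the dataset to be placed directly under datasets/, but this code requires it under datasets/canton_all. Please make the two consistent; if it is hard to change here, change the MFA step instead.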
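To illustrate why the two layouts matter, here is a minimal sketch of the substring test used in the hunk above. The helper name `is_canton` is hypothetical, and the speaker and file names are made up; only `datasets/` and `datasets/canton_all` come from this thread.

```python
from pathlib import PurePosixPath


def is_canton(fp):
    """Same substring test as the diff: true only if 'canton' appears in the path."""
    return "canton" in str(fp)


# Hypothetical example paths; only the two directory layouts come from the review.
under_canton_all = PurePosixPath("datasets/canton_all/spk01/0001.wav")
directly_in_datasets = PurePosixPath("datasets/spk01/0001.wav")

print(is_canton(under_canton_all))      # True
print(is_canton(directly_in_datasets))  # False
```

Under these assumptions, files laid out the way the MFA step expects would not take the `mono=False` branch, which is one reason to settle on a single layout.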
# Only used for feats_type != raw

fmin: 80 # Minimum frequency of Mel basis.
Should fmin also be changed to 110? Otherwise I suspect it will include noise.
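Judging by the current synthesis quality, this is not much of a problem. Changing it would no longer match the parameters of the pretrained vocoder, which could hurt synthesis quality. We can revisit this later.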
resolved
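The mismatch concern raised in this thread can be checked mechanically. The following is a rough sketch, not code from the PR: the helper `mel_mismatches` and the key list are hypothetical, and the config values are illustrative except for the `fmin` values 80 and 110 discussed above.

```python
# Hypothetical key list: which fields must agree is an assumption, not taken from the PR.
MEL_KEYS = ("fs", "n_fft", "n_mels", "fmin", "fmax")


def mel_mismatches(acoustic_conf, vocoder_conf, keys=MEL_KEYS):
    """Return {key: (acoustic_value, vocoder_value)} for every key whose values differ."""
    diffs = {}
    for key in keys:
        a = acoustic_conf.get(key)
        v = vocoder_conf.get(key)
        if a != v:
            diffs[key] = (a, v)
    return diffs


# Illustrative configs: only fmin=80 and fmin=110 come from the review thread.
acoustic = {"fs": 24000, "n_fft": 2048, "n_mels": 80, "fmin": 110, "fmax": 7600}
vocoder = {"fs": 24000, "n_fft": 2048, "n_mels": 80, "fmin": 80, "fmax": 7600}

print(mel_mismatches(acoustic, vocoder))  # {'fmin': (110, 80)}
```

A check like this could run before synthesis so that a changed `fmin` is flagged when the acoustic model is paired with an existing pretrained vocoder.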
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi

if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then
Once the text frontend is added, these later stages can be removed.
LGTM