[TTS] Update VITS to support VITS and its voice cloning training on AIShell-3 #2268
Conversation
Thank you for your contribution. I've been busy publishing the release recently, and I'll have plenty of time to review your code at the end of September. If you are in mainland China, you can scan the WeChat QR code on the homepage to join our user group and add '子龙' for discussion.
This pull request is now in conflict :(
LGTM
PR types
New features | Bug fixes
PR changes
APIs | Docs
Describe
New features:
Add examples for training VITS and its voice cloning variant (VITS-VC) on the AISHELL-3 dataset (see Extra below).
Bug fixes:
VITS inference code: remove the extra list wrapping when building `feats_lengths`:
```python
# before
feats_lengths = paddle.to_tensor([paddle.shape(feats)[2]])
# after
feats_lengths = paddle.to_tensor(paddle.shape(feats)[2])
```
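A minimal standalone sketch of why this matters (the `(batch, odim, time)` feature layout and the toy sizes are assumptions, not from this PR):
```python
import paddle

# Toy stand-in for the mel features used in VITS inference;
# shape (batch, odim, time) is assumed here for illustration.
feats = paddle.randn([1, 80, 137])

# Before: wrapping the indexed shape tensor in a Python list adds an
# extra nesting level, so the resulting lengths tensor has the wrong shape
# for the downstream masking code.
bad_lengths = paddle.to_tensor([paddle.shape(feats)[2]])

# After: pass the shape tensor directly to get a lengths tensor that
# simply holds the time dimension (137 here).
feats_lengths = paddle.to_tensor(paddle.shape(feats)[2])

print(bad_lengths.shape, feats_lengths.shape)
```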
Extra:
I added docs for training VITS and VITS-VC on the AISHELL-3 dataset, but left a TODO label for the pretrained models. To verify that training runs normally, I trained the two new examples on a single AI Studio V100 16 GB card with slightly modified settings (batch_size=24) for 25000 steps; a sketch of that change is shown below. I hope the official team can release pretrained models trained on 4 cards in the future.
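For reference, here is a minimal sketch of lowering the batch size before launching the example on a single card. The path `conf/default.yaml`, the derived file name, and the top-level `batch_size` field are assumptions about the example layout, not confirmed by this PR:
```python
import yaml

# Load the example's default training config (assumed path).
with open("conf/default.yaml") as f:
    config = yaml.safe_load(f)

# The default config targets multi-card training; 24 is the value
# used in this PR's single V100 16 GB verification run.
config["batch_size"] = 24

# Write out a single-card variant to pass to the training script.
with open("conf/single_card.yaml", "w") as f:
    yaml.safe_dump(config, f)
```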
Here are some outputs of my 25000-step models:
test_vits.zip
test_e2e_vits.zip
test_vits_vc.zip
vc_syn_vits_vc_src_text.zip
vc_syn_vits_vc_src_audio.zip