Speaker manager for multi-speaker handling #441

erogol · 2021-04-21T11:22:45Z

SpeakerManager to interface multi-speaker models.
Use SpeakerManager in Synthesizer
Adapt SC-Glow models from @Edresson using SpeakerManager.
Tests for SpeakerManager
Update demo server to use multi-speaker models.
Update synthesize.py to use multi-speaker models.
Update training scripts to use SpeakerManager in multi-speaker case.

Enabling multi-speaker models means:

Let people list the speakers of a multi-speaker model.
Choose a certain speaker to generate voice.
Let people upload voice samples for the multi-speaker model to mimic.
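
As a rough illustration of the three capabilities above, here is a minimal sketch in plain Python. It is not the SpeakerManager class from this PR: `ToySpeakerManager`, its method names, and the 3-dimensional embeddings are all hypothetical, chosen only to show listing speakers, looking up one speaker by id, and averaging the embeddings of several uploaded samples into a single vector.

```python
from statistics import fmean
from typing import Dict, List, Sequence


class ToySpeakerManager:
    """Hypothetical stand-in, NOT the SpeakerManager added by this PR."""

    def __init__(self) -> None:
        # Maps a speaker id such as "p333" to its embedding vector.
        self._embeddings: Dict[str, List[float]] = {}

    def add_speaker(self, speaker_id: str, embedding: Sequence[float]) -> None:
        self._embeddings[speaker_id] = list(embedding)

    def list_speakers(self) -> List[str]:
        # Sorted so the listing is stable across runs.
        return sorted(self._embeddings)

    def get_embedding(self, speaker_id: str) -> List[float]:
        if speaker_id not in self._embeddings:
            raise KeyError(f"unknown speaker id: {speaker_id!r}")
        return self._embeddings[speaker_id]

    @staticmethod
    def mean_embedding(embeddings: Sequence[Sequence[float]]) -> List[float]:
        # Element-wise mean over the embeddings of several voice samples.
        if not embeddings:
            raise ValueError("need at least one embedding")
        dim = len(embeddings[0])
        if any(len(e) != dim for e in embeddings):
            raise ValueError("all embeddings must have the same dimension")
        return [fmean(column) for column in zip(*embeddings)]


manager = ToySpeakerManager()
manager.add_speaker("p333", [0.0, 1.0, 0.5])
manager.add_speaker("p225", [1.0, 0.0, 0.5])

speakers = manager.list_speakers()        # list the speakers of the model
chosen = manager.get_embedding("p333")    # choose a certain speaker
# Pretend these two vectors came from two uploaded voice samples.
cloned = ToySpeakerManager.mean_embedding([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
```

In a real model the embeddings would come from a speaker encoder rather than being typed in by hand; the averaging step is the only part this sketch tries to mirror.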

To call the multi-speaker models with this PR:

$ tts --model_name tts_models/en/vctk/sc-glow-tts --list_speaker_idxs  # list the available speakers for the model
$ tts --text "This is a multi-speaker model." --out_path output.wav --model_name tts_models/en/vctk/sc-glow-tts --speaker_idx 'p333'  # run the model
$ tts-server --model_name tts_models/en/vctk/sc-glow-tts  # run the demo server

Edresson

Apparently, everything is OK. Great addition; it is much better visually than the old version :)

kirianguiller · 2021-04-28T13:43:43Z

Hi @erogol, did you have any trouble including #5 in it?

erogol · 2021-04-28T13:50:53Z

@kirianguiller what do you mean?

I started with your PR, rebased it, and wrote a bunch more :).

AXKuhta and others added 19 commits April 15, 2021 11:53

This snippet was trying to load the model as the config file

993f4ae

Merge pull request #430 from AXKuhta/main

d0d7eae

Fix the "Run your own TTS and Vocoder models" snippet

handle multi speaker and gst in Synthetizer class

48ae52a

add usage of new Synthetizer class in the chinese model notebook

83aa415

refactoring to allow defining the speaker file externally

25328aa

code styling

47e356c

fix a mistake from rebase

1038fd4

set the default layer size compatible with scglow

d9612a4

add unique param to keep scglow models compatible (they are duplicate symbols ins the character set)

d2fa8ad

remove matrix link

d0786be

update synthesize.py for multi-speaker setting

9bccee9

update argument name in server.py

37cad38

add load_chekpoint to speaker encoder

8b40720

update argument name external_speaker_embedding_dim -> speaker_embedding_dim add inference_noise_scale argument to glow-tts

8764d02

fix the glow-tts in setup_model

09890c7

initial SpeakerManager implementation

ab31381

formating speakers.py

790946f

add unique argument to make_symbols to fix the incompat. issue of the SC-Glow models

04b6881

use SpeakerManager in Synthesizer

e1d960d

erogol requested a review from Edresson April 21, 2021 11:22

add SpeakerManager tests

757dfb9

erogol added the enhancement label Apr 21, 2021

erogol added 2 commits April 21, 2021 13:50

Update README.md

39ceb3f

[ci skip] update CONTRIBUTING.md

0ee3eee

Edresson approved these changes Apr 21, 2021

erogol and others added 5 commits April 22, 2021 12:38

[ci skip] use prenet_dropout by default with Tacotron models

ef37633

server: also listen to ipv6

f5fd7f7

The [::] address will listen to both ipv4/ipv6 addresses.

Merge branch 'dev' of https://github.com/coqui-ai/TTS into dev

a6cd044

fix windows support

c125b71

fix dumb mistake

355e1f4

erogol added 15 commits April 23, 2021 18:04

update server.py

10c988a

remove moved function

f9f3d04

html formatting, enable multi-speaker model on the server with a dropdown menu to select the speaker

ad047c8

load speaker_encoder_ap and compute x_vector directly from the input file in speaker manager

c80d21f

small refactor in server.py

dfa415a

new arguments to synthesize.py for loading speaker encoder and speaker wavs

179722e

let speaker manager compute mean x_vector from multiple wav files

f691957

let synthesizer to pass speaker encoder file paths to speaker manager

7eb0c60

update tests

a878d8f

styling and linting

4cf2113

style and linter fixes

b82daa5

Merge branch 'speaker-manager' of https://github.com/coqui-ai/TTS into speaker-manager

f37b488

remove conflicy noise

b531fa6

enable multi-speaker CoquiTTS models for synthesize.py

2f07160

place holders for sc-glow and hifigan models

6bdd816

erogol requested a review from Edresson April 26, 2021 17:54

erogol marked this pull request as ready for review April 26, 2021 17:54

erogol added 8 commits April 27, 2021 10:27

bug fix

734e6a5

bump up numpy version

8f0519d

move function and remove import

add97cd

remove imports

4719414

create dummy model on the fly

19d9f58

test for synthesize.py

1235e54

remove test

628abfe

fix test

6353e87

erogol merged commit 6353e87 into dev Apr 27, 2021

erogol mentioned this pull request Apr 28, 2021

Dev pr2 : handle multi-speaker and GST in synthetizer class #5

Closed

erogol deleted the speaker-manager branch April 28, 2021 20:00
