Commit 8c1a8b5

Merge pull request #3405 from coqui-ai/studio_speakers
Add studio speakers to open source XTTS!
2 parents 934b87b + 8e6a7cb commit 8c1a8b5

18 files changed

+182
-895
lines changed

Diff for: .github/workflows/api_tests.yml

-53
This file was deleted.

Diff for: .github/workflows/zoo_tests_tortoise.yml

-52
This file was deleted.

Diff for: Makefile

-3
@@ -35,9 +35,6 @@ test_zoo: ## run zoo tests.
 inference_tests: ## run inference tests.
 	nose2 -F -v -B --with-coverage --coverage TTS tests.inference_tests
 
-api_tests: ## run api tests.
-	nose2 -F -v -B --with-coverage --coverage TTS tests.api_tests
-
 data_tests: ## run data tests.
 	nose2 -F -v -B --with-coverage --coverage TTS tests.data_tests

Diff for: README.md

-31
@@ -7,8 +7,6 @@
 - 📣 [🐶Bark](https://github.com/suno-ai/bark) is now available for inference with unconstrained voice cloning. [Docs](https://tts.readthedocs.io/en/dev/models/bark.html)
 - 📣 You can use [~1100 Fairseq models](https://github.com/facebookresearch/fairseq/tree/main/examples/mms) with 🐸TTS.
 - 📣 🐸TTS now supports 🐢Tortoise with faster inference. [Docs](https://tts.readthedocs.io/en/dev/models/tortoise.html)
-- 📣 **Coqui Studio API** is landed on 🐸TTS. - [Example](https://github.com/coqui-ai/TTS/blob/dev/README.md#-python-api)
-- 📣 [**Coqui Studio API**](https://docs.coqui.ai/docs) is live.
 - 📣 Voice generation with prompts - **Prompt to Voice** - is live on [**Coqui Studio**](https://app.coqui.ai/auth/signin)!! - [Blog Post](https://coqui.ai/blog/tts/prompt-to-voice)
 - 📣 Voice generation with fusion - **Voice fusion** - is live on [**Coqui Studio**](https://app.coqui.ai/auth/signin).
 - 📣 Voice cloning is live on [**Coqui Studio**](https://app.coqui.ai/auth/signin).
@@ -253,29 +251,6 @@ tts.tts_with_vc_to_file(
 )
 ```
 
-#### Example using [🐸Coqui Studio](https://coqui.ai) voices.
-You access all of your cloned voices and built-in speakers in [🐸Coqui Studio](https://coqui.ai).
-To do this, you'll need an API token, which you can obtain from the [account page](https://coqui.ai/account).
-After obtaining the API token, you'll need to configure the COQUI_STUDIO_TOKEN environment variable.
-
-Once you have a valid API token in place, the studio speakers will be displayed as distinct models within the list.
-These models will follow the naming convention `coqui_studio/en/<studio_speaker_name>/coqui_studio`
-
-```python
-# XTTS model
-models = TTS(cs_api_model="XTTS").list_models()
-# Init TTS with the target studio speaker
-tts = TTS(model_name="coqui_studio/en/Torcull Diarmuid/coqui_studio", progress_bar=False)
-# Run TTS
-tts.tts_to_file(text="This is a test.", language="en", file_path=OUTPUT_PATH)
-
-# V1 model
-models = TTS(cs_api_model="V1").list_models()
-# Run TTS with emotion and speed control
-# Emotion control only works with V1 model
-tts.tts_to_file(text="This is a test.", file_path=OUTPUT_PATH, emotion="Happy", speed=1.5)
-```
-
 #### Example text to speech using **Fairseq models in ~1100 languages** 🤯.
 For Fairseq models, use the following name format: `tts_models/<lang-iso_code>/fairseq/vits`.
 You can find the language ISO codes [here](https://dl.fbaipublicfiles.com/mms/tts/all-tts-languages.html)
@@ -351,12 +326,6 @@ If you don't specify any models, then it uses LJSpeech based English model.
 $ tts --text "Text for TTS" --pipe_out --out_path output/path/speech.wav | aplay
 ```
 
-- Run TTS and define speed factor to use for 🐸Coqui Studio models, between 0.0 and 2.0:
-
-```
-$ tts --text "Text for TTS" --model_name "coqui_studio/<language>/<dataset>/<model_name>" --speed 1.2 --out_path output/path/speech.wav
-```
-
 - Run a TTS model with its default vocoder model:
 
 ```

Diff for: TTS/.models.json

+5 -4
@@ -3,12 +3,13 @@
 "multilingual": {
 "multi-dataset": {
 "xtts_v2": {
-"description": "XTTS-v2.0.2 by Coqui with 16 languages.",
+"description": "XTTS-v2.0.3 by Coqui with 17 languages.",
 "hf_url": [
 "https://coqui.gateway.scarf.sh/hf-coqui/XTTS-v2/main/model.pth",
 "https://coqui.gateway.scarf.sh/hf-coqui/XTTS-v2/main/config.json",
 "https://coqui.gateway.scarf.sh/hf-coqui/XTTS-v2/main/vocab.json",
-"https://coqui.gateway.scarf.sh/hf-coqui/XTTS-v2/main/hash.md5"
+"https://coqui.gateway.scarf.sh/hf-coqui/XTTS-v2/main/hash.md5",
+"https://coqui.gateway.scarf.sh/hf-coqui/XTTS-v2/main/speakers_xtts.pth"
 ],
 "model_hash": "10f92b55c512af7a8d39d650547a15a7",
 "default_vocoder": null,
@@ -45,7 +46,7 @@
 "hf_url": [
 "https://coqui.gateway.scarf.sh/hf/bark/coarse_2.pt",
 "https://coqui.gateway.scarf.sh/hf/bark/fine_2.pt",
-"https://app.coqui.ai/tts_model/text_2.pt",
+"https://coqui.gateway.scarf.sh/hf/text_2.pt",
 "https://coqui.gateway.scarf.sh/hf/bark/config.json",
 "https://coqui.gateway.scarf.sh/hf/bark/hubert.pt",
 "https://coqui.gateway.scarf.sh/hf/bark/tokenizer.pth"
@@ -270,7 +271,7 @@
 "tortoise-v2": {
 "description": "Tortoise tts model https://github.com/neonbjb/tortoise-tts",
 "github_rls_url": [
-"https://app.coqui.ai/tts_model/autoregressive.pth",
+"https://coqui.gateway.scarf.sh/v0.14.1_models/autoregressive.pth",
 "https://coqui.gateway.scarf.sh/v0.14.1_models/clvp2.pth",
 "https://coqui.gateway.scarf.sh/v0.14.1_models/cvvp.pth",
 "https://coqui.gateway.scarf.sh/v0.14.1_models/diffusion_decoder.pth",
