
🤖 AUTO TTS #696

Closed
wants to merge 132 commits into from

Conversation

loganhart02
Contributor

This is something I made on my personal fork for quickly testing different models on my datasets, and I thought the same style would work well as a sort of recipe API. All the model configs are based either on pre-trained models or on the current recipes; I haven't tuned them much, and I have only tested each model for one epoch since my GPU is busy training something else. I also added a data loader tool that loads a dataset with the proper audio configs, likewise based on the pre-trained model configs. It's pretty easy to add more recipes. Let me know what you think and I can work on adding more; I'm also planning on making a vocoder trainer and a speaker encoder trainer.
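
For context, a rough sketch of how a recipe API like this might be used. The Examples class name appears later in the diff; every constructor argument and method name below is a guess, not the PR's actual signature:

from TTS.auto_tts.complete_recipes import Examples

# All argument and method names below are illustrative assumptions.
trainer_factory = Examples(
    data_path="datasets/LJSpeech-1.1",  # assumed dataset layout
    batch_size=32,
    epochs=1,        # one-epoch smoke test, as described above
    output_path="output/",
)
trainer = trainer_factory.ljspeech_tacotron2()  # pick a pre-baked recipe
trainer.fit()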

erogol and others added 5 commits July 27, 2021 10:08
* Update distribute.py

Simple fix to make distributed training work with python runner file

* Update distribute.py

* Fix linter errors
@CLAassistant

CLAassistant commented Aug 3, 2021

CLA assistant check
All committers have signed the CLA.

@erogol
Member

erogol commented Aug 4, 2021

This is quite impressive 🚀

Do you think we can also create a console endpoint so people can run these pre-defined recipes in the terminal? We can even call it AutoTTS :)

TTS/recipe_api/complete_recipes.py
@loganhart02
Contributor Author

This is quite impressive 🚀

Do you think we can also create a console endpoint so people can run these pre-defined recipes in the terminal? We can even call it AutoTTS :)

AutoTTS sounds way cooler than recipe API! I'm currently working on a Python script that lets you define everything, from the dataset you want to train on to the model you want to use, as command-line args; I'm just debugging the last bit of it. I'm also adding a notebook for people who use Google Colab.

@loganhart02 loganhart02 changed the title recipe_api auto tts Aug 6, 2021
…r sam dataset, also added options so users can turn on forward and location attention when they want to.
@erogol erogol changed the title auto tts 🤖 AUTO TTS Aug 10, 2021
@erogol erogol added the feature implementation Implementation of a new feature label Aug 10, 2021
@loganhart02
Contributor Author

I'm pretty content with this being the first version of AutoTTS for now. Obviously, as this repo grows and more models get implemented I'll keep adding to it, but right now I want to focus on a script to export models to ONNX and get them running on ONNX Runtime and TensorRT (I need to do this for my own project, so I might as well make a script everyone can use), and on batch inference. So I don't know when I'll add more to this; let me know if anything needs to be changed before it can be pushed to main.

@erogol
Member

erogol commented Aug 23, 2021

@loganhart420 I guess you are ready for the review, right? Do you plan any other changes?

@loganhart02
Contributor Author

loganhart02 commented Aug 23, 2021 via email

@erogol
Member

erogol commented Aug 25, 2021

Just a heads up, I am going to start reviewing the PR next week, hopefully after solving a bunch of bugs.

epochs=self.epochs,
)

def ljspeechAutoTts(
Member

Better to use snake_case instead of CamelCase to comply with the rest of the code base.

"""This is trainer for calling complete recipes based off public datasets.
all configs are based off pretrained model configs or the model papers.

usage:
Member

Better to use "Examples:" instead of "usage:".

It'd be nice to add "Args:" and type annotations in the docstrings too.
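
A minimal sketch of what such a docstring could look like (the method and its parameters here are illustrative):

def ljspeech_tacotron2(self, dataset_path: str, epochs: int = 1000) -> Trainer:
    """Train a Tacotron2 recipe on the LJSpeech dataset.

    Args:
        dataset_path (str): Path to the extracted LJSpeech dataset.
        epochs (int): Number of training epochs. Defaults to 1000.

    Returns:
        Trainer: A trainer ready to call fit() on.

    Examples:
        >>> trainer = auto_tts.ljspeech_tacotron2("datasets/LJSpeech-1.1")
        >>> trainer.fit()
    """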


def SamAccentureAutoTts(self, model_name, tacotron2_model_type, forward_attention=False, location_attention=True):
"""Tacotron2 recipes for the sam dataset, based off the pre trained model."""
if model_name == "tacotrn2":
Member

tacotron2

trainer = Trainer(args, config, output_path, c_logger, tb_logger)
return trainer

def vctkAutoTts(self, model_name, speaker_file, glowtts_encoder):
Member

Maybe, rather than defining a different function for each dataset, you could make the dataset an argument to the function. AFAIS, the only difference between the functions is the choice of dataset.
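
For instance, a minimal sketch of that refactor (the two helper methods are hypothetical stand-ins for the per-dataset logic):

def auto_tts(self, dataset_name: str, model_name: str, **model_kwargs):
    """Single entry point: the dataset becomes an argument."""
    dataset_config = self._dataset_config(dataset_name)  # hypothetical helper
    config = self._model_config(model_name, dataset_config, **model_kwargs)  # hypothetical helper
    return Trainer(args, config, output_path, c_logger, tb_logger)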

@@ -0,0 +1,14 @@
from TTS.auto_tts.complete_recipes import Examples
Member

Maybe call this AutoTrainer or just Trainer

def single_speaker_tacotron2_base(
self, audio, dataset, dla=0.25, pla=0.25, ga=5.0, forward_attn=True, location_attn=True
):
config = Tacotron2Config(
Member

How about fetching these configs from the real recipes, when they exist, to reduce the duplication?
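
One way to do that, assuming the recipe or pretrained model ships a JSON config (load_config is part of 🐸TTS; the path below is illustrative):

from TTS.config import load_config

# Reuse an existing recipe/pretrained config instead of re-declaring its values.
config = load_config("recipes/ljspeech/tacotron2-DDC/tacotron2-DDC.json")  # illustrative path
config.batch_size = 32  # then override only what AutoTTS exposes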


def ljspeech_speedy_speech(self, audio, dataset):
"""Base speedy speech model for ljpseech dataset."""
model_args = SpeedySpeechArgs(
Member

SpeedySpeech is tricky to train since it needs precomputed character durations. You either compute them externally or train a Tacotron model first to compute the durations. Maybe SpeedySpeech training should start with the Tacotron training and then compute the durations, but that also sounds like a lot of clutter in the code.
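
A rough outline of that two-stage flow; all three helpers are hypothetical stand-ins, not 🐸TTS functions:

def train_tacotron_for_alignments(dataset_path: str):
    """Stage 1: train a Tacotron model to obtain attention alignments."""
    raise NotImplementedError  # would wrap the Tacotron2 recipe above

def extract_durations(tacotron_model, dataset_path: str):
    """Stage 2: convert the alignments into per-character durations."""
    raise NotImplementedError

def train_speedy_speech(dataset_path: str, durations):
    """Stage 3: train SpeedySpeech on the precomputed durations."""
    raise NotImplementedError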

Contributor Author

Yeah, I saw that when testing it out. For now I'm just going to leave it out until I create a stable and clean experiment function.

@erogol
Member

erogol commented Oct 12, 2021

The PR looks awesome. The abstraction you put on top of the training is great, especially for non-technical users. I've put some comments above.

@erogol
Member

erogol commented Oct 12, 2021

Also, what is the use-case for AutoTTS in your mind? My thinking is that it mainly targets non-technical users who want to train a new model on a custom dataset. That means we don't actually know whether the default values are the best values for their dataset. Say they train the first model with the default values; what should the next step be? Do you have an idea?

@loganhart02
Contributor Author

loganhart02 commented Oct 12, 2021 via email

@erogol
Member

erogol commented Oct 14, 2021

Are you on the Gitter/Matrix channel?

@loganhart02
Contributor Author

loganhart02 commented Oct 14, 2021 via email

@erogol
Member

erogol commented Oct 14, 2021

It'd be nice to get on there so that we can talk in detail :)

@loganhart02 loganhart02 reopened this Oct 27, 2021
Conflicts:
	docs/source/tutorial_for_nervous_beginners.md
	recipes/ljspeech/fast_pitch/train_fast_pitch.py
	recipes/ljspeech/glow_tts/train_glowtts.py
	recipes/ljspeech/hifigan/train_hifigan.py
	recipes/ljspeech/multiband_melgan/train_multiband_melgan.py
	recipes/ljspeech/univnet/train.py
	recipes/ljspeech/vits_tts/train_vits.py
	recipes/ljspeech/wavegrad/train_wavegrad.py
	recipes/ljspeech/wavernn/train_wavernn.py
…ill working on adding more this is what I got so far
@loganhart02
Contributor Author

I put dataset downloaders in this PR. They can be bundled with this PR, but I can also make another PR for them if you want to merge them separately from this one.

self.epochs = epochs
self.manager = ModelManager()

def _single_speaker_from_pretrained(self, model_name):
Member

It'd be easier to use the ModelManager to parse the models from .models.json so we don't need to add models manually.
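
A minimal sketch of that idea, assuming list_models() returns the model names from .models.json (in some versions it only prints them):

from TTS.utils.manage import ModelManager

manager = ModelManager()
# Filter the released models out of .models.json instead of hard-coding them;
# names are assumed to look like "tts_models/<lang>/<dataset>/<model>".
ljspeech_models = [m for m in manager.list_models() if "/ljspeech/" in m]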

self.data_path = data_path
self.dataset_name = dataset

def single_speaker_autotts( # im actually going to change this to autotts_recipes and i'm making a more generic
Member

Even for personal notes

trainer = Trainer(args, config, output_path, c_logger, tb_logger)
return trainer

def multi_speaker_autotts(
Member

You can add the ForwardTTS model too for multi-speaker.
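
For example, a multi-speaker ForwardTTS (FastPitch) setup might look roughly like this; the module path and field names are recalled from the ForwardTTS args and may differ by version:

from TTS.tts.configs.fast_pitch_config import FastPitchConfig  # path may differ by version
from TTS.tts.models.forward_tts import ForwardTTSArgs

model_args = ForwardTTSArgs(
    use_speaker_embedding=True,  # learn a speaker embedding table
    num_speakers=109,            # e.g. VCTK
)
config = FastPitchConfig(model_args=model_args)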



def main():
parser = argparse.ArgumentParser()
Member

For argparsing over default values you can use Coqpit: https://github.com/coqui-ai/coqpit
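
A minimal sketch of Coqpit-based arg parsing (the config fields are illustrative; see the Coqpit README for the exact API):

from dataclasses import dataclass
from coqpit import Coqpit

@dataclass
class AutoTtsArgs(Coqpit):
    dataset: str = "ljspeech"
    model: str = "tacotron2"
    epochs: int = 1000
    mixed_precision: bool = False

if __name__ == "__main__":
    args = AutoTtsArgs()
    args.parse_args()  # overrides the defaults from --dataset, --model, ... flags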

# with each users data so im thinking of a way to have users define their own audio params with this


def pick_glowtts_encoder(encoder_name: str):
Member

Better to parse it from the code to prevent manual editing in the future.
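
For instance, a minimal registry-based sketch, so adding an encoder is a one-line change (the name mapping below is illustrative):

# Map user-facing names to GlowTTS encoder types (mapping is illustrative).
GLOWTTS_ENCODERS = {
    "transformer": "rel_pos_transformer",
    "gated": "gated_conv",
}

def pick_glowtts_encoder(encoder_name: str) -> str:
    try:
        return GLOWTTS_ENCODERS[encoder_name]
    except KeyError:
        raise ValueError(f"Unknown encoder '{encoder_name}'. Choose from {sorted(GLOWTTS_ENCODERS)}")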

@@ -0,0 +1,256 @@
import logging
Member

@erogol erogol Nov 3, 2021

I don't think this needs to be a class. We can define different functions for each dataset.

You can maybe add the datasets here: https://github.com/coqui-ai/TTS/blob/main/TTS/utils/downloaders.py

You should also create separate PRs for changes under 🐸TTS, as we are moving AutoTTS to a new repo.

Contributor Author

Yeah, I made this before you guys added the downloaders, so I'm going to make a new PR just adding functions for the other datasets.

@erogol
Member

erogol commented Nov 3, 2021

I put some comments on your changes. Let me know if you have any questions.

@stale

stale bot commented Dec 3, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

@stale stale bot added the wontfix This will not be worked on but feel free to help. label Dec 3, 2021
@erogol
Member

erogol commented Dec 7, 2021

I'm closing this as it is going to be a separate repo.

@erogol erogol closed this Dec 7, 2021
@king-dahmanus

Hey, just wanted to come here and say that Auto TTS is an actual TTS engine for Android. It basically allows for multilingual TTS by switching between TTS voices, so you might need to change the name.
