VITS HiFiTTS doc (NVIDIA#6288)
* Added VITS documentation
* Typos
* Added experimental note
* Updated tutorial
* Added spectrogram visualization
* Updated ipa_cmudict version

---------

Signed-off-by: Evgeny Shabalin <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: hsiehjackson <[email protected]>
2 people authored and hsiehjackson committed Jun 2, 2023
1 parent 2732e2f commit fa333f0
Showing 4 changed files with 482 additions and 3 deletions.
3 changes: 2 additions & 1 deletion docs/source/tts/data/ngc_models_e2e.csv
@@ -1,2 +1,3 @@
 Locale,Model Name,Dataset,Sampling Rate,#Spk,Phoneme Unit,Model Class,Overview,Checkpoint
-en-US,tts_en_lj_vits,LJSpeech,22050Hz,1,IPA,nemo.collections.tts.models.vits.VitsModel,`tts_en_lj_vits <https://ngc.nvidia.com/catalog/models/nvidia:nemo:tts_en_lj_vits>`_,``https://api.ngc.nvidia.com/v2/models/nvidia/nemo/tts_en_lj_vits/versions/1.13.0/files/vits_ljspeech_fp16_full.nemo``
+en-US,tts_en_lj_vits,LJSpeech,22050Hz,1,IPA,nemo.collections.tts.models.vits.VitsModel,`tts_en_lj_vits <https://ngc.nvidia.com/catalog/models/nvidia:nemo:tts_en_lj_vits>`_,``https://api.ngc.nvidia.com/v2/models/nvidia/nemo/tts_en_lj_vits/versions/1.13.0/files/vits_ljspeech_fp16_full.nemo``
+en-US,tts_en_hifitts_vits,HiFiTTS,44100Hz,10,IPA,nemo.collections.tts.models.vits.VitsModel,`tts_en_hifitts_vits <https://ngc.nvidia.com/catalog/models/nvidia:nemo:tts_en_hifitts_vits>`_,``https://api.ngc.nvidia.com/v2/models/nvidia/nemo/tts_en_hifitts_vits/versions/r1.15.0/files/vits_en_hifitts.nemo``
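
A quick way to sanity-check the table rows is to parse them with Python's standard `csv` module. The sketch below is illustrative only: the header and rows are copied from the CSV hunk above, while the helper names `load_rows` and `sampling_rate_hz` are hypothetical and not part of NeMo or its docs tooling.

```python
import csv
import io

# Header and data rows copied from docs/source/tts/data/ngc_models_e2e.csv
# as they appear after this commit.
CSV_TEXT = """\
Locale,Model Name,Dataset,Sampling Rate,#Spk,Phoneme Unit,Model Class,Overview,Checkpoint
en-US,tts_en_lj_vits,LJSpeech,22050Hz,1,IPA,nemo.collections.tts.models.vits.VitsModel,`tts_en_lj_vits <https://ngc.nvidia.com/catalog/models/nvidia:nemo:tts_en_lj_vits>`_,``https://api.ngc.nvidia.com/v2/models/nvidia/nemo/tts_en_lj_vits/versions/1.13.0/files/vits_ljspeech_fp16_full.nemo``
en-US,tts_en_hifitts_vits,HiFiTTS,44100Hz,10,IPA,nemo.collections.tts.models.vits.VitsModel,`tts_en_hifitts_vits <https://ngc.nvidia.com/catalog/models/nvidia:nemo:tts_en_hifitts_vits>`_,``https://api.ngc.nvidia.com/v2/models/nvidia/nemo/tts_en_hifitts_vits/versions/r1.15.0/files/vits_en_hifitts.nemo``
"""


def load_rows(text):
    """Hypothetical helper: parse the table into one dict per model row."""
    return list(csv.DictReader(io.StringIO(text)))


def sampling_rate_hz(row):
    """Hypothetical helper: turn a value such as '44100Hz' into an integer."""
    value = row["Sampling Rate"]
    if not value.endswith("Hz"):
        raise ValueError(f"unexpected sampling rate format: {value!r}")
    return int(value[:-2])


if __name__ == "__main__":
    for row in load_rows(CSV_TEXT):
        print(row["Model Name"], sampling_rate_hz(row), row["#Spk"])
```

Every row must supply exactly the nine header columns, so a missing or extra comma in a new row would show up here as a `None` value or a stray `None` key in the parsed dict.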
3 changes: 1 addition & 2 deletions examples/tts/vits.py
@@ -14,7 +14,6 @@

 import pytorch_lightning as pl

-from nemo.collections.common.callbacks import LogEpochTimeCallback
 from nemo.collections.tts.models.vits import VitsModel
 from nemo.core.config import hydra_runner
 from nemo.utils.exp_manager import exp_manager
@@ -26,7 +25,7 @@ def main(cfg):
     exp_manager(trainer, cfg.get("exp_manager", None))
     model = VitsModel(cfg=cfg.model, trainer=trainer)

-    trainer.callbacks.extend([pl.callbacks.LearningRateMonitor(), LogEpochTimeCallback()])
+    trainer.callbacks.extend([pl.callbacks.LearningRateMonitor()])
     trainer.fit(model)


7 changes: 7 additions & 0 deletions nemo/collections/tts/models/vits.py
@@ -394,6 +394,13 @@ def list_available_models(cls) -> 'List[PretrainedModelInfo]':
             class_=cls,
         )
         list_of_models.append(model)
+        model = PretrainedModelInfo(
+            pretrained_model_name="tts_en_hifitts_vits",
+            location="https://api.ngc.nvidia.com/v2/models/nvidia/nemo/tts_en_hifitts_vits/versions/r1.15.0/files/vits_en_hifitts.nemo",
+            description="This model is trained on HiFITTS sampled at 44100Hz with and can be used to generate male and female English voices with an American accent.",
+            class_=cls,
+        )
+        list_of_models.append(model)
         return list_of_models

     @typecheck(
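
The registration pattern in this hunk (build an info record, append it to a list, return the list) can be sketched without NeMo installed. The code below is a minimal illustration under stated assumptions: `ModelInfo` is a hypothetical stand-in for NeMo's `PretrainedModelInfo`, not the real class, and `find_model` is an invented helper. The two URLs are copied from this commit's CSV table and diff, and the LJSpeech description is a placeholder because it does not appear in the diff.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelInfo:
    """Hypothetical stand-in for PretrainedModelInfo (not the NeMo class)."""

    pretrained_model_name: str
    location: str
    description: str


def list_available_models() -> List[ModelInfo]:
    """Mirror the append-then-return pattern shown in the diff."""
    list_of_models = []
    model = ModelInfo(
        pretrained_model_name="tts_en_lj_vits",
        location="https://api.ngc.nvidia.com/v2/models/nvidia/nemo/tts_en_lj_vits/versions/1.13.0/files/vits_ljspeech_fp16_full.nemo",
        description="(placeholder: description not shown in this diff)",
    )
    list_of_models.append(model)
    # Entry added by this commit.
    model = ModelInfo(
        pretrained_model_name="tts_en_hifitts_vits",
        location="https://api.ngc.nvidia.com/v2/models/nvidia/nemo/tts_en_hifitts_vits/versions/r1.15.0/files/vits_en_hifitts.nemo",
        description="HiFiTTS, 44100Hz, multi-speaker (summary of the diff's description)",
    )
    list_of_models.append(model)
    return list_of_models


def find_model(name: str) -> Optional[ModelInfo]:
    """Hypothetical lookup: return the entry registered under `name`, if any."""
    for info in list_available_models():
        if info.pretrained_model_name == name:
            return info
    return None


if __name__ == "__main__":
    info = find_model("tts_en_hifitts_vits")
    print(info.location if info else "not registered")
```

In NeMo itself, the registered name is what a caller would pass to `VitsModel.from_pretrained(model_name=...)`; the lookup above only illustrates why the new entry has to be appended before `return list_of_models`.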