Add some info about FastPitch SSL model (NVIDIA#5994)
Signed-off-by: Jocelyn Huang <[email protected]>
redoctopus authored and titu1994 committed Mar 24, 2023
1 parent f4607e7 commit 7082446
Showing 1 changed file with 6 additions and 1 deletion.
7 changes: 6 additions & 1 deletion docs/source/tts/models.rst
@@ -51,6 +51,11 @@ Tacotron 2 consists of a recurrent sequence-to-sequence feature prediction netwo
:scale: 30%


SSL FastPitch
~~~~~~~~~~~~~
This **experimental** version of FastPitch takes content and speaker embeddings generated by an SSL Disentangler and generates mel-spectrograms. The goal is for voice characteristics to come from the speaker embedding while the content of speech is determined by the content embedding. Voice conversion can be done with this model by swapping the speaker embedding input for that of a target speaker while keeping the content embedding the same. More details to come.
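
The embedding swap described above can be illustrated with a toy sketch. This is not the NeMo API: ``toy_disentangle``, ``toy_synthesize``, and ``convert_voice`` are hypothetical stand-ins built on NumPy, and the dimensions and random weights are assumptions chosen only to show the data flow (content embedding from the source utterance, speaker embedding from the target utterance).

```python
import numpy as np

# Hypothetical stand-ins: these toy functions are NOT the NeMo API. They only
# mimic the data flow (content + speaker embeddings in, mel-like array out).

N_MELS = 80       # assumed number of mel bins
CONTENT_DIM = 16  # assumed content embedding size
SPEAKER_DIM = 8   # assumed speaker embedding size
FRAME = 256       # assumed samples per frame

rng = np.random.default_rng(0)
W_CONTENT = rng.standard_normal((CONTENT_DIM, N_MELS))
W_SPEAKER = rng.standard_normal((SPEAKER_DIM, N_MELS))


def toy_disentangle(audio):
    """Toy 'SSL Disentangler': per-frame content plus one speaker vector."""
    n_frames = len(audio) // FRAME
    frames = audio[: n_frames * FRAME].reshape(n_frames, FRAME)
    content = frames[:, :CONTENT_DIM]            # shape (T, CONTENT_DIM)
    speaker = frames.mean(axis=0)[:SPEAKER_DIM]  # shape (SPEAKER_DIM,)
    return content, speaker


def toy_synthesize(content, speaker):
    """Toy 'SSL FastPitch': map both embeddings to a (T, N_MELS) array."""
    # The speaker term is a single vector broadcast across all frames.
    return content @ W_CONTENT + speaker @ W_SPEAKER


def convert_voice(source_audio, target_audio):
    """Keep the source content embedding, swap in the target speaker embedding."""
    content, _ = toy_disentangle(source_audio)
    _, target_speaker = toy_disentangle(target_audio)
    return toy_synthesize(content, target_speaker)


source_audio = rng.standard_normal(FRAME * 10)  # 10 frames of fake audio
target_audio = rng.standard_normal(FRAME * 6)   # 6 frames of fake audio
converted_mel = convert_voice(source_audio, target_audio)
print(converted_mel.shape)
```

The output length follows the source utterance because only the content embedding varies over time; the target utterance contributes a single speaker vector.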


Vocoders
--------

@@ -110,4 +115,4 @@ References
.. bibliography:: tts_all.bib
:style: plain
:labelprefix: TTS-MODELS
:keyprefix: tts-models-
:keyprefix: tts-models-
