
[TTS] Implement new vocoder dataset #6670

Merged (9 commits from tts_dataset into main on Jun 2, 2023)

Conversation

rlangman (Collaborator) commented May 17, 2023

What does this PR do?

Implement a new VocoderDataset to support multi-manifest training, following the same design pattern as the new TextToSpeechDataset (#6575).

Collection: [TTS]

Changelog

Usage

Example of training on two datasets, sampling 10% of training examples from LJSpeech and 90% from VCTK:

```bash
/home/rlangman/miniconda3/envs/nemo_dev/bin/python /home/rlangman/Code/NeMo/examples/tts/hifigan.py \
    --config-path=/home/rlangman/Code/NeMo/examples/tts/conf/hifigan \
    --config-dir=/home/rlangman/Code/NeMo/examples/tts/conf \
    --config-name=hifigan_22050_data.yaml \
    sample=22050 \
    max_epochs=1000 \
    weighted_sample_steps=1000 \
    batch_size=16 \
    trainer.devices=1 \
    log_dir=/home/rlangman/exps/ljspeech_vctk/HifiGan/test/samples \
    exp_manager.exp_dir=/home/rlangman/exps/ljspeech_vctk \
    +exp_manager.version=test \
    +train_ds_meta.ljspeech.manifest_path=/home/rlangman/fs_mount/data/ljspeech/train_manifest.json \
    +train_ds_meta.ljspeech.audio_dir=/home/rlangman/fs_mount/data/ljspeech/audio \
    +train_ds_meta.ljspeech.sample_weight=0.1 \
    +val_ds_meta.ljspeech.manifest_path=/home/rlangman/fs_mount/data/ljspeech/val_manifest.json \
    +val_ds_meta.ljspeech.audio_dir=/home/rlangman/fs_mount/data/ljspeech/audio \
    +log_ds_meta.ljspeech._target_=nemo.collections.tts.data.vocoder_dataset.DatasetMeta \
    +log_ds_meta.ljspeech.manifest_path=/home/rlangman/fs_mount/data/ljspeech/log_manifest.json \
    +log_ds_meta.ljspeech.audio_dir=/home/rlangman/fs_mount/data/ljspeech/audio \
    +train_ds_meta.vctk.manifest_path=/home/rlangman/fs_mount/data/vctk/train_manifest.json \
    +train_ds_meta.vctk.audio_dir=/home/rlangman/fs_mount/data/vctk/audio \
    +train_ds_meta.vctk.sample_weight=0.9 \
    +val_ds_meta.vctk.manifest_path=/home/rlangman/fs_mount/data/vctk/val_manifest.json \
    +val_ds_meta.vctk.audio_dir=/home/rlangman/fs_mount/data/vctk/audio \
    +log_ds_meta.vctk._target_=nemo.collections.tts.data.vocoder_dataset.DatasetMeta \
    +log_ds_meta.vctk.manifest_path=/home/rlangman/fs_mount/data/vctk/log_manifest.json \
    +log_ds_meta.vctk.audio_dir=/home/rlangman/fs_mount/data/vctk/audio \
    exp_manager.create_wandb_logger=true \
    +exp_manager.wandb_logger_kwargs.project=hifigan_ljspeech_vctk \
    +exp_manager.wandb_logger_kwargs.name=test
```
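
For readability, the dataset metadata overrides above can be thought of as building one DatasetMeta entry per dataset. The sketch below is illustrative only: the import path comes from the `_target_` override and the field names from the CLI flags, but the exact constructor signature may differ.

```python
from nemo.collections.tts.data.vocoder_dataset import DatasetMeta

# Two training datasets; sample_weight controls how often each dataset is drawn
# from when weighted sampling is enabled (10% LJSpeech, 90% VCTK here).
train_ds_meta = {
    "ljspeech": DatasetMeta(
        manifest_path="/home/rlangman/fs_mount/data/ljspeech/train_manifest.json",
        audio_dir="/home/rlangman/fs_mount/data/ljspeech/audio",
        sample_weight=0.1,
    ),
    "vctk": DatasetMeta(
        manifest_path="/home/rlangman/fs_mount/data/vctk/train_manifest.json",
        audio_dir="/home/rlangman/fs_mount/data/vctk/audio",
        sample_weight=0.9,
    ),
}
```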

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

github-actions bot added the ASR, core (Changes to NeMo Core), and TTS labels on May 17, 2023
Base automatically changed from tts_callback to main May 18, 2023 16:39
@rlangman rlangman force-pushed the tts_dataset branch 5 times, most recently from 835dcc9 to f9988d0 Compare May 25, 2023 22:20
```yaml
model:
  max_epochs: ${max_epochs}
  steps_per_epoch: ${weighted_sample_steps}
```
Collaborator commented:

What does weighted_sample_steps mean?

rlangman (Collaborator, Author) replied:

Dataloader docstring: Optional int, If provided, then data will be sampled (with replacement) based on the sample weights provided in the dataset metadata. If None, then sample weights will be ignored.

In other words, with weighted sampling (and with large datasets in general) the standard definition of an epoch no longer works so you refer to an epoch as a fixed number of training steps.

The name is not as clear as I would like, but alternatives I thought of were either too long (like weighted_sampling_steps_per_epoch) or required breaking it into 2 arguments like:

```yaml
WeightedSamplingConfig:
  use_weighted_sampling: true
  steps_per_epoch: 1000
```

Where steps_per_epoch does nothing if use_weighted_sampling: false.
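
To make the behavior concrete, here is a minimal PyTorch sketch of fixed-length epochs with weighted sampling; it is only an illustration (a toy dataset and placeholder weights), not the VocoderDataset implementation:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Toy stand-in for two concatenated datasets, with per-example weights derived
# from per-dataset sample_weight values (e.g. 0.1 vs 0.9).
dataset = TensorDataset(torch.arange(100).float())
sample_weights = torch.cat([torch.full((50,), 0.1), torch.full((50,), 0.9)])

steps_per_epoch = 1000  # placeholder for weighted_sample_steps
batch_size = 16

# Sampling with replacement: an "epoch" is exactly steps_per_epoch batches,
# regardless of how many examples the underlying datasets contain.
sampler = WeightedRandomSampler(
    weights=sample_weights,
    num_samples=steps_per_epoch * batch_size,
    replacement=True,
)
loader = DataLoader(dataset, batch_size=batch_size, sampler=sampler)
assert len(loader) == steps_per_epoch
```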

racoiaws (Collaborator) replied on Jun 2, 2023:

Thanks

> In other words, with weighted sampling (and with large datasets in general) the standard definition of an epoch no longer works so you refer to an epoch as a fixed number of training steps.

Can't really agree; I always thought of an epoch as something dependent on the size of a specific dataset. The number of training examples in the dataset still works, with the caveat that some samples might not have been utilized in a given epoch run.

(Not a blocker)

> but alternatives I thought of were either too long (like weighted_sampling_steps_per_epoch)

True, this is not something that can be expressed concisely. But I see no harm in a long-named argument. I would even suggest steps_per_epoch_if_using_weighted_sampling.

Do you think you could rename this to a longer and more descriptive version?

rlangman (Collaborator, Author) replied:

> Can't really agree; I always thought of an epoch as something dependent on the size of a specific dataset. The number of training examples in the dataset still works, with the caveat that some samples might not have been utilized in a given epoch run.

The main issue is that, as datasets become large, having an epoch equal to the dataset size means it may take days to complete one epoch, which is impractical when you only run validation and save checkpoints at the end of an epoch. This is a problem right now with multi-language ASR training, for which NeMo provides no good solution.

So while tracking epochs is nice, you usually want things like checkpointing to happen at fixed intervals (e.g. every few hours).
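
As a rough illustration of what step-based scheduling looks like (plain PyTorch Lightning with placeholder values, not the NeMo/exp_manager wiring):

```python
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint

# Save a checkpoint every 1000 training steps instead of once per epoch, so the
# checkpointing cadence stays fixed no matter how long an "epoch" is.
checkpoint_callback = ModelCheckpoint(every_n_train_steps=1000, save_top_k=-1)

trainer = pl.Trainer(
    devices=1,
    max_steps=100_000,        # budget training in steps rather than epochs
    val_check_interval=1000,  # run validation every 1000 training batches
    callbacks=[checkpoint_callback],
)
# trainer.fit(model, datamodule=dm)  # `model` and `dm` are placeholders
```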

> Do you think you could rename this to a longer and more descriptive version?

Sure.

rlangman (Collaborator, Author) commented on Jun 2, 2023:

I also tried adding an alternate config which could optionally set epoch_size = dataset_size (even if not all training examples are seen each epoch), but the way that an epoch is defined for distributed training is relatively complicated (https://github.com/NVIDIA/NeMo/blob/main/nemo/core/optim/lr_scheduler.py#L934-L936). Getting it wrong would create a mismatch with the LR scheduler, so I would rather not support it for now unless there is an urgent need.
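
For context, a back-of-the-envelope sketch of why this is easy to get wrong: the number of optimizer steps per epoch depends on the global batch size, which folds in the per-device batch size, the number of devices, and gradient accumulation. The names and formula below are an approximation for illustration only, not the exact computation in lr_scheduler.py:

```python
import math

def approx_steps_per_epoch(
    num_samples: int,          # total examples across all manifests
    micro_batch_size: int,     # batch size per device
    num_devices: int,          # data-parallel world size
    accumulate_grad_batches: int = 1,
    drop_last: bool = True,
) -> int:
    # Each optimizer step consumes micro_batch_size * num_devices *
    # accumulate_grad_batches examples, so the epoch length the LR scheduler
    # sees shrinks as the global batch size grows.
    global_batch_size = micro_batch_size * num_devices * accumulate_grad_batches
    rounding = math.floor if drop_last else math.ceil
    return max(1, rounding(num_samples / global_batch_size))

# Example: ~13k clips, batch 16 on 8 devices -> roughly 101 scheduler steps per epoch.
print(approx_steps_per_epoch(num_samples=13_000, micro_batch_size=16, num_devices=8))
```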

racoiaws (Collaborator) commented on Jun 2, 2023

Apologies if this is taking too long; after resolving the two threads above, this PR should be good to go.

@rlangman rlangman force-pushed the tts_dataset branch 2 times, most recently from 2ad1665 to 7402a7a Compare June 2, 2023 16:45
@rlangman rlangman force-pushed the tts_dataset branch 2 times, most recently from 13c9664 to c6b7c36 Compare June 2, 2023 17:23
@rlangman rlangman merged commit a420f90 into main Jun 2, 2023
11 checks passed
@rlangman rlangman deleted the tts_dataset branch June 2, 2023 21:24
KunalDhawan added a commit that referenced this pull request Jul 20, 2023
* peft eval directly from ckpt (#6785)

* update to load from ckpt

Signed-off-by: arendu <[email protected]>

* update

Signed-off-by: arendu <[email protected]>

* load ckpt peft model

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update style

Signed-off-by: arendu <[email protected]>

---------

Signed-off-by: arendu <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Add Frame-VAD examples and utils (#6463)

* add model, dataset, necessary utils and tests

Signed-off-by: stevehuang52 <[email protected]>

* fix tarred data

Signed-off-by: stevehuang52 <[email protected]>

* fix typo

Signed-off-by: stevehuang52 <[email protected]>

* add fvad examples and update utils

Signed-off-by: stevehuang52 <[email protected]>

* add copyright

Signed-off-by: stevehuang52 <[email protected]>

* refactor and add tests

Signed-off-by: stevehuang52 <[email protected]>

* update dataset

Signed-off-by: stevehuang52 <[email protected]>

* update test

Signed-off-by: stevehuang52 <[email protected]>

* refactor

Signed-off-by: stevehuang52 <[email protected]>

* refactor

Signed-off-by: stevehuang52 <[email protected]>

* fix typos

Signed-off-by: stevehuang52 <[email protected]>

---------

Signed-off-by: stevehuang52 <[email protected]>
Co-authored-by: fayejf <[email protected]>
Co-authored-by: Taejin Park <[email protected]>

* [TTS][zh] refine hardcoded lowercase for ASCII letters. (#6781)

Signed-off-by: Xuesong Yang <[email protected]>

* Spellchecking ASR customization model (#6179)

* bug fixes

Signed-off-by: Alexandra Antonova <[email protected]>

* fix bugs, add preparation and evaluation scripts, add readme

Signed-off-by: Alexandra Antonova <[email protected]>

* small fixes

Signed-off-by: Alexandra Antonova <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add real coverage calculation, small fixes, more debug information

Signed-off-by: Alexandra Antonova <[email protected]>

* add option to pass a filelist and output folder - to handle inference from multiple input files

Signed-off-by: Alexandra Antonova <[email protected]>

* added preprocessing for yago wikipedia articles - finding yago entities and their subphrases

Signed-off-by: Alexandra Antonova <[email protected]>

* yago wiki preprocessing, sampling, pseudonormalization

Signed-off-by: Alexandra Antonova <[email protected]>

* more scripts for preparation of training examples

Signed-off-by: Alexandra Antonova <[email protected]>

* bug fixes

Signed-off-by: Alexandra Antonova <[email protected]>

* add some alphabet checks

Signed-off-by: Alexandra Antonova <[email protected]>

* add bert on subwords, concatenate it to bert on characters

Signed-off-by: Alexandra Antonova <[email protected]>

* add calculation of character_pos_to_subword_pos

Signed-off-by: Alexandra Antonova <[email protected]>

* bug fix

Signed-off-by: Alexandra Antonova <[email protected]>

* bug fix

Signed-off-by: Alexandra Antonova <[email protected]>

* pdb

Signed-off-by: Alexandra Antonova <[email protected]>

* tensor join bug fix

Signed-off-by: Alexandra Antonova <[email protected]>

* double hidden_size in classifier

Signed-off-by: Alexandra Antonova <[email protected]>

* pdb

Signed-off-by: Alexandra Antonova <[email protected]>

* default index value 0 instead of -1 because index cannot be negative

Signed-off-by: Alexandra Antonova <[email protected]>

* pad index value 0 instead of -1 because index cannot be negative

Signed-off-by: Alexandra Antonova <[email protected]>

* remove pdb

Signed-off-by: Alexandra Antonova <[email protected]>

* fix bugs, add creation of tarred dataset

Signed-off-by: Alexandra Antonova <[email protected]>

* add possibility to change sequence len at inference

Signed-off-by: Alexandra Antonova <[email protected]>

* change sampling of dummy candidates at inference, add candidate info file

Signed-off-by: Alexandra Antonova <[email protected]>

* fix import

Signed-off-by: Alexandra Antonova <[email protected]>

* fix bug

Signed-off-by: Alexandra Antonova <[email protected]>

* update transcription now uses info

Signed-off-by: Alexandra Antonova <[email protected]>

* write path

Signed-off-by: Alexandra Antonova <[email protected]>

* 1. add tarred dataset support(untested). 2. fix bug with ban_ngrams in indexing

Signed-off-by: Alexandra Antonova <[email protected]>

* skip short_sent if no real candidates

Signed-off-by: Alexandra Antonova <[email protected]>

* fix import

Signed-off-by: Alexandra Antonova <[email protected]>

* add braceexpand

Signed-off-by: Alexandra Antonova <[email protected]>

* fixes

Signed-off-by: Alexandra Antonova <[email protected]>

* fix bug

Signed-off-by: Alexandra Antonova <[email protected]>

* fix bug

Signed-off-by: Alexandra Antonova <[email protected]>

* fix bug in np.ones

Signed-off-by: Alexandra Antonova <[email protected]>

* fix bug in collate

Signed-off-by: Alexandra Antonova <[email protected]>

* change tensor type to long because of error in torch.gather

Signed-off-by: Alexandra Antonova <[email protected]>

* fix for empty spans tensor

Signed-off-by: Alexandra Antonova <[email protected]>

* same fixes in _collate_fn for tarred dataset

Signed-off-by: Alexandra Antonova <[email protected]>

* fix bug from previous commit

Signed-off-by: Alexandra Antonova <[email protected]>

* change int types to be shorter to minimize tar size

Signed-off-by: Alexandra Antonova <[email protected]>

* refactoring of datasets and inference

Signed-off-by: Alexandra Antonova <[email protected]>

* bug fix

Signed-off-by: Alexandra Antonova <[email protected]>

* bug fix

Signed-off-by: Alexandra Antonova <[email protected]>

* bug fix

Signed-off-by: Alexandra Antonova <[email protected]>

* tar by 100k examples, small fixes

Signed-off-by: Alexandra Antonova <[email protected]>

* small fixes, add analytics script

Signed-off-by: Alexandra Antonova <[email protected]>

* Add functions for dynamic programming comparison to get best path by ngrams

Signed-off-by: Alexandra Antonova <[email protected]>

* fixes

Signed-off-by: Alexandra Antonova <[email protected]>

* small fix

Signed-off-by: Alexandra Antonova <[email protected]>

* fixes to support testing on SPGISpeech

Signed-off-by: Alexandra Antonova <[email protected]>

* add preprocessing for userlibri

Signed-off-by: Alexandra Antonova <[email protected]>

* some refactoring

Signed-off-by: Alexandra Antonova <[email protected]>

* some refactoring

Signed-off-by: Alexandra Antonova <[email protected]>

* move some functions to utils to reuse from other project

Signed-off-by: Alexandra Antonova <[email protected]>

* move some functions to utils to reuse from other project

Signed-off-by: Alexandra Antonova <[email protected]>

* move some functions to utils to reuse from other project

Signed-off-by: Alexandra Antonova <[email protected]>

* small refactoring before pr. Add bash-scripts reproducing evaluation

Signed-off-by: Alexandra Antonova <[email protected]>

* style fix

Signed-off-by: Alexandra Antonova <[email protected]>

* small fixes in inference

Signed-off-by: Alexandra Antonova <[email protected]>

* bug fix - didn't move window on last symbol

Signed-off-by: Alexandra Antonova <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix bug - shuffle was before truncation of sorted candidates

Signed-off-by: Alexandra Antonova <[email protected]>

* refactoring, fix some bugs

Signed-off-by: Alexandra Antonova <[email protected]>

* various fixes. Add word_indices at inference

Signed-off-by: Alexandra Antonova <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add candidate positions

Signed-off-by: Alexandra Antonova <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Move data preparation and evaluation to other repo

Signed-off-by: Alexandra Antonova <[email protected]>

* add infer_reproduce_paper. Refactoring

Signed-off-by: Alexandra Antonova <[email protected]>

* refactor inference using fragment indices

Signed-off-by: Alexandra Antonova <[email protected]>

* add some helper functions

Signed-off-by: Alexandra Antonova <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix bug with parameters order

Signed-off-by: Alexandra Antonova <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix bugs

Signed-off-by: Alexandra Antonova <[email protected]>

* refactoring, fix bug

Signed-off-by: Alexandra Antonova <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add multiple variants of adjusting start/end positions

Signed-off-by: Alexandra Antonova <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more fixes

Signed-off-by: Alexandra Antonova <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add unit tests, other fixes

Signed-off-by: Alexandra Antonova <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

Signed-off-by: Alexandra Antonova <[email protected]>

* fix CodeQl warnings

Signed-off-by: Alexandra Antonova <[email protected]>

* add script for full inference pipeline, refactoring

Signed-off-by: Alexandra Antonova <[email protected]>

* add tutorial

Signed-off-by: Alexandra Antonova <[email protected]>

* take example data from HuggingFace

Signed-off-by: Alexandra Antonova <[email protected]>

* add docs

Signed-off-by: Alexandra Antonova <[email protected]>

* fix comment

Signed-off-by: Alexandra Antonova <[email protected]>

* fix bug

Signed-off-by: Alexandra Antonova <[email protected]>

* small fixes for PR

Signed-off-by: Alexandra Antonova <[email protected]>

* add some more tests

Signed-off-by: Alexandra Antonova <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* try to fix tests adding with_downloads

Signed-off-by: Alexandra Antonova <[email protected]>

* skip tests with tokenizer download

Signed-off-by: Alexandra Antonova <[email protected]>

---------

Signed-off-by: Alexandra Antonova <[email protected]>
Signed-off-by: Alexandra Antonova <[email protected]>
Co-authored-by: Alexandra Antonova <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* [TTS] Implement new vocoder dataset (#6670)

* [TTS] Implement new vocoder dataset

Signed-off-by: Ryan <[email protected]>

* [TTS] Redo config structure, minor fixes

Signed-off-by: Ryan <[email protected]>

* [TTS] Fix alignment logging

Signed-off-by: Ryan <[email protected]>

* [TTS] Fix script usage example

Signed-off-by: Ryan <[email protected]>

* [TTS] Fixed epoch LR scheduling

Signed-off-by: Ryan <[email protected]>

* [TTS] Support .nemo checkpoint in FP callback

Signed-off-by: Ryan <[email protected]>

* [TTS] Remove align interpolator

Signed-off-by: Ryan <[email protected]>

* [TTS] Remove HiFi-GAN defaults list interpolation

Signed-off-by: Ryan <[email protected]>

* [TTS] Rename weighted_sample_steps to weighted_sampling_steps_per_epoch

Signed-off-by: Ryan <[email protected]>

---------

Signed-off-by: Ryan <[email protected]>

* GPT inference long context (#6687)

* deb infer

Signed-off-by: Evelina <[email protected]>

* deb infer

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* clean up

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* dont do maxlen trunc for non abs pos emb

Signed-off-by: Evelina <[email protected]>

* dont do maxlen trunc for non abs pos emb

Signed-off-by: Evelina <[email protected]>

* convert for training only

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add eval test, add save .nemo for sft model

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* jenkins format fix

Signed-off-by: Evelina <[email protected]>

* update jenkins

Signed-off-by: Evelina <[email protected]>

* update jenkins

Signed-off-by: Evelina <[email protected]>

* fix jenkins

Signed-off-by: Evelina <[email protected]>

* remove test, ci timeout

Signed-off-by: Evelina <[email protected]>

* fix for m_gpt_eval.py

Signed-off-by: Evelina <[email protected]>

* jenkins test

Signed-off-by: Evelina <[email protected]>

* fix gpt_eval with sft model

Signed-off-by: Evelina <[email protected]>

* revert jenkins

Signed-off-by: Evelina <[email protected]>

* keep float conversion for model.generate()

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix inference dtype

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* TDT model pull request (#6536)

* TDT model pull request, initial draft

Signed-off-by: Hainan Xu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* TDT PR WIP

Signed-off-by: Hainan Xu <[email protected]>

* TDT PR WIP

Signed-off-by: Hainan Xu <[email protected]>

* TDT PR WIP

Signed-off-by: Hainan Xu <[email protected]>

* TDT WIP

Signed-off-by: Hainan Xu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* TDT WIP

Signed-off-by: Hainan Xu <[email protected]>

* TDT WIP

Signed-off-by: Hainan Xu <[email protected]>

* TDT WIP

Signed-off-by: Hainan Xu <[email protected]>

* TDT WIP

Signed-off-by: Hainan Xu <[email protected]>

* TDT WIP

Signed-off-by: Hainan Xu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* TDT WIP

Signed-off-by: Hainan Xu <[email protected]>

* TDT WIP

Signed-off-by: Hainan Xu <[email protected]>

* TDT WIP

Signed-off-by: Hainan Xu <[email protected]>

* TDT WIP

Signed-off-by: Hainan Xu <[email protected]>

* TDT WIP

Signed-off-by: Hainan Xu <[email protected]>

* addressed some review comments, part1

Signed-off-by: Hainan Xu <[email protected]>

* addressed some review comments, part1, one line fix

Signed-off-by: Hainan Xu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add tests for comparing TDT alphas with pytorch VS kernel computation

Signed-off-by: Hainan Xu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add tests for comparing multiblank alphas with pytorch VS kernel computation

Signed-off-by: Hainan Xu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add tests for fixed case computation for TDT

Signed-off-by: Hainan Xu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add more comments for greedy-batch decoding for TDT

Signed-off-by: Hainan Xu <[email protected]>

* include config for TDT model with stateless decoders

Signed-off-by: Hainan Xu <[email protected]>

* add reference to TDT in Readme

Signed-off-by: Hainan Xu <[email protected]>

* slight modification of config file comments

Signed-off-by: Hainan Xu <[email protected]>

* addressed more comments

Signed-off-by: Hainan Xu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more detailed comments for tdt kernel

Signed-off-by: Hainan Xu <[email protected]>

* one line fix

Signed-off-by: Hainan Xu <[email protected]>

* fixed small bug that results in test fails for rnnt_decoding

Signed-off-by: Hainan Xu <[email protected]>

* fixed small bug that results in test fails for rnnt_decoding

Signed-off-by: Hainan Xu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed small bug that results in test fails for rnnt_decoding

Signed-off-by: Hainan Xu <[email protected]>

* remove unused import

Signed-off-by: Hainan Xu <[email protected]>

---------

Signed-off-by: Hainan Xu <[email protected]>
Co-authored-by: Hainan Xu <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Fix get_parameters when using main params optimizer (#6764) (#6787)

* fix get param



* change name



---------

Signed-off-by: ericharper <[email protected]>
Co-authored-by: Eric Harper <[email protected]>

* Lddl bert (#6761) (#6790)

* initial POC for LDDL Bert

* Finish LDDL POC

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* address comments

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix merge head

* resolving merge

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add support for  val/test loaders

* change to new LDDL class + add winding

* fix logging level

* fix winding

* test fix

* fixes to winding

* add file system

* add preemption optimizations

* more logging

* more prints

* better logging

* asfsf

* add barrier

* removing prints

* working with mb lddl loader

* final changes

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update requirements file with LDDL



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* revert adding to requirements

---------

Signed-off-by: wdykas <[email protected]>
Co-authored-by: wdykas <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Eric Harper <[email protected]>

* Fix check (#6798) (#6800)

Signed-off-by: MaximumEntropy <[email protected]>
Co-authored-by: Sandeep Subramanian <[email protected]>

* Fix validation with drop_last=False (#6704)

Signed-off-by: Mikołaj Błaż <[email protected]>
Co-authored-by: Eric Harper <[email protected]>

* SDE unt lvl comparison (#6669)

Added a visual utterance-level comparison of two ASR models

Signed-off-by: George <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Debug Transformer Engine FP8 support with Megatron-core infrastructure (#6791)

* Construct FP8 amax reduction group

Signed-off-by: Tim Moon <[email protected]>

* Update Megatron-core version in CI

Signed-off-by: Tim Moon <[email protected]>

---------

Signed-off-by: Tim Moon <[email protected]>
Co-authored-by: Tim Moon <[email protected]>
Co-authored-by: Tim Moon <[email protected]>

* Lora/PEFT training script CI test (#6664)

* new lora test

Signed-off-by: arendu <[email protected]>

* updates

Signed-off-by: arendu <[email protected]>

* check for chat

Signed-off-by: arendu <[email protected]>

* update

Signed-off-by: arendu <[email protected]>

* update

Signed-off-by: arendu <[email protected]>

* small train set

Signed-off-by: arendu <[email protected]>

* update

Signed-off-by: arendu <[email protected]>

* precision change

Signed-off-by: arendu <[email protected]>

* fixed typo in paths

Signed-off-by: arendu <[email protected]>

* full data with limit val batches

Signed-off-by: arendu <[email protected]>

* tp2 instead of pp2

Signed-off-by: arendu <[email protected]>

* tp2 instead of pp2

Signed-off-by: arendu <[email protected]>

---------

Signed-off-by: arendu <[email protected]>
Signed-off-by: Adi Renduchintala <[email protected]>

* change branch to main, small fix (#6803)

Signed-off-by: Alexandra Antonova <[email protected]>

* add call to p2p overlap (#6779) (#6786)

* add call to p2p overlap



* update Jenkins for test



---------

Signed-off-by: Abhinav Khattar <[email protected]>
Signed-off-by: Eric Harper <[email protected]>
Co-authored-by: Abhinav Khattar <[email protected]>
Co-authored-by: Eric Harper <[email protected]>

* fixed  decor to show messages only when the wrapped object is called. (#6793)

Signed-off-by: Xuesong Yang <[email protected]>

* Bug fix for reset_sequence_parallel_args (#6802) (#6805)

Signed-off-by: Markel Sanz Ausin <[email protected]>
Co-authored-by: Markel Sanz Ausin <[email protected]>

* text_generation_utils memory reduction if no logprob needed (#6773)

* repro for gpt eval mp mem issue

Signed-off-by: Yang Zhang <[email protected]>

* add print statements for memory allocation

Signed-off-by: Yang Zhang <[email protected]>

* adjusted hot fix that prevents softmax on the entire output embedding; now memory is bottlenecked by the attention softmax, which needs to be solved with FA or long attention

Signed-off-by: Yang Zhang <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* using compute_logprob to configure inference

Signed-off-by: Yang Zhang <[email protected]>

* enable compute logprob for peft

Signed-off-by: Yang Zhang <[email protected]>

* remove print statements

Signed-off-by: Yang Zhang <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix ci

Signed-off-by: Yang Zhang <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added docstrings

Signed-off-by: Yang Zhang <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing config

Signed-off-by: Yang Zhang <[email protected]>

* remove truncate prompt length feature

Signed-off-by: Yang Zhang <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* tensor before all gather needs to be contiguous

Signed-off-by: Yang Zhang <[email protected]>

---------

Signed-off-by: Yang Zhang <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Sandeep Subramanian <[email protected]>

* Fixed bug in MaskedSpecAug that overestimates samples. (#6775)

Signed-off-by: tbartley94 <[email protected]>

* update core version (#6817) (#6819)

Signed-off-by: Abhinav Khattar <[email protected]>
Co-authored-by: Abhinav Khattar <[email protected]>

* lora pp2 (#6818)

Signed-off-by: arendu <[email protected]>

* Add optional index mapping dir in mmap text datasets (#6683)

If datasets are stored on a read-only medium, index files cannot be created next to them, and an alternative directory must be specified for the index mapping files.

This commit adds an optional `index_mapping_dir` to the constructors. Unit tests are also added.



[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Update path formatting for relative paths

Signed-off-by: Greg Heinrich <[email protected]>

* Add inference kv cache support for transformer TE path (#6627)

* Add kv cache support for transformer TE path

Signed-off-by: Yen-Shi Wang <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Mark get_data_parallel_group as WAR

Signed-off-by: Yen-Shi Wang <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Initialize process group for FP8 training

Signed-off-by: Tim Moon <[email protected]>

* Update Megatron GPT eval script for non-FP8 path

Signed-off-by: Yen-Shi Wang <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yen-Shi Wang <[email protected]>
Signed-off-by: Tim Moon <[email protected]>
Signed-off-by: Yen-Shi Wang <[email protected]>
Co-authored-by: Yen-Shi Wang <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Tim Moon <[email protected]>
Co-authored-by: Tim Moon <[email protected]>
Co-authored-by: Eric Harper <[email protected]>

* Support large inputs to Conformer and Fast Conformer (#6556)

* initial commit

Signed-off-by: Dima Rekesh <[email protected]>

* typos

Signed-off-by: Dima Rekesh <[email protected]>

* tweaks to padding

Signed-off-by: Dima Rekesh <[email protected]>

* comments

Signed-off-by: Dima Rekesh <[email protected]>

* attempt at first working version

Signed-off-by: Dima Rekesh <[email protected]>

* typos and fixed p calculation

Signed-off-by: Dima Rekesh <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing merge artifacts

Signed-off-by: Dima Rekesh <[email protected]>

* typo

Signed-off-by: Dima Rekesh <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing unnecessary imports

Signed-off-by: Dima Rekesh <[email protected]>

* if batch split succeeded no need to conv again

Signed-off-by: Dima Rekesh <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adding channel wise split

Signed-off-by: Dima Rekesh <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adding reference to pytorch issue 80020

Signed-off-by: Dima Rekesh <[email protected]>

* removing time chunking methods

Signed-off-by: Dima Rekesh <[email protected]>

* accounting for the actual self._stride value

Signed-off-by: Dima Rekesh <[email protected]>

* limiting the fix to dw_striding subsampling

Signed-off-by: Dima Rekesh <[email protected]>

* renamed methods

Signed-off-by: Dima Rekesh <[email protected]>

* one more accounting for the actual self._stride value

Signed-off-by: Dima Rekesh <[email protected]>

* support for causal convs

Signed-off-by: Dima Rekesh <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* option to set conv chunking size manually

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixing imports

* subsampling test

Signed-off-by: Dima Rekesh <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rename variable

Signed-off-by: Dima Rekesh <[email protected]>

* imports in test

Signed-off-by: Dima Rekesh <[email protected]>

* more runtime checks

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* a more careful test

Signed-off-by: Dima Rekesh <[email protected]>

* bug in causal

Signed-off-by: Dima Rekesh <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix in causal

Signed-off-by: Dima Rekesh <[email protected]>

* change_conv_chunking_factor methods

Signed-off-by: Dima Rekesh <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* renamed methods

Signed-off-by: Dima Rekesh <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disabling chunking by default

Signed-off-by: Dima Rekesh <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* typo

Signed-off-by: Dima Rekesh <[email protected]>

* changing default chunking to auto

Signed-off-by: Dima Rekesh <[email protected]>

* only split if needed

Signed-off-by: Dima Rekesh <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* only split if needed

Signed-off-by: Dima Rekesh <[email protected]>

---------

Signed-off-by: Dima Rekesh <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* sharded_manifests updated docs (#6833)

Signed-off-by: Dima Rekesh <[email protected]>

* added fc-xl, xxl and titanet-s models (#6832)

Signed-off-by: Nithin Rao Koluguri <nithinraok>
Co-authored-by: Nithin Rao Koluguri <nithinraok>

* add reference to our paper (#6821)

* add reference to our paper

Signed-off-by: Alexandra Antonova <[email protected]>

* add paper reference to docs

Signed-off-by: Alexandra Antonova <[email protected]>

---------

Signed-off-by: Alexandra Antonova <[email protected]>

* Upperbound Numpy to < 1.24 (#6829)

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: Eric Harper <[email protected]>

* Multi-lookahead cache-aware streaming models (#6711)

* added methods.

Signed-off-by: Vahid <[email protected]>

* added methods.

Signed-off-by: Vahid <[email protected]>

* added initial code.

Signed-off-by: Vahid <[email protected]>

* added initial code.

Signed-off-by: Vahid <[email protected]>

* added initial code.

Signed-off-by: Vahid <[email protected]>

* added config files.

Signed-off-by: Vahid <[email protected]>

* fixed bugs.

Signed-off-by: Vahid <[email protected]>

* updated confs.

Signed-off-by: Vahid <[email protected]>

* updated confs.

Signed-off-by: Vahid <[email protected]>

* updated confs.

Signed-off-by: Vahid <[email protected]>

* updated confs.

Signed-off-by: Vahid <[email protected]>

* improved f.conv1d

Signed-off-by: Vahid <[email protected]>

* pulled from main.

Signed-off-by: Vahid <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* pulled from main.

Signed-off-by: Vahid <[email protected]>

* added postpostnorm.

Signed-off-by: Vahid <[email protected]>

* fixed the target continiouse bug.

Signed-off-by: Vahid <[email protected]>

* added dw_striding causal.

Signed-off-by: Vahid <[email protected]>

* added print for debugging.

Signed-off-by: Vahid <[email protected]>

* added print for debugging.

Signed-off-by: Vahid <[email protected]>

* fixed causal convolutions.

Signed-off-by: Vahid <[email protected]>

* added _midnorm.

Signed-off-by: Vahid <[email protected]>

* fixed transcribe.

Signed-off-by: Vahid <[email protected]>

* cleaned code.

Signed-off-by: Vahid <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* moved back configs.

Signed-off-by: Vahid <[email protected]>

* moved back configs.

Signed-off-by: Vahid <[email protected]>

* updated fast emit for FC models.

Signed-off-by: Vahid <[email protected]>

* updated fast emit for FC models.

Signed-off-by: Vahid <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed bug.

Signed-off-by: Vahid <[email protected]>

* fixed bug and addressed comments.

Signed-off-by: Vahid <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed configs.

Signed-off-by: Vahid <[email protected]>

* fixed configs.

Signed-off-by: Vahid <[email protected]>

* dropped the test.

Signed-off-by: Vahid <[email protected]>

---------

Signed-off-by: Vahid <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* added changes to ramp up bs  (#6799)

* rampup bs changes

Signed-off-by: dimapihtar <[email protected]>

* rampup bs changes

Signed-off-by: dimapihtar <[email protected]>

* fixed styling

Signed-off-by: dimapihtar <[email protected]>

* fix bug

Signed-off-by: Dmytro Pykhtar <[email protected]>

---------

Signed-off-by: dimapihtar <[email protected]>
Signed-off-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: Eric Harper <[email protected]>

* Fix typo in core.rst (#6838)

Signed-off-by: Dounx <[email protected]>

* add back ptuning pp2 test (#6394)

Signed-off-by: arendu <[email protected]>

* t5 lora tuning (#6612)

* t5 lora

Signed-off-by: arendu <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* eval lora t5

Signed-off-by: arendu <[email protected]>

* adjust differernt lora dims

Signed-off-by: arendu <[email protected]>

* minor changes

Signed-off-by: David Mosallanezhad <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bugfix for state_dict

Signed-off-by: David Mosallanezhad <[email protected]>

---------

Signed-off-by: arendu <[email protected]>
Signed-off-by: David Mosallanezhad <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: David Mosallanezhad <[email protected]>
Co-authored-by: David <[email protected]>

* NFA updates (#6695)

* update V_NEGATIVE_NUM constant to make better use of torch.float32 range

Signed-off-by: Elena Rastorgueva <[email protected]>

* adjust backpointers dtype if U_max too large

Signed-off-by: Elena Rastorgueva <[email protected]>

* Remove print statements

Signed-off-by: Elena Rastorgueva <[email protected]>

* Remove need for user to specify model_downsample_factor

Signed-off-by: Elena Rastorgueva <[email protected]>

* change model.cfg.sample_rate to model.cfg.preprocessor.sample_rate

Signed-off-by: Elena Rastorgueva <[email protected]>

* add check to make sure that window_stride is in model.cfg.preprocessor

Signed-off-by: Elena Rastorgueva <[email protected]>

* reduce memory consumption of backpointers by making them relative instead of absolute

Signed-off-by: Elena Rastorgueva <[email protected]>

* update librosa.get_duration() 'filename' param to 'path'

Signed-off-by: Elena Rastorgueva <[email protected]>

* Do not throw error if 'text' or 'pred_text' are empty and make sure CTM filepaths in the output manifest are null

Signed-off-by: Elena Rastorgueva <[email protected]>

* preprocess input text by removing any duplicate spaces and converting any newlines to spaces

Signed-off-by: Elena Rastorgueva <[email protected]>

* Use Utterance dataclass instead of dictionaries for keeping track of token/word/segment alignments

Signed-off-by: Elena Rastorgueva <[email protected]>

* refactor so can save alignments as ctm and ass format files

Signed-off-by: Elena Rastorgueva <[email protected]>

* fix bugs for saving character based ASS files and for using pred_text to do alignment

Signed-off-by: Elena Rastorgueva <[email protected]>

* Make token level .ass file use tokens with recovered capitalization

Signed-off-by: Elena Rastorgueva <[email protected]>

* Do not try to generate alignment files if text or pred text is empty, or if number of tokens is too large for T

Signed-off-by: Elena Rastorgueva <[email protected]>

* rename output manifest file to say '_with_output_file_paths.json'

Signed-off-by: Elena Rastorgueva <[email protected]>

* add flag to resegment ass subtitle file to fill available text space

Signed-off-by: Elena Rastorgueva <[email protected]>

* Fix bug in resegmentation code

Signed-off-by: Elena Rastorgueva <[email protected]>

* Fix bug which skipped some utterances if batch_size more than 1

Signed-off-by: Elena Rastorgueva <[email protected]>

* reduce memory requirements by doing torch.gather on a slice of the log probs when they are needed

Signed-off-by: Elena Rastorgueva <[email protected]>

* reduce memory requirements by not saving whole v_matrix

Signed-off-by: Elena Rastorgueva <[email protected]>

* remove any extra spaces in pred_text

Signed-off-by: Elena Rastorgueva <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unused list pred_text_all_lines

Signed-off-by: Elena Rastorgueva <[email protected]>

* support using hybrid Transducer-CTC models for alignment

Signed-off-by: Elena Rastorgueva <[email protected]>

* fix typo - add brackets to torch.cuda.is_available()

Signed-off-by: Elena Rastorgueva <[email protected]>

* make sure token case restoration will work if superscript or subscript num is in text

Signed-off-by: Elena Rastorgueva <[email protected]>

* remove any BOM from input text

Signed-off-by: Elena Rastorgueva <[email protected]>

* pick out 1st hypotheses if there is a tuple of them

Signed-off-by: Elena Rastorgueva <[email protected]>

* Remove print statement

Signed-off-by: Elena Rastorgueva <[email protected]>

* add detail to error message if fail to recover capitalization of tokens

Signed-off-by: Elena Rastorgueva <[email protected]>

* add flag use_local_attention

Signed-off-by: Elena Rastorgueva <[email protected]>

* rename additional_ctm_grouping_separator -> additional_segment_grouping_separator

Signed-off-by: Elena Rastorgueva <[email protected]>

* update description of additional_segment_grouping_separator

Signed-off-by: Elena Rastorgueva <[email protected]>

* add simple docstring to get_utt_obj function

Signed-off-by: Elena Rastorgueva <[email protected]>

* Make docstring for add_t_start_end_to_utt_obj

Signed-off-by: Elena Rastorgueva <[email protected]>

* update docstrings for add_t_start_end_to_utt_obj and get_batch_variables

Signed-off-by: Elena Rastorgueva <[email protected]>

* update README and comments in align.py

Signed-off-by: Elena Rastorgueva <[email protected]>

* change 'ground truth' -> 'reference text' in documentation

Signed-off-by: Elena Rastorgueva <[email protected]>

* add header

Signed-off-by: Elena Rastorgueva <[email protected]>

* add comments to get_utt_obj function

Signed-off-by: Elena Rastorgueva <[email protected]>

* move constants so they are after imports

Signed-off-by: Elena Rastorgueva <[email protected]>

* add file description for make_ass_files

Signed-off-by: Elena Rastorgueva <[email protected]>

* get rid of Utterance object's S attribute, and correct tests so they pass now

Signed-off-by: Elena Rastorgueva <[email protected]>

* remove some unused variables

Signed-off-by: Elena Rastorgueva <[email protected]>

* remove unused variable model from functions saving output files

Signed-off-by: Elena Rastorgueva <[email protected]>

* remove unused var minimum_timestamp_duration from make_ass_files functions and return utt_obj

Signed-off-by: Elena Rastorgueva <[email protected]>

* move minimum_timestamp_duration param to CTMFileConfig

Signed-off-by: Elena Rastorgueva <[email protected]>

* remove unused enumerate and unused import

Signed-off-by: Elena Rastorgueva <[email protected]>

* switch reading duration from librosa to soundfile to avoid filename/path deprecation message

Signed-off-by: Elena Rastorgueva <[email protected]>

---------

Signed-off-by: Elena Rastorgueva <[email protected]>
Signed-off-by: Elena Rastorgueva <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Added rouge monitoring support for T5 (#6737)

* Added rouge monitoring support for t5

Signed-off-by: Matvei Novikov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Matvei Novikov <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* GPT extrapolatable position embedding (xpos/sandwich/alibi/kerple) and Flash Attention (#6666)

* move to nvidia megatron repo (#6465) (#6475)

Signed-off-by: Abhinav Khattar <[email protected]>
Co-authored-by: Abhinav Khattar <[email protected]>
Signed-off-by: hsiehjackson <[email protected]>

* Megatron KERPLE positional embeddings (#6478) (#6480)

* [TTS] FastPitch adapter fine-tune and conditional layer normalization (#6416)

[TTS] FastPitch adapter fine-tune and conditional layer normalization (#6416)

---------




* [TTS] whitelist broken path fix. (#6412)

* [TTS] whitelist broken path fix.



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------




* [TTS] FastPitch speaker encoder (#6417)

* Add initial codes



* Remove wemb



* Fix import



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Restore aligner loss



* Add ConditionalInput



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix error and support pre-trained config



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Follow comments



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Rename config



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Change copyright and random weight test



* Add initial codes



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci



* Fix import error



* Add initial codes



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci



* Fix dataset error



* Remove reference speaker embedding



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci



* Remove SV encoder



* Follow comments



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci



* Fix length type



* Fix append



* Move error msg



* Add look-up into speaker encoder



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci



* Add valueerror msg



* Move lookup



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci



* Remove unused



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci



* Fix error



* Rebase and Fix error



* Fix spk encoder



* Rename n_speakers



* Follow comments



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix n_speakers None error



---------




* Sharded manifests for tarred datasets (#6395)

* testing sharded manifests



* compatibility



* proper fixes



* adding flag to convert_to_tarred_audio_dataset



* shard_manifests conf param



* propagating the shard_manifests param



* propagating the shard_manifests param



* distributed checks



* typo



* typo



* fixes



* fixes



* fixes



* fixes



* fixes



* fixes



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixes based on PR comments and tests



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixes to convert_to_tarred_audio_dataset.py



* reversing manifest shards flag



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* tests



* excluding manifests from webdataset url expansion



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* expand manifest paths before attempting to cache from datastore



* explicit use of UTF-8 for manifest i/o
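
A minimal sketch of explicit UTF-8 manifest i/o; the helper names and manifest layout are illustrative.

```python
import json

def read_manifest(manifest_path):
    # Decode explicitly as UTF-8 so behavior does not depend on the system locale.
    with open(manifest_path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

def write_manifest(manifest_path, entries):
    with open(manifest_path, "w", encoding="utf-8") as f:
        for entry in entries:
            f.write(json.dumps(entry, ensure_ascii=False) + "\n")
```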



---------




* Update wfst_text_normalization.rst (#6374)

Add Hungarian (incoming in NeMo-text-processing)



* Support Swiglu in TP PP Conversion (#6437) (#6451)

* Support Swiglu in TP PP Conversion



* Guard activation



* Guard activation



---------




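For reference, a minimal sketch of the SwiGLU feed-forward variant this conversion has to support; module and dimension names are illustrative, and this is not the Megatron implementation. The gated structure (two input projections instead of one) is presumably what needs special handling when splitting weights for tensor parallelism.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUFeedForward(nn.Module):
    """Gated feed-forward block: silu(x @ W_gate) * (x @ W_value), then W_out."""

    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.w_gate = nn.Linear(d_model, d_ff, bias=False)
        self.w_value = nn.Linear(d_model, d_ff, bias=False)
        self.w_out = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_out(F.silu(self.w_gate(x)) * self.w_value(x))
```
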
* Update NeMo_TTS_Primer.ipynb (#6436)

* Update NeMo_TTS_Primer.ipynb

Fixed a mistake in line 782: instead of frequency band (i.e. pitch) it should say frequency bin. Note that FFT frequency bins are not related to pitch.



* Update NeMo_TTS_Primer.ipynb

Corrected the description of spectrogram and mel spectrogram calculations in lines 782 & 783, added a fourth point to the description, and added a reference with more mathematical details at the end of that point.



---------



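To go with the corrected primer wording, a short sketch of how a linear and a mel spectrogram are typically computed (librosa usage shown; parameter values and the file path are illustrative). Frequency bins are fixed, linearly spaced frequencies determined by the sample rate and FFT size, independent of the pitch of the speech.

```python
import librosa
import numpy as np

# Load audio (placeholder path and sample rate).
y, sr = librosa.load("example.wav", sr=22050)

# Linear spectrogram: magnitude of the STFT. Rows are frequency bins spaced
# sr / n_fft apart; the bins are a property of the transform, not of pitch.
spectrogram = np.abs(librosa.stft(y, n_fft=1024, hop_length=256))

# Mel spectrogram: project the power spectrogram onto a mel filter bank,
# then convert to decibels for inspection.
mel = librosa.feature.melspectrogram(S=spectrogram ** 2, sr=sr, n_fft=1024, n_mels=80)
mel_db = librosa.power_to_db(mel)
```
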
* add rampup batch size support for Megatron GPT (#6424)

* added rampup batch size support



* added tests for rampup batch size



* fixed the typos



* added assertions



* changed assertion rules



* deleted unused imports



* changed tests for rampup batch size



* updated rampup batch size tests



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed styling



* rampup batch size tests changes



---------







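A rough sketch of what a linear batch-size ramp-up schedule computes; the function and argument names are invented for illustration, and the actual Megatron configuration is expressed differently (for example in terms of consumed samples).

```python
def rampup_global_batch_size(step: int, start_size: int, final_size: int,
                             increment: int, rampup_steps: int) -> int:
    """Grow the global batch size from start_size to final_size in fixed
    increments, spread evenly over the first rampup_steps training steps."""
    if step >= rampup_steps:
        return final_size
    num_increments = max((final_size - start_size) // increment, 1)
    steps_per_increment = max(rampup_steps // num_increments, 1)
    completed = step // steps_per_increment
    return min(start_size + completed * increment, final_size)

# Example: ramp 16 -> 64 in increments of 16 over the first 300 steps.
# step 0 -> 16, step 100 -> 32, step 200 -> 48, step 300+ -> 64
```
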
* Megatron encoder decoder fix for empty validation outputs (#6459) (#6461)

* 1. Megatron encoder decoder fix for empty validation outputs.



* 1. Debugging.

---------





* Code-Switching dataset creation - upgrading to aggregate tokenizer manifest format (#6448)

* added functionality to create an aggregate-tokenizer-compatible manifest for code-switching, plus a flag to use this mode by default



* updated README with the new agg_tokenizer_manifest flag



* fixed typo in scripts/speech_recognition/code_switching/README.md



* changed agg_tokenizer_manifest to is_lid_manifest



---------




* Added/updated new Conformer configs (#6426) (#6467)

* Update script for ngram rnnt and hat beam search decoding (#6370)

* add rnnt ngram beamsearch script



* add return encoder embeddings option



* update script



* add rnnt and hat ngram decoding script



* add some parameters



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add return_encoder_embeddings parameter to RNNTDecodingConfig



* replace return_encoder_embeddings parameter



* generalization of script behavior



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove return_encoder_embeddings parameter



* remove return_encoder_embeddings parameter



* add manual encoder_embeddings calculation



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix beam_width value to 8



* fix rescoring description



---------






* BERT pre-training mp fork to spawn (#6442) (#6454)

* change bert fork to spawn



* num_workers=0 fix



---------




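A minimal sketch of the kind of change described (the exact call site in the BERT pre-training example is not shown here): forking a process that has already initialized CUDA is unsafe, so the start method is switched to spawn.

```python
import torch.multiprocessing as mp

def main():
    # ... build the model and start pre-training here ...
    pass

if __name__ == "__main__":
    # CUDA contexts do not survive fork(); spawn starts clean child processes.
    # With spawn, DataLoader workers are costlier to start, which is one reason
    # a num_workers=0 fallback can be needed alongside this change.
    mp.set_start_method("spawn", force=True)
    main()
```
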
* fix replace_bos_with_pad not found (#6443) (#6450)




* reduce workers on NMT CI (#6472) (#6474)




* 1. Added KERPLE positional embeddings to encoder-decoder.



* 1. Added a missing file.



* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* 1. Fixing commits.



* 1. Debugging.

* 1. Debugging.

* 1. Debugging.

* 1. Debugging.

---------

Signed-off-by: hsiehjackson <[email protected]>
Signed-off-by: Xuesong Yang <[email protected]>
Signed-off-by: Dima Rekesh <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: smajumdar <[email protected]>
Signed-off-by: Mostafa Ghorbandoost <[email protected]>
Signed-off-by: Dmytro Pykhtar <[email protected]>
Signed-off-by: Dmytro Pykhtar <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Kunal Dhawan <[email protected]>
Signed-off-by: andrusenkoau <[email protected]>
Signed-off-by: Andrei Andrusenko <[email protected]>
Signed-off-by: Abhinav Khattar <[email protected]>
Co-authored-by: Micha Livne <[email protected]>
Co-authored-by: Cheng-Ping Hsieh <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Xuesong Yang <[email protected]>
Co-authored-by: Dima Rekesh <[email protected]>
Co-authored-by: Jim O’Regan <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: Mostafa Ghorbandoost <[email protected]>
Co-authored-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
Co-authored-by: Micha Livne <[email protected]>
Co-authored-by: Kunal Dhawan <[email protected]>
Co-authored-by: Andrei Andrusenko <[email protected]>
Co-authored-by: Abhinav Khattar <[email protected]>
Signed-off-by: hsiehjackson <[email protected]>

* Fix an invalid link in get_data.py of ljspeech (#6456)

Using the link in line 63 downloads an HTML file rather than a TSV file, so it needs to be changed to a raw link.

Signed-off-by: Mostafa Ghorbandoost <[email protected]>
Signed-off-by: hsiehjackson <[email protected]>
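
For illustration only (the repository and file below are placeholders, not the actual link in get_data.py): a github.com blob URL serves an HTML page, while the raw.githubusercontent.com form serves the file itself.

```python
import urllib.request

# Placeholder URLs, not the real ones from get_data.py.
blob_url = "https://github.com/<org>/<repo>/blob/main/metadata.tsv"           # HTML page
raw_url = "https://raw.githubusercontent.com/<org>/<repo>/main/metadata.tsv"  # raw TSV

urllib.request.urlretrieve(raw_url, "metadata.tsv")
```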

* 1. Added external index sample. (#6462) (#6483)

Signed-off-by: Micha Livne <[email protected]>
Co-authored-by: Micha Livne <[email protected]>
Signed-off-by: hsiehjackson <[email protected]>

* Update README to add core installation (#6488) (#6489)

* update README for megatron-core



* fix



---------

Signed-off-by: Abhinav Khattar <[email protected]>
Co-authored-by: Abhinav Khattar <[email protected]>
Signed-off-by: hsiehjackson <[email protected]>

* Fix cache aware hybrid bugs (#6466) (#6484)

Signed-off-by: hsiehjackson <[email protected]>

* Fix typos (#6494) (#6495)

Signed-off-by: smajumdar <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Signed-off-by: hsiehjackson <[email protected]>

* Add disclaimer about dataset for ASR (#6496)

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: hsiehjackson <[email protected]>

* fix (#6502)

datastore_path_to_webdataset_url(p) if is_datastore_path(p) and is_tarred_path(p) else p
NameError: name 'is_tarred_path' is not defined

Co-authored-by: George <[email protected]>
Signed-off-by: hsiehjackson <[email protected]>

* fix broken links r1.18.0 (#6501) (#6504)

* fix broken links



* fix broken links



---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: hsiehjackson <[email protected]>

* [TTS] Create functions for TTS preprocessing without dataloader (#6317)

* [TTS] Create functions for TTS preprocessing without dataloader

Signed-off-by: Ryan <[email protected]>
Signed-off-by: hsiehjackson <[email protected]>

* Cache aware streaming nfa (#6209)

* add cache aware streaming to nemo aligner

Signed-off-by: Slyne Deng <[email protected]>

Signed-off-by: hsiehjackson <[email protected]>

* [BugFix] Force _get_batch_preds() to keep logits in decoder timestamps generator (#6499)

* [BugFix] _get_batch_preds() is forced to keep logits in decoder timestamps generators

Signed-off-by: Taejin Park <[email protected]>

* Ignore keep_logits boolean in FrameASRBatchLogits

Signed-off-by: Taejin Park <[email protected]>

---------

Signed-off-by: Taejin Park <[email protected]>
Co-authored-by: Jagadeesh Balam <[email protected]>
Signed-off-by: hsiehjackson <[email protected]>

…