Releases: NVIDIA/NeMo

NVIDIA Neural Modules 1.8.0

20 Apr 04:29

Known Issues

  • Megatron BERT export does not currently work in the NVIDIA NGC PyTorch 22.03 container. The issue will be fixed in the NGC PyTorch 22.04 container.
  • Pytest tests for Vietnamese inverse text normalization are failing. This is fixed in main.

Container

For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:22.03

ASR

TTS

  • Bump TTS deprecation version to 1.9 by @blisc :: PR: #3955
  • Add pinned pynini and scipy installs to TTS training tutorial by @redoctopus :: PR: #3967
  • Compatibility override to load_state_dict for old TTS checkpoints by @redoctopus :: PR: #3978

NLP / NMT

Text Normalization / Inverse Text Normalization

Export

Bugfixes

General Improvements

NVIDIA Neural Modules 1.7.2

17 Mar 22:35
c16b894

GPT Bugfixes

T5 XNLI Example

NVIDIA Neural Modules 1.7.1

08 Mar 03:04
d5ad011

Known Issues

  • find_unused_parameters should be False when training GPT: #3837

Bugfixes

NVIDIA Neural Modules 1.7.0

02 Mar 00:57
256236f

Known Issues

  • Megatron GPT training with O2 and FP16 is broken. FP16 with O1 still works.
  • find_unused_parameters should be False when training GPT: #3837
  • FastPitch training may result in stalled GPUs. Users will have to manually kill their runs and continue training from the latest checkpoint.
  • mT5 issue with whole word masking, see #3776
  • T5 finetuning config issue, see #3776

Container

NOTE: From NeMo 1.7.0 onwards, NeMo containers follow the YY.MM convention for naming, where the YY.MM value is based on the base container. For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:22.01
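
Because container tags now encode the base container's year and month rather than the NeMo version, scripts that pin or compare images may want to parse the tag. The following is a minimal sketch, assuming the tag is the text after the final colon and always has the two-digit `YY.MM` form. `NemoTag` and `parse_nemo_tag` are hypothetical names introduced for this example.

```python
import re
from typing import NamedTuple


class NemoTag(NamedTuple):
    year: int   # two-digit year of the base container, e.g. 22
    month: int  # month of the base container, e.g. 1


def parse_nemo_tag(image: str) -> NemoTag:
    """Split an image reference such as 'nvcr.io/nvidia/nemo:22.01' into (year, month)."""
    # Everything after the last ':' is treated as the tag.
    _, sep, tag = image.rpartition(":")
    match = re.fullmatch(r"(\d{2})\.(\d{2})", tag) if sep else None
    if match is None:
        raise ValueError(f"not a YY.MM tag: {image!r}")
    year, month = int(match.group(1)), int(match.group(2))
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in tag: {image!r}")
    return NemoTag(year, month)


if __name__ == "__main__":
    older = parse_nemo_tag("nvcr.io/nvidia/nemo:22.01")
    newer = parse_nemo_tag("nvcr.io/nvidia/nemo:22.03")
    # Named tuples compare field by field, so later tags sort higher.
    print(older, newer, newer > older)
```

Note that the tag identifies the base container, not the NeMo release: in these notes 22.01 ships with NeMo 1.7.0 and 22.03 with NeMo 1.8.0.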

ASR

TTS

  • port UnivNet to NeMo TTS collection by @L0SG :: PR: #3186
  • E2E TTS fixes by @redoctopus :: PR: #3508
  • New structure for TTS datasets in scripts/dataset_processing, VocoderDataset, update TTSDataset by @Oktai15 :: PR: #3484
  • Deprecate some TTS models and TTS datasets by @Oktai15 :: PR: #3576
  • Fix bugs in HiFi-GAN (scheduler, optimizers) and add input_example() in Mixer-TTS/Mixer-TTS-X by @Oktai15 :: PR: #3564
  • Update UnivNet, HiFi-GAN and WaveGlow, small fixes in Mixer-TTS, FastPitch and Exportable by @Oktai15 :: PR: #3585
  • Fix typo in FastPitch config (pitch_avg -> pitch_mean) by @eyentei :: PR: #3593
  • Fix incorrect usage of TTSDataset in some files and fix one-line bug in NVIDIA's CMUDict by @Oktai15 :: PR: #3594
  • Convert entry from UTF-16 to UTF-8 by @redoctopus :: PR: #3597
  • remove CheckInstall by @blisc :: PR: #3577
  • Fix UnivNet LibriTTS pretrained location by @m-toman :: PR: #3615
  • FastPitch training tutorial by @subhankar-ghosh :: PR: #3631
  • Update Aligner, add new methods to AlignmentEncoder by @Oktai15 :: PR: #3641
  • Add Mixed Representation Training by @blisc :: PR: #3473
  • Add speakerID to libritts/get_data.py by @subhankar-ghosh :: PR: #3662
  • Update TTS tutorials, Simplification of testing Mixer-TTS and FastPitch by @Oktai15 :: PR: #3680
  • Clean FastPitch_Finetuning.ipynb notebook by @Oktai15 :: PR: #3698
  • Add cache_size to BetaBinomialInterpolator, fix bugs in TTS tutorials and FastPitch by @Oktai15 :: PR: #3706
  • Fix bugs in VocoderDataset and TTSDataset by @Oktai15 :: PR: #3713
  • Fix bugs in E2E TTS, Mixer-TTS and FastPitch by @Oktai15 :: PR: #3740

NLP / NMT

Text Normalization / Inverse Text Normalization

Export

Bugfixes

  • Text normalization takes too much time for a string which contains a lot of dates by @PeganovAnton :: PR: #3451
  • Dialogue state tracking refactor/ SGDGEN patch 2 by @Zhilin123 :: PR: #3674
  • lower bound PTL to 1.5.10 and remove last ckpt patch fix by @nithinraok :: PR: #3690

Improvements

NVIDIA Neural Modules 1.6.2

05 Feb 06:09

Bug fix

  • Changed the "Apex not found" error to a warning so that NLP models that do not depend on Apex can be used when Apex is not installed.

NVIDIA Neural Modules 1.6.1

02 Feb 06:25

Bug Fixes

  • Fix embedding name for verifying speakers #3578
  • Add rank check and barrier helpers compilation for megatron dataset #3581
  • Add apex import guards #3579

NVIDIA Neural Modules 1.6.0

29 Jan 04:53

ASR

  • Add new features to ASR with diarization with modified tutorial and README. by @tango4j :: PR: #3007
  • Enable stateful decoding of RNNT over multiple transcribe calls by @titu1994 :: PR: #3037
  • Move vocabs from asr to common by @Oktai15 :: PR: #3084
  • Adding parallel transcribe for ASR models - supports multi-gpu/multi-node by @VahidooX :: PR: #3017
  • CTC Conformer fixes for ONNX/TS export by @borisfom :: PR: #3072
  • Adding pretrained French ASR models to ctc_bpe and rnnt_bpe listings by @tbartley94 :: PR: #3225
  • adding german conformer ctc and rnnt by @yzhang123 :: PR: #3242
  • Add aishell and fisher dataset processing scripts for ASR by @jbalam-nv :: PR: #3203
  • Better default for RNNT greedy decoding by @titu1994 :: PR: #3332
  • Add uniform ASR evaluation script for all models by @titu1994 :: PR: #3334
  • CTC Segmentation-Citrinet support by @ekmb :: PR: #3279
  • Updates on ASR with diarization util files by @tango4j :: PR: #3359
  • Asr fr by @tbartley94 :: PR: #3404
  • Refactor ASR Examples Directory by @titu1994 :: PR: #3392
  • Asr patches by @titu1994 :: PR: #3443
  • Properly support -1 for labels in ctc char models by @titu1994 :: PR: #3487

TTS

  • MixerTTS, MixerTTSDataset and small updates in tts tokenizers by @Oktai15 :: PR: #2859
  • ONNX and TorchScript support for Mixer-TTS by @Oktai15 :: PR: #3082
  • Update name of files to one style in TTS folder by @Oktai15 :: PR: #3189
  • Update TTS Dataset, FastPitch with TTS dataset and small improvements in HiFiGAN by @Oktai15 :: PR: #3205
  • Add Beta-binomial Interpolator to TTSDataset by @Oktai15 :: PR: #3230
  • Normalizer to TTS models, TTS tokenizer updates, AxisKind updates by @Oktai15 :: PR: #3271
  • Update Mixer-TTS, FastPitch and TTSDataset by @Oktai15 :: PR: #3366
  • Minor Updates to TTS Finetuning by @blisc :: PR: #3455

NLP / NMT

Text Normalization / Inverse Text Normalization

NeMo Tools

  • CTC Segmentation-Citrinet support by @ekmb :: PR: #3279
  • Updated NumPy SDE requirement by @vsl9 :: PR: #3442

Export

Documentation

Bugfixes

NVIDIA Neural Modules 1.5.1

04 Dec 00:00

Features

  • Minor updates to expose speaker id, pitch, and duration on export of FastPitch #3192, #3207

Known Issues

NVIDIA Neural Modules 1.5.0

20 Nov 01:55

Features

  • Megatron GPT pre-training with tensor model parallelism #2975
  • NMT encoder and decoder with different hidden size #2856
  • Logging timing of train/val/test steps #2936
  • Logging NMT encoder and decoder timing #2956
  • Logging timing per sentence length and tokenized text statistics #3004
  • Upgrade to PyTorch Lightning 1.5.0, bfloat support #2975
  • French Inverse Text Normalization #2921
  • Bucketing of tarred datasets for ASR models #2999
  • ASR with diarization #3007
  • Adding parallel transcribe for ASR models - supports multi-gpu/multi-node #3017

Documentation Updates

  • RNNT

Contributors

@ericharper @michalivne @MaximumEntropy @VahidooX @titu1994 @blisc @okuchaiev @tango4j @erastorgueva-nv @fayejf @vadam5 @ekmb @yaoyu-33 @nithinraok @erhoo82 @tbartley94 @PeganovAnton @madhukarkm @yzhang123
(Please let us know if you have contributed to this release and we have missed you here.)

NVIDIA Neural Modules 1.4.0

02 Oct 00:49
0958184

Features

  • Improved speaker clustering #2729
  • Upgrade to NVIDIA PyTorch 21.08 container #2799
  • RNNT mAES beam search support #2802
  • Transfer learning for new speakers #2684
  • Simplify speaker scripts #2777
  • Perceiver-encoder architecture #2737
  • Relative paths in tarred datasets #2776
  • Torch only TTS package #2643
  • Inverse text normalization for Spanish #2489

Tutorial Notebooks

  • Duration and pitch control for TTS #2700

Bug fixes

Contributors

@tango4j @titu1994 @paarthneekhara @nithinraok @michalivne @erastorgueva-nv @borisfom @blisc
(some contributors may not be listed explicitly)