NVIDIA Neural Modules 1.17.0

@ericharper released this 05 Apr 00:10
d3017e4

Highlights

NeMo ASR

  • Online Clustering Diarizer
  • High Level Diarization API
  • PyCTC Decode Beam Search Support
  • RNNT Beam Search Alignment Extraction
  • InterCTC Loss
  • AIStore Documentation
  • ASR & AWS Multi-node Integration
  • Convolution Invariant SDR losses

NeMo TTS

NeMo Megatron

  • SquaredReLU, SwiGLU, No-Dropout
  • Rotary Position Embedding
  • Untie word embeddings and output projection

NeMo Core

  • Dynamic freezing of modules during training
  • NeMo Multi-Run Documentation
  • ClearML Logging
  • Early Stopping
  • Experiment Manager Docs Update

Container

For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:23.02

Detailed Changelogs

ASR

Changelog
  • Support Alignment Extraction for all RNNT Beam decoding methods by @titu1994 :: PR: #5925
  • Use module-based k2 import guard by @artbataev :: PR: #6006
  • Default RNNT loss to int64 targets by @titu1994 :: PR: #6011
  • Added documentation section for ASR datasets from AIStore by @anteju :: PR: #6008
  • Change perturb rng for reproducing results easily by @fayejf :: PR: #6042
  • InterCTC loss and stochastic depth implementation by @Kipok :: PR: #6013
  • Add pyctcdecode to high level beam search API by @titu1994 :: PR: #6026
  • Convert Esperanto into a notebook by @SeanNaren :: PR: #6070
  • [ASR] Added a script for evaluating metrics for audio-to-audio by @anteju :: PR: #5971
  • [ASR] Convolution-invariant SDR loss + unit tests by @anteju :: PR: #5992
  • Adjust stochastic depth dropout probability calculation by @anteju :: PR: #6120
  • Add file class based inference API for diarization by @SeanNaren :: PR: #5945
  • Ngram by @karpnv :: PR: #6063
  • remove duplicate definition of manifest read and write func. by @XuesongYang :: PR: #6088
  • Streaming conformer CTC export by @messiaen :: PR: #5837
  • [TTS] Make mel spectrogram norm configurable by @rlangman :: PR: #6155
  • Ngram lm fusion for RNNT maes decoding by @andrusenkoau :: PR: #6118
  • ASR Beam search documentation by @titu1994 :: PR: #6244

TTS

Changelog
  • [TTS][ZH] added new NGC model cards with polyphone disambiguation. by @XuesongYang :: PR: #5940
  • [TTS] deprecate AudioToCharWithPriorAndPitchDataset. by @XuesongYang :: PR: #5959
  • [TTS][G2P] deprecate add_symbols by @XuesongYang :: PR: #5961
  • Added list_available_models by @treacker :: PR: #5967
  • Update FastPitch energy bug by @blisc :: PR: #5969
  • removed WHATEVER(1) ˌhwʌˈtɛvɚ from scripts/tts_dataset_files/ipa_cmudict-0.7b_nv22.10.txt by @MikyasDesta :: PR: #5869
  • ONNX export for RadTTS by @borisfom :: PR: #5880
  • Add some info about FastPitch SSL model by @redoctopus :: PR: #5994
  • Vits doc by @treacker :: PR: #5989
  • Ragged batching changes for RadTTS, some refactoring by @borisfom :: PR: #6020
  • Working enabled ragged batching with ONNX by @borisfom :: PR: #6030
  • [TTS/TN/G2P] Remove Text Processing from NeMo, move G2P to TTS by @ekmb :: PR: #5982
  • [TTS] Add Spanish IPA dictionaries and heteronyms by @rlangman :: PR: #6037
  • [TTS] Separate TTS tokenization and g2p util to fix circular import by @rlangman :: PR: #6080
  • [TTS][refactor] Part 7 - move module from model file. by @XuesongYang :: PR: #6098
  • [TTS][refactor] Part 1 - nemo.collections.tts.data by @XuesongYang :: PR: #6099
  • [TTS][refactor] Part 2 - nemo.collections.tts.parts by @XuesongYang :: PR: #6105
  • [TTS][refactor] Part 6 - remove nemo.collections.tts.torch.README.md and tts_dataset.yaml by @XuesongYang :: PR: #6103
  • [TTS][refactor] Part 3 - nemo.collections.tts.g2p.models by @XuesongYang :: PR: #6113
  • [TTS] update German NGC models trained on Thorsten Datasets by @XuesongYang :: PR: #6125
  • [TTS] remove old waveglow model that relies on torch_stft. by @XuesongYang :: PR: #6128
  • [TTS] Move Spanish polyphones from heteronym to dictionary by @rlangman :: PR: #6123
  • [TTS][refactor] Part 8 - added model inference tests to safeguard changes. by @XuesongYang :: PR: #6129
  • remove duplicate definition of manifest read and write func. by @XuesongYang :: PR: #6088
  • [TTS][refactor] update tutorial import paths. by @XuesongYang :: PR: #6176
  • [TTS] Add univnet scheduler by @ArtyomZemlyak :: PR: #6157
  • [TTS] Make mel spectrogram norm configurable by @rlangman :: PR: #6155

NLP / NMT

Changelog

Text Normalization / Inverse Text Normalization

Changelog
  • [TTS/TN/G2P] Remove Text Processing from NeMo, move G2P to TTS by @ekmb :: PR: #5982

Export

Changelog

Bugfixes

Changelog

General Improvements

Changelog