20 Apr 04:29

d9a329e

NVIDIA Neural Modules 1.8.0

Known Issues

Megatron BERT export does not currently work in the NVIDIA NGC PyTorch 22.03 container. The issue will be fixed in the NGC PyTorch 22.04 container.
The pytest tests for Vietnamese inverse text normalization are failing. This is fixed in main.

Container

For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:22.03

ASR

Changelog

ASR SSL Update by @sam1373 :: PR: #3714
Polylang asr by @bmwshop :: PR: #3721
Test grad accumulation for RNNT loss by @titu1994 :: PR: #3731
Add readme files describing model execution flow for ASR tasks by @titu1994 :: PR: #3812
add fr asr ckpt to doc by @yzhang123 :: PR: #3809
Fix asr tests in 22.02 by @titu1994 :: PR: #3823
Add new pretrained Spanish ASR models by @erastorgueva-nv :: PR: #3830
Documentation updates for ASR by @titu1994 :: PR: #3846
Offline VAD+ASR tutorial by @fayejf :: PR: #3828
Added Hindi and Marathi Models in Nemo pretrained ASR_CTC_BPE models … by @meghmak13 :: PR: #3856
Add a missing line to ASR_with_NeMo.ipynb by @lifefeel :: PR: #3908
Multilang asr models by @bmwshop :: PR: #3907
added stt_en_conformer_transducer_large_ls to NGC by @VahidooX :: PR: #3920
Fix DALI test on 22.03 by @titu1994 :: PR: #3911
Adding RNN encoder for LSTM-Transducer and LSTM-CTC models by @VahidooX :: PR: #3886
Fix issue with Segfault in ASR models by @titu1994 :: PR: #3956
Added Mandarin pretrained Conformer-Transducer-Large model trained on AISHELL2. by @VahidooX :: PR: #3970

TTS

Changelog

Bump TTS deprecation version to 1.9 by @blisc :: PR: #3955
Add pinned pynini and scipy installs to TTS training tutorial by @redoctopus :: PR: #3967
Compatibility override to load_state_dict for old TTS checkpoints by @redoctopus :: PR: #3978

NLP / NMT

Changelog

Use worker processes for data preprocessing by @crcrpar :: PR: #3665
Set find_unused_parameters to False in GPT example script by @ericharper :: PR: #3837
GPT multinode eval by @ericharper :: PR: #3821
Fix MegatronPretrainingRandomSampler by taking into account by @crcrpar :: PR: #3826
Add slot filling into DST Generative model by @Zhilin123 :: PR: #3695
Disable nvfuser for gpt by @ericharper :: PR: #3845
Multi-Label Joint Intent Slot Classification by @chenrichard10 :: PR: #3742
fix bug in intent/slot model reloading by @carolmanderson :: PR: #3874
Make test_gpt_eval unit test less strict by @yidong72 :: PR: #3898
Comment gpt resume ci test by @MaximumEntropy :: PR: #3901
Neural Machine Translation with Megatron Transformer Models (Tensor Parallel and Tarred Datasets Only) by @MaximumEntropy :: PR: #3861
Megatron support by @ramanathan831 :: PR: #3893
Populate the GPT/BERT with uploaded models by @yidong72 :: PR: #3885
Megatron BART by @michalivne :: PR: #3666
Additional Japanese processor for NMT that uses MeCab segmentation. Fix for BLEU in one-many NMT by @MaximumEntropy :: PR: #3889
NMT GRPC server URL fix by @MaximumEntropy :: PR: #3918
Megatron legacy conversion support by @ramanathan831 :: PR: #3919
Update max_epochs on megatron configs by @ericharper :: PR: #3958
Fix NMT variable passing bug by @aklife97 :: PR: #3985
Fix nemo megatron restore with artifacts by @ericharper :: PR: #3997
Fix megatron notebook by @ramanathan831 :: PR: #4004
Megatron work-arounds by @borisfom :: PR: #3998
Add T5 model P-tuning support by @yidong72 :: PR: #3768
Make index mappings dir configurable by @ericharper :: PR: #3868
T5 pipeline parallel by @MaximumEntropy :: PR: #3750

Text Normalization / Inverse Text Normalization

Changelog

Tn es by @bonham79 :: PR: #3632
Fix single GPU training issue + change deprecated Lightning args by @aklife97 :: PR: #4010

Export

Changelog

Conformer WARs for TRT8.2 by @borisfom :: PR: #3787
bert_module: fix inputs of export model by @virajkarandikar :: PR: #3815
Exports 22.03 war by @borisfom :: PR: #3957

Bugfixes

Changelog

patch librosa deprecation and fix by @fayejf :: PR: #3818

General Improvements

Changelog

Pynini pip by @yzhang123 :: PR: #3726
upgrade PTL trainer flags by @nithinraok :: PR: #3589
Updated Speech Data Explorer by @vsl9 :: PR: #3710
Fix spelling error in num_workers parameter to actually set number of dataset workers specified in yaml configs by @themikem :: PR: #3800
Support for Camembert Huggingface bert-like models by @itzsimpl :: PR: #3799
Update to 22.02 by @ericharper :: PR: #3771
Fixing the defaults of conformer models in the config files by @VahidooX :: PR: #3836
Fix T5 Encoder Mask while decoding by @MaximumEntropy :: PR: #3838
fix: multilingual transcribe does not require lang id param by @bmwshop :: PR: #3833
Misc improvements by @titu1994 :: PR: #3843
Change container by @MaximumEntropy :: PR: #3844
Making gender assignment random for cardinals, fractions, and decimal… by @bonham79 :: PR: #3759
Jenkinsfile test changes by @chenrichard10 :: PR: #3879
Adding a RegEx tokenizers by @michalivne :: PR: #3839
enable bias+dropout+add fusion with nvfuser at inference by @erhoo82 :: PR: #3869
Add text_generation_util to support TopK, TopP sampling + Tabular Data Generation. by @yidong72 :: PR: #3834
Ptl requirements bound by @MaximumEntropy :: PR: #3903
doc links update by @ekmb :: PR: #3891
add citations by @yzhang123 :: PR: #3902
Update NeMo CI to 22.03 by @MaximumEntropy :: PR: #3900
Add domain groups to changelog builder by @titu1994 :: PR: #3904
add input threshold by @yzhang123 :: PR: #3913
improvements to commonvoice data script by @bmwshop :: PR: #3892
fixes to the cleanup flag by @bmwshop :: PR: #3921
Upgrade to PTL 1.6.0 by @ericharper :: PR: #3890
JSON output from diarization now includes sentences. Optimized senten… by @demsarjure :: PR: #3897
Stateless timer fix for PTL 1.6 by @MaximumEntropy :: PR: #3925
fix save_best missing chpt bug, update for setup_tokenizer() changes by @ekmb :: PR: #3932
Fix tarred sentence dataset length by @MaximumEntropy :: PR: #3941
remove old doc by @ekmb :: PR: #3946
Fix issues with librosa deprecations by @titu1994 :: PR: #3950
Fix notebook bugs for branch r1.8.0 by @yidong72 :: PR: #3948
Fix global batch fit loop by @ericharper :: PR: #3936
Refactor restorefrom by @ramanathan831 :: PR: #3927
Fix variable name and move models to CPU in Change partition by @aklife97 :: PR: #3972
Fix notebook error by @yidong72 :: PR: #3975
Notebook Bug Fixes for r1.8.0 by @vadam5 :: PR: #3989
Fix compat override for TalkNet Aligner by @redoctopus :: PR: #3993
docs fixes by @ekmb :: PR: #3987
Fixes val_check_interval, skip loading train data during eval by @MaximumEntropy :: PR: #3968
LogProb calculation performance fix by @yidong72 :: PR: #3984
Fix P-Tune T5 model by @yidong72 :: PR: #4001
Fix the broadcast shape mismatch by @yidong72 :: PR: #4017
Add known issues to notebook by @ericharper :: PR: #4024

Contributors

lifefeel, bmwshop, and 30 other contributors

17 Mar 22:35

ericharper

v1.7.2

c16b894

NVIDIA Neural Modules 1.7.2

GPT Bugfixes

GPT dataloader improvements and fixes by @crcrpar :: PRs: #3826, #3665
Disable nvfuser by @ericharper :: PR: #3845
Set find_unused_parameters to False by @ericharper :: PR: #3837

T5 XNLI Example

T5 xnli eval by @yaoyu-33 :: PR: #3848

Contributors

ericharper, crcrpar, and yaoyu-33

08 Mar 03:04

ericharper

v1.7.1

d5ad011

NVIDIA Neural Modules 1.7.1

Known Issues

find_unused_parameters should be False when training GPT: #3837

Bugfixes

revert changes by @yzhang123 :: PR: #3785
Fixed soft prompt eval loading bug by @vadam5 :: PR: #3805
mT5 whole word masking and T5 finetuning config fixes by @MaximumEntropy :: PR: #3776
Raise error if FP16 training is tried with O2 recipe. by @ericharper :: PR: #3806

Contributors

yzhang123, MaximumEntropy, and 2 other contributors

02 Mar 00:57

ericharper

v1.7.0

256236f

NVIDIA Neural Modules 1.7.0

Known Issues

Megatron GPT training with O2 and FP16 is broken. FP16 with O1 still works.
find_unused_parameters should be False when training GPT: #3837
FastPitch training may result in stalled GPUs. Users will have to manually kill their runs and continue training from the latest checkpoint.
mT5 issue with whole word masking, see #3776
T5 finetuning config issue, see #3776

Container

NOTE: From NeMo 1.7.0 onwards, NeMo containers follow the YY.MM naming convention, where the YY.MM value is based on the base container. For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:22.01
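
As a small illustration of the YY.MM naming scheme described above, the sketch below builds an image reference from the year and month of a base container. The helper name `nemo_image_tag` is hypothetical and is not part of NeMo. Only the registry path `nvcr.io/nvidia/nemo` and the `22.01` tag come from these notes, and not every YY.MM combination necessarily exists on NGC.

```python
def nemo_image_tag(year: int, month: int, registry: str = "nvcr.io/nvidia/nemo") -> str:
    """Return an image reference such as nvcr.io/nvidia/nemo:22.01.

    Hypothetical helper for illustration; it only formats a string and
    does not check that the tag has actually been published.
    """
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    # YY is the last two digits of the year; MM is zero-padded to two digits.
    return f"{registry}:{year % 100:02d}.{month:02d}"


print(nemo_image_tag(2022, 1))  # prints nvcr.io/nvidia/nemo:22.01
```

The resulting string can be passed to `docker pull` in the same way as the command above.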

ASR

Wav2vec by @tbartley94 :: PR: #3297
Fix bug in multi-checkpoint loading by @sam1373 :: PR: #3536
Add HuggingFace Datasets to NeMo ASR Dataset script by @titu1994 :: PR: #3513
Add support for Gradient Clipping (clamp) in RNNT Numba loss by @titu1994 :: PR: #3550
Enable Tarred Dataset Support for NVIDIA DALI by @titu1994 :: PR: #3485
Add initial support for Buffered RNNT Scripts by @titu1994 :: PR: #3602
Significantly speed up RNNT loss on CUDA by @titu1994 :: PR: #3653
Fixing the bug in the stateful rnnt decoder. by @VahidooX :: PR: #3673
Add Buffered RNNT with LCS Merge algorithm by @titu1994 :: PR: #3669
Asr noise data scripts by @jbalam-nv :: PR: #3660
ASR SSL update by @sam1373 :: PR: #3746
Add randomized bucketing by @VahidooX :: PR: #3445
Self-supervised tutorial & update by @sam1373 :: PR: #3344
Updated conformer models. by @VahidooX :: PR: #3741
Added speaker identification script with cosine and neural classifier… by @nithinraok :: PR: #3672
Fix in clustering diarizer by @nithinraok :: PR: #3701
Add a function that writes cluster label in diarization pipeline by @tango4j :: PR: #3643

TTS

port UnivNet to NeMo TTS collection by @L0SG :: PR: #3186
E2E TTS fixes by @redoctopus :: PR: #3508
New structure for TTS datasets in scripts/dataset_processing, VocoderDataset, update TTSDataset by @Oktai15 :: PR: #3484
Deprecate some TTS models and TTS datasets by @Oktai15 :: PR: #3576
Fix bugs in HiFi-GAN (scheduler, optimizers) and add input_example() in Mixer-TTS/Mixer-TTS-X by @Oktai15 :: PR: #3564
Update UnivNet, HiFi-GAN and WaveGlow, small fixes in Mixer-TTS, FastPitch and Exportable by @Oktai15 :: PR: #3585
Fix typo in FastPitch config (pitch_avg -> pitch_mean) by @eyentei :: PR: #3593
Fix incorrect usage of TTSDataset in some files and fix one-line bug in NVIDIA's CMUDict by @Oktai15 :: PR: #3594
Convert entry from UTF-16 to UTF-8 by @redoctopus :: PR: #3597
remove CheckInstall by @blisc :: PR: #3577
Fix UnivNet LibriTTS pretrained location by @m-toman :: PR: #3615
FastPitch training tutorial by @subhankar-ghosh :: PR: #3631
Update Aligner, add new methods to AlignmentEncoder by @Oktai15 :: PR: #3641
Add Mixed Representation Training by @blisc :: PR: #3473
Add speakerID to libritts/get_data.py by @subhankar-ghosh :: PR: #3662
Update TTS tutorials, Simplification of testing Mixer-TTS and FastPitch by @Oktai15 :: PR: #3680
Clean FastPitch_Finetuning.ipynb notebook by @Oktai15 :: PR: #3698
Add cache_size to BetaBinomialInterpolator, fix bugs in TTS tutorials and FastPitch by @Oktai15 :: PR: #3706
Fix bugs in VocoderDataset and TTSDataset by @Oktai15 :: PR: #3713
Fix bugs in E2E TTS, Mixer-TTS and FastPitch by @Oktai15 :: PR: #3740

NLP / NMT

NLPDDPPlugin find_unused_parameters is configurable by @mlgill :: PR: #3478
Megatron encoder-decoder refactor by @michalivne :: PR: #3542
Finetuning NeMo Megatron T5 Models on GLUE by @MaximumEntropy :: PR: #3408
Pipeline parallelism for GPT by @ericharper :: PR: #3388
Generalized the P-tuning method to support various NLP tasks by @yidong72 :: PR: #3623
Megatron_LM checkpoint to NeMo checkpoint support by @yidong72 :: PR: #3692
Bugfix for GPT eval by @ericharper :: PR: #3744
Yuya/megatron t5 glue eval by @yaoyu-33 :: PR: #3751
Enforce legacy tokenizer for sentencepiece to add special tokens for T5 by @MaximumEntropy :: PR: #3457
Added P-Tuning method by @yidong72 :: PR: #3488
O2 style mixed precision training for T5 by @MaximumEntropy :: PR: #3664
LM adapted T5 dataset by @MaximumEntropy :: PR: #3654
Fix consumed samples calculation + PTune Model bugs by @yidong72 :: PR: #3738
Add pipeline support to eval methods by @ericharper :: PR: #3684
XNli benchmark by @yidong72 :: PR: #3693
Refactor dialogue state tracking for modelling/dataset interoperability by @Zhilin123 :: PR: #3526
Changes to support mean n-gram size masking for T5 by @MaximumEntropy :: PR: #3646
Dialogue state tracking refactor by @Zhilin123 :: PR: #3667
Parallel prompt tuning by @vadam5 :: PR: #3670
GEGLU activation for T5 by @MaximumEntropy :: PR: #3694

Text Normalization / Inverse Text Normalization

Text normalization takes too much time for a string which contains a lot of dates by @PeganovAnton :: PR: #3451
ITN bug fixes, ip address, card num support, whitelist clean up by @ekmb :: PR: #3574
Fix tn bugs by @yzhang123 :: PR: #3580
add serial number to itn by @yzhang123 :: PR: #3584
ITN: SH bug fixes for telephone by @ekmb :: PR: #3592
Tn bug 1.7.0 by @yzhang123 :: PR: #3730
TN docs update by @ekmb :: PR: #3735

Export

Update UnivNet, HiFi-GAN and WaveGlow, small fixes in Mixer-TTS, FastPitch and Exportable by @Oktai15 :: PR: #3585
Conformer onnx fix by @borisfom :: PR: #3524
Add onnx support for speaker models by @nithinraok :: PR: #3650
Jasper mask/export fix by @borisfom :: PR: #3691

Bugfixes

Text normalization takes too much time for a string which contains a lot of dates by @PeganovAnton :: PR: #3451
Dialogue state tracking refactor/ SGDGEN patch 2 by @Zhilin123 :: PR: #3674
lower bound PTL to 1.5.10 and remove last ckpt patch fix by @nithinraok :: PR: #3690

Improvements

Wfst tutorial by @tbartley94 :: PR: #3479
Update CMUdict with ADLR version pronunciations by @redoctopus :: PR: #3446
Fix docs by @yzhang123 :: PR: #3523
Add docstring to UnivNetModel by @L0SG :: PR: #3529
Increase lower bound due to security vulnerability by @ericharper :: PR: #3537
Add Change Log builder to NeMo by @titu1994 :: PR: #3527
Bugfix, need to freeze the model by @yidong72 :: PR: #3540
Bucketing quick fix by @tbartley94 :: PR: #3543
More fixes to SentencePiece for T5 by @MaximumEntropy :: PR: #3515
Update CONTRIBUTING.md by @Oktai15 :: PR: #3569
Update pr template and re-add Changelog builder by @titu1994 :: PR: #3575
Apex quick fix by @ekmb :: PR: #3591
Upgrade to 22.01 container by @ericharper :: PR: #3571
Fix typo and update minimal version of scipy by @Oktai15 :: PR: #3604
Add env variable to force transformers to run offline during CI by @ericharper :: PR: #3607
Correctly install NeMo wheel by @titu1994 :: PR: #3599
Fix wheel build by @titu1994 :: PR: #3610
Fixed EH and error reporting in restore_from by @borisfom :: PR: #3583
Clarifying documentation by @itzsimpl :: PR: #3616
Improve docs for finetuning by @titu1994 :: PR: #3622
Add NeMo version to all new .nemo files by @titu1994 :: PR: #3605
Update numba if NVIDIA_PYTORCH_VERSION not correct by @itzsimpl :: PR: #3614
Remove @experimental decorator in diarization related files. by @tango4j :: PR: #3625
Remove compression from .nemo files by @okuchaiev :: PR: #3626
Update adobe analytics by @ericharper :: PR: #3645
Add ssl tutorial to tutorial docs page by @sam1373 :: PR: #3649
Fix number of channels>1 issue by @ekmb :: PR: #3652
Fixed the bug in bucketing. by @VahidooX :: PR: #3663
Adding guard by @yzhang123 :: PR: #3655
Add tutorial paths by @titu1994 :: PR: #3651
Folder name update by @ekmb :: PR: #3671
Test HF online for SGD-GEN only by @MaximumEntropy :: PR: #3681
Update Librosa support to 0.9 by @titu1994 :: PR: #3682
Comment out numba in 22.01 release by @titu1994 :: PR: #3685
Fix failing tests inside of the 22.01 container in PR 3571 by @fayejf :: PR: #3609
Fixed Apex guard when imported classes are used for default values by @michalivne :: PR: #3700
Update citrinet_512.yaml by @Jorjeous :: PR: #3642
update torchaudio in Dockerfile to match torch version by @GNroy :: PR: #3637
Enforce import tests on the three domains by @titu1994 :: PR: #3702
Audio based norm speed up by @ekmb :: PR: #3703
Fix device on notebook by @titu1994 :: PR: #3732
pynini pip by @yzhang123 :: PR: #3729
Removed fp16 converting in complete method by @dimapihtar :: PR: #3709
Mirror AN4 while CMU servers are down by @titu1994 :: PR: #3743
Fix SSL configs for 1.7 by @sam1373 :: PR: #3748
Punct process bug fix by @ekmb :: PR: #3747
Specify gpus in SSL notebook by @sam1373 :: PR: #3753
Duplex model inference fix, money encoder fix by @ekmb :: PR: #3754
Update decoding strategy docs and override general value for tutorials by @titu1994 :: PR: #3755
Fix directories in ssl notebook by @sam1373 :: PR: #3758
Update Tacotron2_Training.ipynb by @blisc :: PR: #3769
Fix dockerfile by @yzhang123 :: PR: #3778
Prompt-Tuning-Documentation by @vadam5 :: PR: #3777
Prompt tuning bug fix by @vadam5 :: PR: #3780

Contributors

mlgill, titu1994, and 31 other contributors

05 Feb 06:09

okuchaiev

v1.6.2

7da3916

NVIDIA Neural Modules 1.6.2

Bug fix

Changed the Apex-not-found error to a warning so that NLP models that do not depend on Apex can be used when Apex isn't installed.

02 Feb 06:25

ericharper

v1.6.1

acf6bf4

NVIDIA Neural Modules 1.6.1

Bug Fixes

Fix embedding name for verifying speakers #3578
Add rank check and barrier helpers compilation for megatron dataset #3581
Add apex import guards #3579

29 Jan 04:53

ericharper

v1.6.0

75fd743

NVIDIA Neural Modules 1.6.0

ASR

Add new features to ASR with diarization with modified tutorial and README. by @tango4j :: PR: #3007
Enable stateful decoding of RNNT over multiple transcribe calls by @titu1994 :: PR: #3037
Move vocabs from asr to common by @Oktai15 :: PR: #3084
Adding parallel transcribe for ASR models - supports multi-gpu/multi-node by @VahidooX :: PR: #3017
CTC Conformer fixes for ONNX/TS export by @borisfom :: PR: #3072
Adding pretrained French ASR models to ctc_bpe and rnnt_bpe listings by @tbartley94 :: PR: #3225
adding german conformer ctc and rnnt by @yzhang123 :: PR: #3242
Add aishell and fisher dataset processing scripts for ASR by @jbalam-nv :: PR: #3203
Better default for RNNT greedy decoding by @titu1994 :: PR: #3332
Add uniform ASR evaluation script for all models by @titu1994 :: PR: #3334
CTC Segmentation-Citrinet support by @ekmb :: PR: #3279
Updates on ASR with diarization util files by @tango4j :: PR: #3359
Asr fr by @tbartley94 :: PR: #3404
Refactor ASR Examples Directory by @titu1994 :: PR: #3392
Asr patches by @titu1994 :: PR: #3443
Properly support -1 for labels in ctc char models by @titu1994 :: PR: #3487

TTS

MixerTTS, MixerTTSDataset and small updates in tts tokenizers by @Oktai15 :: PR: #2859
ONNX and TorchScript support for Mixer-TTS by @Oktai15 :: PR: #3082
Update name of files to one style in TTS folder by @Oktai15 :: PR: #3189
Update TTS Dataset, FastPitch with TTS dataset and small improvements in HiFiGAN by @Oktai15 :: PR: #3205
Add Beta-binomial Interpolator to TTSDataset by @Oktai15 :: PR: #3230
Normalizer to TTS models, TTS tokenizer updates, AxisKind updates by @Oktai15 :: PR: #3271
Update Mixer-TTS, FastPitch and TTSDataset by @Oktai15 :: PR: #3366
Minor Updates to TTS Finetuning by @blisc :: PR: #3455

NLP / NMT

NMT timing and tokenizer stats utils by @michalivne :: PR: #3004
Add offsets calculation to MegatronGPTModel.complete method by @dimapihtar :: PR: #3117
NMT checkpoint averaging by @michalivne :: PR: #3096
NMT validation examples with inputs by @michalivne :: PR: #3194
Improve data pipeline for punctuation capitalization model and make other useful changes by @PeganovAnton :: PR: #3159
Reduce test time of punctuation and capitalization model by @PeganovAnton :: PR: #3286
NLP text augmentation by @michalivne :: PR: #3291
Adding Megatron NeMo Bert support by @yidong72 :: PR: #3303
Added Script to convert Megatron LM to .nemo file by @yidong72 :: PR: #3371
Support Changing Number of Tensor Parallel Partitions for Megatron by @aklife97 :: PR: #3365
Megatron AMP fix for scheduler step counter by @titu1994 :: PR: #3293
T5 Pre-training in NeMo using Megatron by @MaximumEntropy :: PR: #3036
NMT MIM mean variance fix by @michalivne :: PR: #3385
NMT Shared Embeddings Weights by @michalivne :: PR: #3340
Make saving .nemo during on_train_end configurable by @ericharper :: PR: #3427
Byte-level Multilingual NMT by @aklife97 :: PR: #3368
BioMegatron token classification tutorial fix to be compatible with current Megatron BERT by @yidong72 :: PR: #3435
NMT documentation for bottleneck architecture by @michalivne :: PR: #3464
(1) O2-style mixed precision recipe, (2) Persistent layer-norm, (3) Grad scale hysteresis, (4) gradient_as_bucket_view by @erhoo82 :: PR: #3259

Text Normalization / Inverse Text Normalization

Tn clean upsample by @yzhang123 :: PR: #3024
Tn add nn wfst and doc by @yzhang123 :: PR: #3135
Update english tn ckpt by @yzhang123 :: PR: #3143
WFST_tutorial for ITN development by @tbartley94 :: PR: #3128
German TN wfst by @yzhang123 :: PR: #3174
Add ITN Vietnamese by @binh234 :: PR: #3217
WFST TN updates by @ekmb :: PR: #3235
Itn german refactor by @yzhang123 :: PR: #3262
Tn german deterministic by @yzhang123 :: PR: #3308
TN updates by @ekmb :: PR: #3285
Added double digits to EN ITN by @yzhang123 :: PR: #3321
TN_non_deterministic optimized by @ekmb :: PR: #3343
Missing init for TN German by @ekmb :: PR: #3355
Ru TN by @ekmb :: PR: #3390
Update ContextNet models trained on more datasets by @titu1994 :: PR: #3440

NeMo Tools

CTC Segmentation-Citrinet support by @ekmb :: PR: #3279
Updated NumPy SDE requirement by @vsl9 :: PR: #3442

Export

ONNX and TorchScript support for Mixer-TTS by @Oktai15 :: PR: #3082
CTC Conformer fixes for ONNX/TS export by @borisfom :: PR: #3072

Documentation

Merge r1.5.0 bugfixes and doc updates to main by @ericharper :: PR: #3133
Tn add nn wfst and doc by @yzhang123 :: PR: #3135
Add apex into by @PeganovAnton :: PR: #3214
Final merge r1.5.0 bugfixes and doc updates to main by @ericharper :: PR: #3232
Nemo container docker building instruction - merge to main by @fayejf :: PR: #3236
Doc link fixes by @nithinraok :: PR: #3264
French ASR Doc updates by @tbartley94 :: PR: #3322
german asr doc page update by @yzhang123 :: PR: #3325
update docs and replace speakernet with titanet in tutorials by @nithinraok :: PR: #3405
Asr fr by @tbartley94 :: PR: #3404
Update Speech Classification - VAD doc by @fayejf :: PR: #3430
Update speaker diarization docs by @tango4j :: PR: #3419
NMT documentation for bottleneck architecture by @michalivne :: PR: #3464
Add verification helper function and update docs by @nithinraok :: PR: #3514
Prompt tuning documentation by @vadam5 :: PR: #3541

Bugfixes

Fixed wrong tgt_length for timing by @michalivne :: PR: #3050
Update nltk version with a CVE fix by @thomasdhc :: PR: #3054
Fix README by @ericharper :: PR: #3070
Transformer Decoder: Fix swapped input name issue by @aklife97 :: PR: #3066
Fixes bugs in collect_tokenizer_dataset_stats.py by @michalivne :: PR: #3060
Attribute is not working in . by @PeganovAnton :: PR: #3099
Merge r1.5.0 bugfixes and doc updates to main by @ericharper :: PR: #3133
A quick fix for issue #3094 index out-of-bound when truncating long text to max_seq_length by @bugface :: PR: #3131
Fixed two typos by @bene-ges :: PR: #3157
Merge r1.5.0 bugfixes to main by @ericharper :: PR: #3173
LJSpeech alignment scripts fixed for latest MFA by @m-toman :: PR: #3177
Add apex into by @PeganovAnton :: PR: #3214
Patch omegaconf for cfg by @fayejf :: PR: #3224
Final merge r1.5.0 bugfixes and doc updates to main by @ericharper :: PR: #3232
CTC Conformer fixes for ONNX/TS export by @borisfom :: PR: #3072
Fix Masked SE for Citrinets + export Limited Context Citrinet by @titu1994 :: PR: #3216
Fix text length type in TTSDataset for beta_binomial_interpolator by @Oktai15 :: PR: #3233
Fix cast type in _se_pool_step_script related functions by @Oktai15 :: PR: #3239
Doc link fixes by @nithinraok :: PR: #3264
Escape chars fix by @ekmb :: PR: #3253
Fix asr output - eval mode by @nithinraok :: PR: #3274
Remove ArrayLike because it is not supported in numpy 1.18 by @PeganovAnton :: PR: #3282
Fix megatron_gpt_ckpt_to_nemo.py with torch distributed by @yaoyu-33 :: PR: #3278
Reduce test time of punctuation and capitalization model by @PeganovAnton :: PR: #3286
Tn en money fix by @yzhang123 :: PR: #3290
Fixing the bucketing_batch_size bug. by @VahidooX :: PR: #3294
Adaptiv fixed positional embeddings by @michalivne :: PR: #3263
Fix specaugment time start for numba kernel by @titu1994 :: PR: #3299
Fix for Stalled ASR training/eval on Pytorch 1.10+ (multigpu/multinode) by @titu1994 :: PR: #3304
Fix bucketing list bug. by @VahidooX :: PR: #3315
Fix MixerTTS types and dimensions by @Oktai15 :: PR: #3330
Fix German and Vietnamese grammar by @yzhang123 :: PR: #3331
Fix readme to show cmd by @yzhang123 :: PR: #3345
Fix speaker label models training convergence by @nithinraok :: PR: #3354
Tqdm get datasets by @bmwshop :: PR: #3358
Fixed future masking in cross attention of Perceiver by @michalivne :: PR: #3314
Fixed the bug of fixed-size bucketing. by @VahidooX :: PR: #3364
Fix minor problems in punctuation and capitalization model by @PeganovAnton :: PR: #3376
Megatron AMP fix for scheduler step counter by @titu1994 :: PR: #3293
fixed the bug of bucketing when fixed-size batch is used. by @VahidooX :: PR: #3399
TalkNet Fix by @stasbel :: PR: #3092
Fix linear annealing not annealing lr to min_lr by @MaximumEntropy :: PR: #3400
Resume training on SLURM multi-node multi-gpu by @itzsimpl :: PR: #3374
Fix running token classification in multinode setting by @PeganovAnton :: PR: #3413
Fix order of lang checking to ignore input langs by @MaximumEntropy :: PR: #3417
NMT MIM mean variance fix by @michalivne :: PR: #3385
Fix bug for missing variable by @MaximumEntropy :: PR: #3437
Asr patches by @titu1994 :: PR: #3443
Prompt tuning loss mask fix by @vadam5 :: PR: #3438
BioMegatron token classification tutorial fix to be compatible with current Megatron BERT by @yidong72 :: PR: #3435
Fix hysteresis loading by @MaximumEntropy :: PR: #3460
Fix the tutorial notebooks bug by @yidong72 :: PR: #3465
Fix the errors/bugs in ASR with diarization tutorial by @tango4j :: PR: #3461
WFST Punct post fix + punct tutorial fixes by @ekmb :: PR: #3469
Process correctly label ids dataset parameter + standardize type of label ids model attribute + minor changes (error messages, typing) by @PeganovAnton :: PR: #3471
file name fix - Segmentation tutorial by @ekmb :: PR: #3474
Patch fix for the multiple last checkpoints issue by @nithinraok :: PR: #3468
Fix bug with arguments for TalkNet's preprocessor by @Oktai15 :: PR: #3481
Fix description by @PeganovAnton :: PR: #3482
typo fix in diarization notebooks by @nithinraok :: PR: #3480
Fix check...

Contributors

bmwshop, ryanleary, and 34 other contributors

04 Dec 00:00

blisc

v1.5.1

01419c3

NVIDIA Neural Modules 1.5.1

Features

Minor updates to expose speaker id, pitch, and duration on export of FastPitch #3192, #3207

Known Issues

Training of speaker models converges very slowly due to a bug (fixed in main: #3354).
ASR training does not reach adequate WER due to a bug in Numba Spec Augment (fixed in main: #3299). For details, refer to #3288 (comment). As a temporary workaround, disable Numba Spec Augment by setting the flag at https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/asr/modules/audio_preprocessing.py#L471 to False in the SpecAugment section of the YAML config. The fix will be part of 1.6.0.
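
The workaround above amounts to flipping one boolean in the SpecAugment section of the model config. The following is a minimal sketch using plain Python dictionaries as a stand-in for the loaded YAML. It rests on two assumptions that are not stated in these notes and should be checked against the linked source line: that the flag is named `use_numba_spec_augment`, and that the SpecAugment settings live under `model.spec_augment`. The helper `disable_numba_spec_augment` is hypothetical.

```python
import copy


def disable_numba_spec_augment(cfg: dict) -> dict:
    """Return a copy of cfg with the Numba SpecAugment kernel switched off."""
    patched = copy.deepcopy(cfg)
    # Assumed layout: the SpecAugment settings sit under model.spec_augment.
    spec_augment = patched.setdefault("model", {}).setdefault("spec_augment", {})
    # Assumed flag name; confirm it against the linked audio_preprocessing.py line.
    spec_augment["use_numba_spec_augment"] = False
    return patched


# Stand-in for a config loaded from YAML (values are illustrative only).
cfg = {"model": {"spec_augment": {"freq_masks": 2, "time_masks": 5}}}
patched = disable_numba_spec_augment(cfg)
print(patched["model"]["spec_augment"])
```

Working on a deep copy leaves the original config untouched, which makes it easy to compare runs with and without the workaround.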

20 Nov 01:55

ericharper

v1.5.0

e2d11bb

NVIDIA Neural Modules 1.5.0

Features

Megatron GPT pre-training with tensor model parallelism #2975
NMT encoder and decoder with different hidden size #2856
Logging timing of train/val/test steps #2936
Logging NMT encoder and decoder timing #2956
Logging timing per sentence length and tokenized text statistics #3004
Upgrade to PyTorch Lightning 1.5.0, bfloat support #2975
French Inverse Text Normalization #2921
Bucketing of tarred datasets for ASR models #2999
ASR with diarization #3007
Adding parallel transcribe for ASR models - supports multi-gpu/multi-node #3017

Documentation Updates

RNNT

Contributors

@ericharper @michalivne @MaximumEntropy @VahidooX @titu1994 @blisc @okuchaiev @tango4j @erastorgueva-nv @fayejf @vadam5 @ekmb @yaoyu-33 @nithinraok @erhoo82 @tbartley94 @PeganovAnton @madhukarkm @yzhang123
(Please let us know if you have contributed to this release and we have missed you here.)

Contributors

titu1994, yzhang123, and 17 other contributors

02 Oct 00:49

ericharper

v1.4.0

0958184

NVIDIA Neural Modules 1.4.0

Features

Improved speaker clustering #2729
Upgrade to NVIDIA PyTorch 21.08 container #2799
RNNT mAES beam search support #2802
Transfer learning for new speakers #2684
Simplify speaker scripts #2777
Perceiver-encoder architecture #2737
Relative paths in tarred datasets #2776
Torch only TTS package #2643
Inverse text normalization for Spanish #2489

Tutorial Notebooks

Duration and pitch control for TTS #2700

Bug fixes

Fixed max delta generation #2727
Waveglow export #2671, #2699

Contributors

@tango4j @titu1994 @paarthneekhara @nithinraok @michalivne @erastorgueva-nv @borisfom @blisc
(some contributors may not be listed explicitly)

Contributors

titu1994, blisc, and 6 other contributors
