Fp16 support for Conformer #4571
Conversation
nemo/utils/precision_utils.py (outdated)
@@ -0,0 +1,40 @@
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
Should be 2022.
nemo/utils/precision_utils.py (outdated)
def get_current_precision():
    """ Determine current precision set by the trainer """
    if torch.cuda.is_available():
Very poor check. Isn't there some other way? It also depends on when it is called: prior to init, after model init, after PTL init, and during PTL training all give different results here.
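One call-time-robust alternative (an illustrative sketch with a hypothetical function name, not the PR's final code) is to avoid inspecting the trainer or device entirely and instead query the autocast state at the point of the forward pass, which is well-defined regardless of init order:

```python
import torch


def current_compute_dtype() -> torch.dtype:
    """Return the dtype autocast would use right now, else float32.

    Unlike checking torch.cuda.is_available(), this gives a consistent
    answer whether it is called before model init, after PTL init, or
    inside a training step: it only depends on whether the caller is
    currently inside an autocast region.
    """
    if torch.is_autocast_enabled():
        # Under autocast this is fp16 by default, or bf16 if configured.
        return torch.get_autocast_gpu_dtype()
    return torch.float32
```

Outside any autocast context the sketch simply reports `torch.float32`, so callers see the true compute dtype rather than a guess based on hardware availability.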
Thanks! This looks great!
Does this resolve the training stability issues?
Sorry, need a little more time to double-check for issues; will provide an update shortly.
This pull request introduces 1 alert when merging ada77bd into 1d25c90 - view on LGTM.com
This pull request introduces 1 alert when merging a1400e6 into 9b07037 - view on LGTM.com
This pull request introduces 2 alerts when merging 0181ce5 into 2a5516c - view on LGTM.com
This pull request fixes 1 alert when merging fc06a83 into 2a5516c - view on LGTM.com
Signed-off-by: Dima Rekesh <[email protected]>
This pull request fixes 1 alert when merging c2ae405 into 4fef5dd - view on LGTM.com
This pull request fixes 1 alert when merging eb0571f into 4fef5dd - view on LGTM.com
This pull request fixes 1 alert when merging a6c2336 into 1be2bda - view on LGTM.com
LGTM!
* adding auto-select best precision for mhsa
* cleanup
* moving mhsa32 check into mhsa
* switching to torch.cuda.is_bf16_supported()
* now using torch.is_autocast_enabled()
* added to non rel mhsa
* only forcing 32bit subsampling if using bf16
* removing unused imports
* moving contexts to utils
* formatting
* naming

Signed-off-by: Dima Rekesh <[email protected]>
Co-authored-by: Dima Rekesh <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
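The "auto-select best precision" idea from the commit list above can be sketched as follows (a minimal illustration with a hypothetical helper name, assuming the commit's stated use of torch.cuda.is_bf16_supported(); not the exact NeMo code):

```python
import torch


def mhsa_safe_dtype() -> torch.dtype:
    """Pick the dtype to force for MHSA when the model trains in fp16.

    fp16 attention logits can overflow, so the numerically sensitive
    part is computed in bfloat16 when the GPU supports it (Ampere and
    newer), falling back to float32 otherwise.
    """
    if torch.cuda.is_available() and torch.cuda.is_bf16_supported():
        return torch.bfloat16
    return torch.float32
```

bfloat16 keeps float32's exponent range (so no overflow) while staying cheap on supporting hardware, which is why it is preferred over a full float32 upcast when available.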
What does this PR do?
Enables support for fp16 training by forcing MHSA to float32, or to bfloat16 where available.
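The core pattern described above can be sketched like this (an illustrative example with a hypothetical function name, not the PR's exact implementation): when the surrounding model runs under fp16 autocast, the attention matmul temporarily leaves the autocast region and upcasts its inputs.

```python
import torch


def attention_scores_fp32(q, k):
    """Compute attention scores outside fp16 autocast to avoid overflow.

    q, k: [batch, heads, seq, head_dim] tensors, possibly fp16 under
    autocast. The score matmul is the overflow-prone step, so it is
    forced to float32 while the rest of the network stays in fp16.
    """
    if torch.is_autocast_enabled():
        # Leave the autocast region and upcast inputs for the matmul.
        with torch.cuda.amp.autocast(enabled=False):
            return torch.matmul(q.float(), k.float().transpose(-2, -1))
    return torch.matmul(q, k.transpose(-2, -1))
```

Only the sensitive operation pays the float32 cost; callers do not need to know whether autocast is active.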
Collection: ASR
Changelog
Usage
trainer.precision=16
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
The contributor guidelines list specific people who can review PRs to various areas.
Additional Information