
NMT Perceiver Encoder #2621

Merged
merged 605 commits into from
Aug 11, 2021

Conversation

michalivne (Collaborator) commented Aug 5, 2021

This PR separates the learning framework (i.e., loss) from the architecture, and adds a new Perceiver encoder bottleneck architecture.

Summary

A few major changes are introduced here:

  1. Separating the learning framework (i.e., loss, or model type) from the architecture of the encoder / decoder.
  2. The bottleneck encoder transformer can now be constructed using the get_nemo_transformer method (see the sketch below).
  3. Supported learning frameworks: nll (i.e., cross-entropy / auto-encoder), mim, and vae.
  4. Supported encoder architectures: seq2seq, bridge, and perceiver.

The changes make it easy to select the architecture via configuration. Specifically, using a shared base YAML configuration and overriding flags during experimentation allows easy comparison of bottleneck and non-bottleneck architectures. Separating the loss from the architecture also leads to cleaner code that is easier to understand.
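
As an illustration of item 2 above, here is a minimal sketch of building a bottleneck encoder from a config dict via get_nemo_transformer. It is a sketch only: the import path and keyword names (config_dict, encoder) are assumptions based on this description and may differ from the actual signature.

# Hypothetical sketch: the import path and keyword arguments below are assumptions.
from nemo.collections.nlp.modules.common.transformer import get_nemo_transformer

encoder_cfg = {
    "arch": "perceiver",            # seq2seq, bridge, or perceiver
    "hidden_steps": 32,             # fixed number of hidden steps
    "hidden_blocks": 2,             # repeated cross-attention + self-attention blocks
    "hidden_init_method": "params", # see the parameter descriptions below
    "hidden_size": 512,
    "num_layers": 6,
    "num_attention_heads": 8,
}

# encoder=True requests an encoder module rather than a decoder.
encoder = get_nemo_transformer(config_dict=encoder_cfg, encoder=True)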

YAML Configuration

The following options in the YAML config control the learning framework and architecture:

model:
  ...
  model_type: 'nll' # learning (i.e., loss) type: nll (i.e., cross-entropy/auto-encoder), mim, vae (see description above)
  min_logv: -6 # minimal allowed log variance for mim
  latent_size: -1 # dimension of latent (projected from hidden); -1 takes the value of hidden size
  non_recon_warmup_batches: 200000 # warm-up steps for mim and vae losses
  recon_per_token: true # when false, reconstruction is computed per sample, not per token
...
  encoder:
    library: nemo
    ...
    arch: seq2seq # seq2seq, bridge, perceiver (see description above)
    hidden_steps: 32 # fixed number of hidden steps
    hidden_blocks: 1 # number of repeat blocks (see classes for description)
    hidden_init_method: default # see classes for available values

  decoder:
    library: nemo
    ...
    arch: seq2seq # currently only seq2seq is supported
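
For a quick look at how the shared base config composes with command-line overrides, here is a minimal OmegaConf sketch (an illustration, not code from this PR; it assumes the base YAML is the aayn_bottleneck config used in the example further below):

# Minimal sketch: compose the base YAML with dot-list overrides and inspect the result.
from omegaconf import OmegaConf

base = OmegaConf.load("examples/nlp/machine_translation/conf/aayn_bottleneck.yaml")
overrides = OmegaConf.from_dotlist([
    "model.model_type=mim",
    "model.encoder.arch=perceiver",
    "model.encoder.hidden_steps=32",
    "model.encoder.hidden_blocks=2",
    "model.encoder.hidden_init_method=params",
])
cfg = OmegaConf.merge(base, overrides)

print(cfg.model.encoder.arch)          # perceiver
print(cfg.model.encoder.hidden_steps)  # 32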

Detailed description of config parameters:

  • model.encoder.arch=seq2seq
    • model.encoder.hidden_steps is ignored
    • model.encoder.hidden_blocks is ignored
    • model.encoder.hidden_init_method is ignored
  • model.encoder.arch=bridge
    • model.encoder.hidden_steps: input is projected to the specified fixed steps
    • model.encoder.hidden_blocks: number of encoder blocks to repeat after attention bridge projection
    • model.encoder.hidden_init_method:
      • enc_shared (default) - apply encoder to inputs, then attention bridge, followed by hidden_blocks number of the same encoder (pre and post encoders share parameters)
      • identity - apply attention bridge to inputs, followed by hidden_blocks number of the same encoder
      • enc - similar to enc_shared but the initial encoder has independent parameters
  • model.encoder.arch=perceiver (see the sketch after this list)
    • model.encoder.hidden_steps: input is projected to the specified fixed steps
    • model.encoder.hidden_blocks: number of cross-attention + self-attention blocks to repeat after initialization block (all self-attention and cross-attention share parameters)
    • model.encoder.hidden_init_method:
      • params (default) - hidden state is initialized with learned parameters followed by cross-attention with independent parameters
      • bridge - hidden state is initialized with an attention bridge
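
To make the perceiver data flow concrete, here is a schematic PyTorch sketch of the pattern described above: a learned latent state of hidden_steps vectors, an initialization cross-attention block with independent parameters, and hidden_blocks repetitions of shared cross-attention + self-attention. It is a conceptual illustration only, not the PerceiverEncoder implementation added in this PR (layer norms, masking, and feed-forward layers are omitted).

# Conceptual sketch of a perceiver-style bottleneck encoder (not NeMo's PerceiverEncoder).
import torch
import torch.nn as nn

class PerceiverBottleneckSketch(nn.Module):
    def __init__(self, hidden_size=512, hidden_steps=32, hidden_blocks=2, num_heads=8):
        super().__init__()
        # hidden_init_method=params: the latent hidden state starts from learned parameters.
        self.latent = nn.Parameter(torch.randn(hidden_steps, hidden_size))
        # Initialization block with independent parameters.
        self.init_cross_attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        # One shared cross-attention + self-attention pair, repeated hidden_blocks times.
        self.cross_attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.self_attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.hidden_blocks = hidden_blocks

    def forward(self, enc_states):  # enc_states: (batch, src_len, hidden_size)
        batch = enc_states.size(0)
        hidden = self.latent.unsqueeze(0).expand(batch, -1, -1)
        # Initialization: cross-attend the learned latents to the input once.
        hidden, _ = self.init_cross_attn(hidden, enc_states, enc_states)
        # Repeated blocks share parameters across iterations.
        for _ in range(self.hidden_blocks):
            hidden, _ = self.cross_attn(hidden, enc_states, enc_states)
            hidden, _ = self.self_attn(hidden, hidden, hidden)
        return hidden  # (batch, hidden_steps, hidden_size), independent of src_len

# A batch of 4 source sequences of length 57 is compressed to 32 hidden steps.
out = PerceiverBottleneckSketch()(torch.randn(4, 57, 512))
print(out.shape)  # torch.Size([4, 32, 512])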

Usage Example

NOTE: enc_dec_nmt-bottleneck.py must be used; enc_dec_nmt.py does not support the required configuration and will likely fail.

python enc_dec_nmt-bottleneck.py \
      --config-path=conf \
      --config-name=aayn_bottleneck \
      ...
      model.model_type=nll \
      model.non_recon_warmup_batches=7500 \
      model.encoder.arch=perceiver \
      model.encoder.hidden_steps=32 \
      model.encoder.hidden_blocks=2 \
      model.encoder.hidden_init_method=params \
      ...

Additional Info

  • Classes TransformerBottleneckEncoderNM, TransformerBottleneckDecoderNM, and corresponding config classes NeMoTransformerBottleneckConfig, NeMoTransformerBottleneckEncoderConfig, NeMoTransformerBottleneckDecoderConfig were added to nemo/collections/nlp/modules/common/transformer/transformer_bottleneck.py
  • Class PerceiverEncoder was added to nemo/collections/nlp/modules/common/transformer/perceiver_encoders.py
  • Class BridgeEncoder was added to nemo/collections/nlp/modules/common/transformer/bridge_encoders.py
  • Classes TransformerBottleneckEncoderNM and TransformerEncoderNM check that src/tgt token ids fall within the tokenizer's valid range (see the sketch below).
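
The id-range check in the last bullet amounts to validating that every source/target token id is a legal index into the tokenizer vocabulary. A generic sketch of such a check (not the actual validation code in those classes):

# Generic illustration of validating token ids against the tokenizer's vocabulary size.
import torch

def check_ids_in_vocab(ids: torch.Tensor, vocab_size: int, name: str = "input_ids") -> None:
    if ids.numel() == 0:
        return
    if int(ids.min()) < 0 or int(ids.max()) >= vocab_size:
        raise ValueError(
            f"{name} contains ids outside the valid range [0, {vocab_size - 1}]: "
            f"min={int(ids.min())}, max={int(ids.max())}"
        )

src_ids = torch.tensor([[5, 17, 3], [2, 9, 4]])
check_ids_in_vocab(src_ids, vocab_size=32000, name="src_ids")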

okuchaiev and others added 30 commits June 8, 2021 16:49
Signed-off-by: Oleksii Kuchaiev <[email protected]>

Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: smajumdar <[email protected]>

Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Oleksii Kuchaiev <[email protected]>

Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Oleksii Kuchaiev <[email protected]>

Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Oleksii Kuchaiev <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
* fixed branch in IR tutorial

Signed-off-by: AlexGrinch <[email protected]>

* ddp translate GPU allocation fix

Signed-off-by: AlexGrinch <[email protected]>

* map_location instead of set_device

Signed-off-by: AlexGrinch <[email protected]>

Co-authored-by: Oleksii Kuchaiev <[email protected]>
Co-authored-by: Sandeep Subramanian <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
* fixed branch in IR tutorial

Signed-off-by: AlexGrinch <[email protected]>

* shallow fusion init commit

Signed-off-by: AlexGrinch <[email protected]>

* debug info removed

Signed-off-by: AlexGrinch <[email protected]>

Co-authored-by: Oleksii Kuchaiev <[email protected]>
Co-authored-by: Sandeep Subramanian <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
* upper bound hydra

Signed-off-by: ericharper <[email protected]>

* upper bound hydra

Signed-off-by: ericharper <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Oleksii Kuchaiev <[email protected]>

Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Oleksii Kuchaiev <[email protected]>

Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Oleksii Kuchaiev <[email protected]>

Signed-off-by: Micha Livne <[email protected]>
…#2320)

* add jenkins test, refactoring

Signed-off-by: ekmb <[email protected]>

* update test

Signed-off-by: ekmb <[email protected]>

* fix new test

Signed-off-by: ekmb <[email protected]>

* add serial to the default normalizer, add tests

Signed-off-by: ekmb <[email protected]>

* manifest test added

Signed-off-by: ekmb <[email protected]>

* expose more params, new test cases

Signed-off-by: ekmb <[email protected]>

* fix jenkins, serial clean, exclude range from cardinal

Signed-off-by: ekmb <[email protected]>

* jenkins

Signed-off-by: ekmb <[email protected]>

* jenkins dollar sign format

Signed-off-by: ekmb <[email protected]>

* jenkins

Signed-off-by: ekmb <[email protected]>

* jenkins dollar sign format

Signed-off-by: ekmb <[email protected]>

* addressed review comments

Signed-off-by: ekmb <[email protected]>

* fix decimal in measure

Signed-off-by: ekmb <[email protected]>

* move serial in cardinal

Signed-off-by: ekmb <[email protected]>

* sh tests init

Signed-off-by: ekmb <[email protected]>

* sparrowhawk container tests support added

Signed-off-by: ekmb <[email protected]>

* add post process to normalize.py, update tests

Signed-off-by: ekmb <[email protected]>

* remove duplication

Signed-off-by: ekmb <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: smajumdar <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
* Update ranges

Signed-off-by: smajumdar <[email protected]>

* Updates for Hydra and OmegaConf updates

Signed-off-by: smajumdar <[email protected]>

* Style fixes

Signed-off-by: smajumdar <[email protected]>

* Correct tests and revert patch for model utils

Signed-off-by: smajumdar <[email protected]>

* Correct docstring

Signed-off-by: smajumdar <[email protected]>

* Revert unnecessary change

Signed-off-by: smajumdar <[email protected]>

* Revert unnecessary change

Signed-off-by: smajumdar <[email protected]>

* Guard scheduler for None

Signed-off-by: smajumdar <[email protected]>

* default to 0.0 if bpe_dropout is None

Signed-off-by: ericharper <[email protected]>

* Correctly log class that was restored

Signed-off-by: smajumdar <[email protected]>

* Root patch *bpe_dropout

Signed-off-by: smajumdar <[email protected]>

Co-authored-by: ericharper <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Jason <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: ericharper <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
* Update container version

Signed-off-by: smajumdar <[email protected]>

* Temporarily change export format of waveglow

Signed-off-by: smajumdar <[email protected]>

* Add conda update for numba

Signed-off-by: smajumdar <[email protected]>

* Update numba compat via global flag for strictness level `--relax_numba_compat`, remove pytorchlightning.metrics, refactor out numba utils to core, update tests

Signed-off-by: smajumdar <[email protected]>

* Correct order of numba minimum verion, remove wrong flag from test

Signed-off-by: smajumdar <[email protected]>

* Double test of cuda numba

Signed-off-by: smajumdar <[email protected]>

* Double test of cuda numba

Signed-off-by: smajumdar <[email protected]>

* Enable RNNT tests

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
* upper cased date support

Signed-off-by: ekmb <[email protected]>

* update whitelist, change roman weights

Signed-off-by: ekmb <[email protected]>

* docstrings, space fix, init file

Signed-off-by: ekmb <[email protected]>

* lgtm

Signed-off-by: ekmb <[email protected]>

* fraction with measure class

Signed-off-by: ekmb <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
* Add ASR CTC Language finetuning notebook

Signed-off-by: smajumdar <[email protected]>

* Add to documentation

Signed-off-by: smajumdar <[email protected]>

* Improve documentation

Signed-off-by: smajumdar <[email protected]>

* Correct name of the dataset

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: smajumdar <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
* sgdqa update data directories for testing

Signed-off-by: Yang Zhang <[email protected]>

* fix syntax

Signed-off-by: Yang Zhang <[email protected]>

* check if data dir exists

Signed-off-by: Yang Zhang <[email protected]>

* fix

Signed-off-by: Yang Zhang <[email protected]>

* adding pretrained model

Signed-off-by: Yang Zhang <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
* Added export document

Signed-off-by: Boris Fomitchev <[email protected]>

* Addressed review comments

Signed-off-by: Boris Fomitchev <[email protected]>

Co-authored-by: Eric Harper <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
* Update model card info

Signed-off-by: smajumdar <[email protected]>

* Cleanup Docs

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
* add megatron encoder

Signed-off-by: ericharper <[email protected]>

* added megatron to get_nmt_tokenizer

Signed-off-by: ericharper <[email protected]>

* add vocab_size and hidden_size to megatron bert

Signed-off-by: ericharper <[email protected]>

* add megatron encoder module

Signed-off-by: ericharper <[email protected]>

* fixed horrible typo

Signed-off-by: ericharper <[email protected]>

* fix typo and add default

Signed-off-by: ericharper <[email protected]>

* updating nlp overrides for mp nmt

Signed-off-by: ericharper <[email protected]>

* move some logic back to nlpmodel from overrides

Signed-off-by: ericharper <[email protected]>

* add checkpoint_file property

Signed-off-by: ericharper <[email protected]>

* fix property

Signed-off-by: ericharper <[email protected]>

* num_tokentypes=0

Signed-off-by: ericharper <[email protected]>

* typo

Signed-off-by: ericharper <[email protected]>

* typo

Signed-off-by: ericharper <[email protected]>

* find_unused_parameters=True

Signed-off-by: ericharper <[email protected]>

* typo

Signed-off-by: ericharper <[email protected]>

* style

Signed-off-by: ericharper <[email protected]>

* get instead of pop

Signed-off-by: ericharper <[email protected]>

* remove token type ids from megatron input example

Signed-off-by: ericharper <[email protected]>

* pop vocab_size

Signed-off-by: ericharper <[email protected]>

* fix checkpointing for model parallel

Signed-off-by: ericharper <[email protected]>

* fix bug in non model parallel

Signed-off-by: ericharper <[email protected]>

* convert cfg.trainer to dict

Signed-off-by: ericharper <[email protected]>

* make num_tokentypes configurable for nmt

Signed-off-by: ericharper <[email protected]>

* update checkpoint_file when using named megatron model in nemo

Signed-off-by: ericharper <[email protected]>

* make vocab_file configurable

Signed-off-by: ericharper <[email protected]>

* dataclass can't have mutable default

Signed-off-by: ericharper <[email protected]>

* style

Signed-off-by: ericharper <[email protected]>

* unused imports

Signed-off-by: ericharper <[email protected]>

* revert input example

Signed-off-by: ericharper <[email protected]>

* check that checkpoint version is not None

Signed-off-by: ericharper <[email protected]>

* add mp jenkins test

Signed-off-by: ericharper <[email protected]>

* update docstring

Signed-off-by: ericharper <[email protected]>

* add docs for pretrained encoders with nemo nmt

Signed-off-by: ericharper <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
* Added a notebook with best practices for telephony speech

* Added datasets detaiils

* Added training recommendations

* Emptied out cells with results

* Added tutorial to docs

Signed-off-by: jbalam <[email protected]>

* Addressed review comments

Signed-off-by: jbalam <[email protected]>

* Added a line to note original sampling rate of an4

Signed-off-by: jbalam <[email protected]>

* Made changes suggested in review

Signed-off-by: jbalam <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

lgtm-com bot commented Aug 11, 2021

This pull request introduces 1 alert when merging 53cce66 into 4abe5d5 - view on LGTM.com

new alerts:

  • 1 for Unreachable code

ericharper (Collaborator) left a comment:

LGTM. Great PR! Thanks for all of the changes.

MaximumEntropy (Contributor) left a comment:

LGTM. Thanks

examples/nlp/machine_translation/conf/aayn_bottleneck.yaml (review comment: outdated, resolved)
MaximumEntropy merged commit 564c67d into NVIDIA:main Aug 11, 2021
blisc added a commit to blisc/NeMo that referenced this pull request Aug 12, 2021
* upper bound for webdataset

Signed-off-by: Oleksii Kuchaiev <[email protected]>

Signed-off-by: Micha Livne <[email protected]>

* Correct Dockerfile

Signed-off-by: smajumdar <[email protected]>

Signed-off-by: Micha Livne <[email protected]>

* update readmes

Signed-off-by: Oleksii Kuchaiev <[email protected]>

Signed-off-by: Micha Livne <[email protected]>

* update README (NVIDIA#2332)

Signed-off-by: Oleksii Kuchaiev <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* ddp translate GPU allocation fix (NVIDIA#2312)

* fixed branch in IR tutorial

Signed-off-by: AlexGrinch <[email protected]>

* ddp translate GPU allocation fix

Signed-off-by: AlexGrinch <[email protected]>

* map_location instead of set_device

Signed-off-by: AlexGrinch <[email protected]>

Co-authored-by: Oleksii Kuchaiev <[email protected]>
Co-authored-by: Sandeep Subramanian <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Shallow fusion (NVIDIA#2315)

* fixed branch in IR tutorial

Signed-off-by: AlexGrinch <[email protected]>

* shallow fusion init commit

Signed-off-by: AlexGrinch <[email protected]>

* debug info removed

Signed-off-by: AlexGrinch <[email protected]>

Co-authored-by: Oleksii Kuchaiev <[email protected]>
Co-authored-by: Sandeep Subramanian <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* [BUGFIX] Add upper bound to hydra for 1.0.x (NVIDIA#2337)

* upper bound hydra

Signed-off-by: ericharper <[email protected]>

* upper bound hydra

Signed-off-by: ericharper <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* update version number

Signed-off-by: Oleksii Kuchaiev <[email protected]>

Signed-off-by: Micha Livne <[email protected]>

* update package version

Signed-off-by: Oleksii Kuchaiev <[email protected]>

Signed-off-by: Micha Livne <[email protected]>

* sparrowhawk tests + punctuation post processing for pynini TN (NVIDIA#2320)

* add jenkins test, refactoring

Signed-off-by: ekmb <[email protected]>

* update test

Signed-off-by: ekmb <[email protected]>

* fix new test

Signed-off-by: ekmb <[email protected]>

* add serial to the default normalizer, add tests

Signed-off-by: ekmb <[email protected]>

* manifest test added

Signed-off-by: ekmb <[email protected]>

* expose more params, new test cases

Signed-off-by: ekmb <[email protected]>

* fix jenkins, serial clean, exclude range from cardinal

Signed-off-by: ekmb <[email protected]>

* jenkins

Signed-off-by: ekmb <[email protected]>

* jenkins dollar sign format

Signed-off-by: ekmb <[email protected]>

* jenkins

Signed-off-by: ekmb <[email protected]>

* jenkins dollar sign format

Signed-off-by: ekmb <[email protected]>

* addressed review comments

Signed-off-by: ekmb <[email protected]>

* fix decimal in measure

Signed-off-by: ekmb <[email protected]>

* move serial in cardinal

Signed-off-by: ekmb <[email protected]>

* sh tests init

Signed-off-by: ekmb <[email protected]>

* sparrowhawk container tests support added

Signed-off-by: ekmb <[email protected]>

* add post process to normalize.py, update tests

Signed-off-by: ekmb <[email protected]>

* remove duplication

Signed-off-by: ekmb <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Update notebooks to 1.0.2 release (NVIDIA#2338)

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Update ranges for omegaconf and hydra (NVIDIA#2336)

* Update ranges

Signed-off-by: smajumdar <[email protected]>

* Updates for Hydra and OmegaConf updates

Signed-off-by: smajumdar <[email protected]>

* Style fixes

Signed-off-by: smajumdar <[email protected]>

* Correct tests and revert patch for model utils

Signed-off-by: smajumdar <[email protected]>

* Correct docstring

Signed-off-by: smajumdar <[email protected]>

* Revert unnecessary change

Signed-off-by: smajumdar <[email protected]>

* Revert unnecessary change

Signed-off-by: smajumdar <[email protected]>

* Guard scheduler for None

Signed-off-by: smajumdar <[email protected]>

* default to 0.0 if bpe_dropout is None

Signed-off-by: ericharper <[email protected]>

* Correctly log class that was restored

Signed-off-by: smajumdar <[email protected]>

* Root patch *bpe_dropout

Signed-off-by: smajumdar <[email protected]>

Co-authored-by: ericharper <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Update FastPitch Export (NVIDIA#2355)

Signed-off-by: Jason <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* update out_dir to not collide (NVIDIA#2358)

Signed-off-by: ericharper <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Update container version to 21.05 (NVIDIA#2309)

* Update container version

Signed-off-by: smajumdar <[email protected]>

* Temporarily change export format of waveglow

Signed-off-by: smajumdar <[email protected]>

* Add conda update for numba

Signed-off-by: smajumdar <[email protected]>

* Update numba compat via global flag for strictness level `--relax_numba_compat`, remove pytorchlightning.metrics, refactor out numba utils to core, update tests

Signed-off-by: smajumdar <[email protected]>

* Correct order of numba minimum verion, remove wrong flag from test

Signed-off-by: smajumdar <[email protected]>

* Double test of cuda numba

Signed-off-by: smajumdar <[email protected]>

* Double test of cuda numba

Signed-off-by: smajumdar <[email protected]>

* Enable RNNT tests

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Text Normalization Update (NVIDIA#2356)

* upper cased date support

Signed-off-by: ekmb <[email protected]>

* update whitelist, change roman weights

Signed-off-by: ekmb <[email protected]>

* docstrings, space fix, init file

Signed-off-by: ekmb <[email protected]>

* lgtm

Signed-off-by: ekmb <[email protected]>

* fraction with measure class

Signed-off-by: ekmb <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Add ASR CTC tutorial on fine-tuning on another language (NVIDIA#2346)

* Add ASR CTC Language finetuning notebook

Signed-off-by: smajumdar <[email protected]>

* Add to documentation

Signed-off-by: smajumdar <[email protected]>

* Improve documentation

Signed-off-by: smajumdar <[email protected]>

* Correct name of the dataset

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Correct colab link to notebook (NVIDIA#2366)

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* sgdqa update data directories for testing (NVIDIA#2323)

* sgdqa update data directories for testing

Signed-off-by: Yang Zhang <[email protected]>

* fix syntax

Signed-off-by: Yang Zhang <[email protected]>

* check if data dir exists

Signed-off-by: Yang Zhang <[email protected]>

* fix

Signed-off-by: Yang Zhang <[email protected]>

* adding pretrained model

Signed-off-by: Yang Zhang <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Added documentation for export() (NVIDIA#2330)

* Added export document

Signed-off-by: Boris Fomitchev <[email protected]>

* Addressed review comments

Signed-off-by: Boris Fomitchev <[email protected]>

Co-authored-by: Eric Harper <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Update Citrinet model card info (NVIDIA#2369)

* Update model card info

Signed-off-by: smajumdar <[email protected]>

* Cleanup Docs

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* [NMT] Model Parallel Megatron Encoders (NVIDIA#2238)

* add megatron encoder

Signed-off-by: ericharper <[email protected]>

* added megatron to get_nmt_tokenizer

Signed-off-by: ericharper <[email protected]>

* add vocab_size and hidden_size to megatron bert

Signed-off-by: ericharper <[email protected]>

* add megatron encoder module

Signed-off-by: ericharper <[email protected]>

* fixed horrible typo

Signed-off-by: ericharper <[email protected]>

* fix typo and add default

Signed-off-by: ericharper <[email protected]>

* updating nlp overrides for mp nmt

Signed-off-by: ericharper <[email protected]>

* move some logic back to nlpmodel from overrides

Signed-off-by: ericharper <[email protected]>

* add checkpoint_file property

Signed-off-by: ericharper <[email protected]>

* fix property

Signed-off-by: ericharper <[email protected]>

* num_tokentypes=0

Signed-off-by: ericharper <[email protected]>

* typo

Signed-off-by: ericharper <[email protected]>

* typo

Signed-off-by: ericharper <[email protected]>

* find_unused_parameters=True

Signed-off-by: ericharper <[email protected]>

* typo

Signed-off-by: ericharper <[email protected]>

* style

Signed-off-by: ericharper <[email protected]>

* get instead of pop

Signed-off-by: ericharper <[email protected]>

* remove token type ids from megatron input example

Signed-off-by: ericharper <[email protected]>

* pop vocab_size

Signed-off-by: ericharper <[email protected]>

* fix checkpointing for model parallel

Signed-off-by: ericharper <[email protected]>

* fix bug in non model parallel

Signed-off-by: ericharper <[email protected]>

* convert cfg.trainer to dict

Signed-off-by: ericharper <[email protected]>

* make num_tokentypes configurable for nmt

Signed-off-by: ericharper <[email protected]>

* update checkpoint_file when using named megatron model in nemo

Signed-off-by: ericharper <[email protected]>

* make vocab_file configurable

Signed-off-by: ericharper <[email protected]>

* dataclass can't have mutable default

Signed-off-by: ericharper <[email protected]>

* style

Signed-off-by: ericharper <[email protected]>

* unused imports

Signed-off-by: ericharper <[email protected]>

* revert input example

Signed-off-by: ericharper <[email protected]>

* check that checkpoint version is not None

Signed-off-by: ericharper <[email protected]>

* add mp jenkins test

Signed-off-by: ericharper <[email protected]>

* update docstring

Signed-off-by: ericharper <[email protected]>

* add docs for pretrained encoders with nemo nmt

Signed-off-by: ericharper <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Add notebook with recommendations for 8 kHz speech (NVIDIA#2326)

* Added a notebook with best practices for telephony speech

* Added datasets detaiils

* Added training recommendations

* Emptied out cells with results

* Added tutorial to docs

Signed-off-by: jbalam <[email protected]>

* Addressed review comments

Signed-off-by: jbalam <[email protected]>

* Added a line to note original sampling rate of an4

Signed-off-by: jbalam <[email protected]>

* Made changes suggested in review

Signed-off-by: jbalam <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* 1. Working on bottleneck transformers.

Signed-off-by: Micha Livne <[email protected]>

* 1. Working on bottleneck transformers.

* 1. Done cleaning code of bottleneck transformers.
2. Ready to test.

Signed-off-by: Micha Livne <[email protected]>

* 1. Done cleaning code of bottleneck transformers.
2. Ready to test.

* 1. Working on training script.

Signed-off-by: Micha Livne <[email protected]>

* 1. Working on training script.

* 1. Updated config class name.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated config class name.

* 1. Training script is ready to be tested.

Signed-off-by: Micha Livne <[email protected]>

* 1. Training script is ready to be tested.

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

* Add FastEmit support for RNNT Losses (NVIDIA#2374)

* Temp commit

Signed-off-by: smajumdar <[email protected]>

* Initial code for fastemit forward pass

Signed-off-by: smajumdar <[email protected]>

* Correct return reg value

Signed-off-by: smajumdar <[email protected]>

* Initial cpu impl

Signed-off-by: smajumdar <[email protected]>

* Try gpu impl

Signed-off-by: smajumdar <[email protected]>

* Try gpu impl

Signed-off-by: smajumdar <[email protected]>

* Correct few impl

Signed-off-by: smajumdar <[email protected]>

* Update fastemit scaling

Signed-off-by: smajumdar <[email protected]>

* Cleanup fastemit

Signed-off-by: smajumdar <[email protected]>

* Finalize FastEmit regularization PR

Signed-off-by: smajumdar <[email protected]>

* Refactor code to support fastemit regularization

Signed-off-by: smajumdar <[email protected]>

Co-authored-by: Samuel Kriman <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

* 1. Fixed bugs.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed bugs.

* 1. Fixed missing import.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed missing import.

* 1. Fixed support in seq2seq-br.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed support in seq2seq-br.

* 1. Added NLPDDPPlugin.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added NLPDDPPlugin.

* fix bugs in hifigan code (NVIDIA#2392)

Signed-off-by: Oktai Tatanov <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Update setup.py (NVIDIA#2394)

Signed-off-by: Jason <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* 1. Updated to support multi-node training.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added comments.

Signed-off-by: Micha Livne <[email protected]>

* 1. MTBottleneckModel is in its own file mt_enc_dec_bottleneck_model.

Signed-off-by: Micha Livne <[email protected]>

* 1. Switched loss annealing to rely on self.trainer.global_step

Signed-off-by: Micha Livne <[email protected]>

* 1. Added comments regrding the use of return_ortho_loss.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added detailed logging of loss during training (still need to do the same for eval).

Signed-off-by: Micha Livne <[email protected]>

* 1. Testing a fix to import bug.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging wrong import issue.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added logging of results to validation step (no tested yet).

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed missing import.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Testing failing immports.

Signed-off-by: Micha Livne <[email protected]>

* 1. Disabling changes.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Enabled bottleneck architecture.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed identation.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed import statement.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed typo.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed logging of arbitrary values.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed torch lightining logging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added a missing import.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added NLPDDPPlugin.

Signed-off-by: Micha Livne <[email protected]>

* 1. Cleaned style.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated sign of computed loss.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed double import.

Signed-off-by: Micha Livne <[email protected]>

* 1. Moved logging of additional loss terms into MTBottleneckModel class.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated permissions.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added initial perceiver package.

Signed-off-by: Micha Livne <[email protected]>

* 1. Working on encoder.

Signed-off-by: Micha Livne <[email protected]>

* 1. Testing perceiver.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. FInished implementing Perceiver.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated default arch.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Ignoring independant perceiver implementation.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added latent transformer to perceiver

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added TransformerBottleneckDecoderNM.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added TransformerBottleneckEncoderNM.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated bottleneck perceiver.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated MTBottleneckModel.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added BridgeEncoder.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Cleaned code.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated architecture name.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added support in bridge encoder.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added support in hidden_init_method to BridgeEncoder.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Removed unneeded imports.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated comment in YAML

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed style.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated YAML comments.
2. hidden_blocks in bridge relates to post-processing after bridge1. Updated YAML comments.
2. hidden_blocks in bridge relates to post-processing after bridge (instead of hidden_blocks-1).

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Initial cross attention in Perceiver with params init has independant parameters.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated Perciver forward.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated TransformerEncoder to be a component as opposed to a parent class.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated example command.

Signed-off-by: Micha Livne <[email protected]>

* 1. forward nethod in MTBottleneckModel does not compute loss.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed style.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added label smoothing for per-sample loss.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed style.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated recon_only loss to nll.

Signed-off-by: Micha Livne <[email protected]>

* 1. Update yaml doc.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated default config to have 32 hidden steps.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated doc.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed type.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed unreachable code bug.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed wrong sign for reconstruction per sample (instead of per token).

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed style.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated comments.

Signed-off-by: Micha Livne <[email protected]>

Co-authored-by: Oleksii Kuchaiev <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: Oleksii Kuchaiev <[email protected]>
Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]>
Co-authored-by: Sandeep Subramanian <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Jason <[email protected]>
Co-authored-by: Yang Zhang <[email protected]>
Co-authored-by: Boris Fomitchev <[email protected]>
Co-authored-by: Jagadeesh Balam <[email protected]>
Co-authored-by: Micha Livne <[email protected]>
Co-authored-by: Samuel Kriman <[email protected]>
Co-authored-by: Oktai Tatanov <[email protected]>
Signed-off-by: Jason <[email protected]>
paarthneekhara pushed a commit to paarthneekhara/NeMo that referenced this pull request Sep 17, 2021
* upper bound for webdataset

Signed-off-by: Oleksii Kuchaiev <[email protected]>

Signed-off-by: Micha Livne <[email protected]>

* Correct Dockerfile

Signed-off-by: smajumdar <[email protected]>

Signed-off-by: Micha Livne <[email protected]>

* update readmes

Signed-off-by: Oleksii Kuchaiev <[email protected]>

Signed-off-by: Micha Livne <[email protected]>

* update README (NVIDIA#2332)

Signed-off-by: Oleksii Kuchaiev <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* ddp translate GPU allocation fix (NVIDIA#2312)

* fixed branch in IR tutorial

Signed-off-by: AlexGrinch <[email protected]>

* ddp translate GPU allocation fix

Signed-off-by: AlexGrinch <[email protected]>

* map_location instead of set_device

Signed-off-by: AlexGrinch <[email protected]>

Co-authored-by: Oleksii Kuchaiev <[email protected]>
Co-authored-by: Sandeep Subramanian <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Shallow fusion (NVIDIA#2315)

* fixed branch in IR tutorial

Signed-off-by: AlexGrinch <[email protected]>

* shallow fusion init commit

Signed-off-by: AlexGrinch <[email protected]>

* debug info removed

Signed-off-by: AlexGrinch <[email protected]>

Co-authored-by: Oleksii Kuchaiev <[email protected]>
Co-authored-by: Sandeep Subramanian <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* [BUGFIX] Add upper bound to hydra for 1.0.x (NVIDIA#2337)

* upper bound hydra

Signed-off-by: ericharper <[email protected]>

* upper bound hydra

Signed-off-by: ericharper <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* update version number

Signed-off-by: Oleksii Kuchaiev <[email protected]>

Signed-off-by: Micha Livne <[email protected]>

* update package version

Signed-off-by: Oleksii Kuchaiev <[email protected]>

Signed-off-by: Micha Livne <[email protected]>

* sparrowhawk tests + punctuation post processing for pynini TN (NVIDIA#2320)

* add jenkins test, refactoring

Signed-off-by: ekmb <[email protected]>

* update test

Signed-off-by: ekmb <[email protected]>

* fix new test

Signed-off-by: ekmb <[email protected]>

* add serial to the default normalizer, add tests

Signed-off-by: ekmb <[email protected]>

* manifest test added

Signed-off-by: ekmb <[email protected]>

* expose more params, new test cases

Signed-off-by: ekmb <[email protected]>

* fix jenkins, serial clean, exclude range from cardinal

Signed-off-by: ekmb <[email protected]>

* jenkins

Signed-off-by: ekmb <[email protected]>

* jenkins dollar sign format

Signed-off-by: ekmb <[email protected]>

* jenkins

Signed-off-by: ekmb <[email protected]>

* jenkins dollar sign format

Signed-off-by: ekmb <[email protected]>

* addressed review comments

Signed-off-by: ekmb <[email protected]>

* fix decimal in measure

Signed-off-by: ekmb <[email protected]>

* move serial in cardinal

Signed-off-by: ekmb <[email protected]>

* sh tests init

Signed-off-by: ekmb <[email protected]>

* sparrowhawk container tests support added

Signed-off-by: ekmb <[email protected]>

* add post process to normalize.py, update tests

Signed-off-by: ekmb <[email protected]>

* remove duplication

Signed-off-by: ekmb <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Update notebooks to 1.0.2 release (NVIDIA#2338)

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Update ranges for omegaconf and hydra (NVIDIA#2336)

* Update ranges

Signed-off-by: smajumdar <[email protected]>

* Updates for Hydra and OmegaConf updates

Signed-off-by: smajumdar <[email protected]>

* Style fixes

Signed-off-by: smajumdar <[email protected]>

* Correct tests and revert patch for model utils

Signed-off-by: smajumdar <[email protected]>

* Correct docstring

Signed-off-by: smajumdar <[email protected]>

* Revert unnecessary change

Signed-off-by: smajumdar <[email protected]>

* Revert unnecessary change

Signed-off-by: smajumdar <[email protected]>

* Guard scheduler for None

Signed-off-by: smajumdar <[email protected]>

* default to 0.0 if bpe_dropout is None

Signed-off-by: ericharper <[email protected]>

* Correctly log class that was restored

Signed-off-by: smajumdar <[email protected]>

* Root patch *bpe_dropout

Signed-off-by: smajumdar <[email protected]>

Co-authored-by: ericharper <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Update FastPitch Export (NVIDIA#2355)

Signed-off-by: Jason <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* update out_dir to not collide (NVIDIA#2358)

Signed-off-by: ericharper <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Update container version to 21.05 (NVIDIA#2309)

* Update container version

Signed-off-by: smajumdar <[email protected]>

* Temporarily change export format of waveglow

Signed-off-by: smajumdar <[email protected]>

* Add conda update for numba

Signed-off-by: smajumdar <[email protected]>

* Update numba compat via global flag for strictness level `--relax_numba_compat`, remove pytorchlightning.metrics, refactor out numba utils to core, update tests

Signed-off-by: smajumdar <[email protected]>

* Correct order of numba minimum verion, remove wrong flag from test

Signed-off-by: smajumdar <[email protected]>

* Double test of cuda numba

Signed-off-by: smajumdar <[email protected]>

* Double test of cuda numba

Signed-off-by: smajumdar <[email protected]>

* Enable RNNT tests

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Text Normalization Update (NVIDIA#2356)

* upper cased date support

Signed-off-by: ekmb <[email protected]>

* update whitelist, change roman weights

Signed-off-by: ekmb <[email protected]>

* docstrings, space fix, init file

Signed-off-by: ekmb <[email protected]>

* lgtm

Signed-off-by: ekmb <[email protected]>

* fraction with measure class

Signed-off-by: ekmb <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Add ASR CTC tutorial on fine-tuning on another language (NVIDIA#2346)

* Add ASR CTC Language finetuning notebook

Signed-off-by: smajumdar <[email protected]>

* Add to documentation

Signed-off-by: smajumdar <[email protected]>

* Improve documentation

Signed-off-by: smajumdar <[email protected]>

* Correct name of the dataset

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Correct colab link to notebook (NVIDIA#2366)

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* sgdqa update data directories for testing (NVIDIA#2323)

* sgdqa update data directories for testing

Signed-off-by: Yang Zhang <[email protected]>

* fix syntax

Signed-off-by: Yang Zhang <[email protected]>

* check if data dir exists

Signed-off-by: Yang Zhang <[email protected]>

* fix

Signed-off-by: Yang Zhang <[email protected]>

* adding pretrained model

Signed-off-by: Yang Zhang <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Added documentation for export() (NVIDIA#2330)

* Added export document

Signed-off-by: Boris Fomitchev <[email protected]>

* Addressed review comments

Signed-off-by: Boris Fomitchev <[email protected]>

Co-authored-by: Eric Harper <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Update Citrinet model card info (NVIDIA#2369)

* Update model card info

Signed-off-by: smajumdar <[email protected]>

* Cleanup Docs

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* [NMT] Model Parallel Megatron Encoders (NVIDIA#2238)

* add megatron encoder

Signed-off-by: ericharper <[email protected]>

* added megatron to get_nmt_tokenizer

Signed-off-by: ericharper <[email protected]>

* add vocab_size and hidden_size to megatron bert

Signed-off-by: ericharper <[email protected]>

* add megatron encoder module

Signed-off-by: ericharper <[email protected]>

* fixed horrible typo

Signed-off-by: ericharper <[email protected]>

* fix typo and add default

Signed-off-by: ericharper <[email protected]>

* updating nlp overrides for mp nmt

Signed-off-by: ericharper <[email protected]>

* move some logic back to nlpmodel from overrides

Signed-off-by: ericharper <[email protected]>

* add checkpoint_file property

Signed-off-by: ericharper <[email protected]>

* fix property

Signed-off-by: ericharper <[email protected]>

* num_tokentypes=0

Signed-off-by: ericharper <[email protected]>

* typo

Signed-off-by: ericharper <[email protected]>

* typo

Signed-off-by: ericharper <[email protected]>

* find_unused_parameters=True

Signed-off-by: ericharper <[email protected]>

* typo

Signed-off-by: ericharper <[email protected]>

* style

Signed-off-by: ericharper <[email protected]>

* get instead of pop

Signed-off-by: ericharper <[email protected]>

* remove token type ids from megatron input example

Signed-off-by: ericharper <[email protected]>

* pop vocab_size

Signed-off-by: ericharper <[email protected]>

* fix checkpointing for model parallel

Signed-off-by: ericharper <[email protected]>

* fix bug in non model parallel

Signed-off-by: ericharper <[email protected]>

* convert cfg.trainer to dict

Signed-off-by: ericharper <[email protected]>

* make num_tokentypes configurable for nmt

Signed-off-by: ericharper <[email protected]>

* update checkpoint_file when using named megatron model in nemo

Signed-off-by: ericharper <[email protected]>

* make vocab_file configurable

Signed-off-by: ericharper <[email protected]>

* dataclass can't have mutable default

Signed-off-by: ericharper <[email protected]>

* style

Signed-off-by: ericharper <[email protected]>

* unused imports

Signed-off-by: ericharper <[email protected]>

* revert input example

Signed-off-by: ericharper <[email protected]>

* check that checkpoint version is not None

Signed-off-by: ericharper <[email protected]>

* add mp jenkins test

Signed-off-by: ericharper <[email protected]>

* update docstring

Signed-off-by: ericharper <[email protected]>

* add docs for pretrained encoders with nemo nmt

Signed-off-by: ericharper <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Add notebook with recommendations for 8 kHz speech (NVIDIA#2326)

* Added a notebook with best practices for telephony speech

* Added datasets detaiils

* Added training recommendations

* Emptied out cells with results

* Added tutorial to docs

Signed-off-by: jbalam <[email protected]>

* Addressed review comments

Signed-off-by: jbalam <[email protected]>

* Added a line to note original sampling rate of an4

Signed-off-by: jbalam <[email protected]>

* Made changes suggested in review

Signed-off-by: jbalam <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* 1. Working on bottleneck transformers.

Signed-off-by: Micha Livne <[email protected]>

* 1. Working on bottleneck transformers.

* 1. Done cleaning code of bottleneck transformers.
2. Ready to test.

Signed-off-by: Micha Livne <[email protected]>

* 1. Done cleaning code of bottleneck transformers.
2. Ready to test.

* 1. Working on training script.

Signed-off-by: Micha Livne <[email protected]>

* 1. Working on training script.

* 1. Updated config class name.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated config class name.

* 1. Training script is ready to be tested.

Signed-off-by: Micha Livne <[email protected]>

* 1. Training script is ready to be tested.

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

* Add FastEmit support for RNNT Losses (NVIDIA#2374)

* Temp commit

Signed-off-by: smajumdar <[email protected]>

* Initial code for fastemit forward pass

Signed-off-by: smajumdar <[email protected]>

* Correct return reg value

Signed-off-by: smajumdar <[email protected]>

* Initial cpu impl

Signed-off-by: smajumdar <[email protected]>

* Try gpu impl

Signed-off-by: smajumdar <[email protected]>

* Try gpu impl

Signed-off-by: smajumdar <[email protected]>

* Correct few impl

Signed-off-by: smajumdar <[email protected]>

* Update fastemit scaling

Signed-off-by: smajumdar <[email protected]>

* Cleanup fastemit

Signed-off-by: smajumdar <[email protected]>

* Finalize FastEmit regularization PR

Signed-off-by: smajumdar <[email protected]>

* Refactor code to support fastemit regularization

Signed-off-by: smajumdar <[email protected]>

Co-authored-by: Samuel Kriman <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

* 1. Fixed bugs.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed bugs.

* 1. Fixed missing import.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed missing import.

* 1. Fixed support in seq2seq-br.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed support in seq2seq-br.

* 1. Added NLPDDPPlugin.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added NLPDDPPlugin.

* fix bugs in hifigan code (NVIDIA#2392)

Signed-off-by: Oktai Tatanov <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* Update setup.py (NVIDIA#2394)

Signed-off-by: Jason <[email protected]>
Signed-off-by: Micha Livne <[email protected]>

* 1. Updated to support multi-node training.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added comments.

Signed-off-by: Micha Livne <[email protected]>

* 1. MTBottleneckModel is in its own file mt_enc_dec_bottleneck_model.

Signed-off-by: Micha Livne <[email protected]>

* 1. Switched loss annealing to rely on self.trainer.global_step

Signed-off-by: Micha Livne <[email protected]>
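
For context, tying the annealing schedule to the Lightning trainer's `global_step` avoids keeping a separate step counter in the model. A minimal sketch of such a linear warm-up, assuming it is controlled by the `non_recon_warmup_batches` config value; `non_recon_coef`, `recon_loss`, and `non_recon_loss` are illustrative names rather than the model's actual identifiers:

```python
def non_recon_coef(global_step: int, non_recon_warmup_batches: int) -> float:
    """Hypothetical warm-up coefficient for the non-reconstruction loss terms.

    Ramps linearly from 0 to 1 over `non_recon_warmup_batches` steps, driven by
    the Lightning trainer's global step counter.
    """
    if non_recon_warmup_batches <= 0:
        return 1.0
    return min(1.0, global_step / float(non_recon_warmup_batches))


# Inside a hypothetical LightningModule.training_step:
#   coef = non_recon_coef(self.trainer.global_step, self.cfg.non_recon_warmup_batches)
#   loss = recon_loss + coef * non_recon_loss
```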

* 1. Added comments regarding the use of return_ortho_loss.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added detailed logging of loss during training (still need to do the same for eval).

Signed-off-by: Micha Livne <[email protected]>

* 1. Testing a fix to import bug.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging wrong import issue.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added logging of results to validation step (not tested yet).

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed missing import.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Testing failing imports.

Signed-off-by: Micha Livne <[email protected]>

* 1. Disabling changes.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Enabled bottleneck architecture.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed indentation.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed import statement.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed typo.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed logging of arbitrary values.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed PyTorch Lightning logging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added a missing import.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added NLPDDPPlugin.

Signed-off-by: Micha Livne <[email protected]>

* 1. Cleaned style.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated sign of computed loss.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed double import.

Signed-off-by: Micha Livne <[email protected]>

* 1. Moved logging of additional loss terms into MTBottleneckModel class.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated permissions.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added initial perceiver package.

Signed-off-by: Micha Livne <[email protected]>

* 1. Working on encoder.

Signed-off-by: Micha Livne <[email protected]>

* 1. Testing perceiver.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Finished implementing Perceiver.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated default arch.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Ignoring independent perceiver implementation.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added latent transformer to perceiver

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added TransformerBottleneckDecoderNM.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added TransformerBottleneckEncoderNM.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated bottleneck perceiver.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated MTBottleneckModel.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added BridgeEncoder.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Cleaned code.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated architecture name.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added support in bridge encoder.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added support in hidden_init_method to BridgeEncoder.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Removed unneeded imports.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated comment in YAML

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed style.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated YAML comments.
2. hidden_blocks in bridge relates to post-processing after the bridge (instead of hidden_blocks - 1).

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Initial cross-attention in Perceiver with params init has independent parameters.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated Perceiver forward.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated TransformerEncoder to be a component as opposed to a parent class.

Signed-off-by: Micha Livne <[email protected]>
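
Holding the transformer encoder as a sub-module rather than inheriting from it keeps the bottleneck encoders decoupled from its internals and lets the same component be reused for the pre- and post-bridge passes. A minimal sketch of that composition pattern; `LatentBottleneckEncoder` and `init_hidden` are hypothetical names, not the actual NeMo class layout:

```python
import torch
from torch import nn


class LatentBottleneckEncoder(nn.Module):
    """Illustrative composition-over-inheritance layout (hypothetical names)."""

    def __init__(self, base_encoder: nn.Module, hidden_size: int, hidden_steps: int):
        super().__init__()
        # the full transformer encoder is held as a component, not used as a parent class
        self.base_encoder = base_encoder
        # learned fixed-length latent sequence, one row per hidden step
        self.init_hidden = nn.Parameter(torch.randn(hidden_steps, hidden_size))

    def forward(self, encoder_states: torch.Tensor) -> torch.Tensor:
        batch = encoder_states.size(0)
        hidden = self.init_hidden.unsqueeze(0).expand(batch, -1, -1)
        # a real implementation would cross-attend from `hidden` to `encoder_states`;
        # this sketch simply delegates to the wrapped component
        return self.base_encoder(hidden)
```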

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated example command.

Signed-off-by: Micha Livne <[email protected]>

* 1. forward method in MTBottleneckModel does not compute loss.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed style.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Added label smoothing for per-sample loss.

Signed-off-by: Micha Livne <[email protected]>
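
Label smoothing has to behave the same whether the reconstruction is normalized per token or per sample. A minimal sketch of a label-smoothed NLL that supports both modes; this is an assumption about the intent rather than the model's actual loss code, and the per-sample branch assumes token losses are summed within each sentence before averaging over the batch:

```python
import torch


def smoothed_nll(log_probs: torch.Tensor, labels: torch.Tensor, pad_id: int,
                 smoothing: float = 0.1, per_token: bool = True) -> torch.Tensor:
    """Label-smoothed negative log-likelihood (illustrative only).

    log_probs: [batch, time, vocab] log-softmax outputs
    labels:    [batch, time] target token ids
    """
    nll = -log_probs.gather(-1, labels.unsqueeze(-1)).squeeze(-1)  # [B, T]
    uniform = -log_probs.mean(dim=-1)                              # [B, T] smoothing target
    loss = (1.0 - smoothing) * nll + smoothing * uniform
    mask = (labels != pad_id).float()
    loss = loss * mask
    if per_token:
        return loss.sum() / mask.sum().clamp(min=1.0)
    # per sample: sum over tokens in each sentence, then average over the batch
    return loss.sum(dim=1).mean()
```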

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed style.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated recon_only loss to nll.

Signed-off-by: Micha Livne <[email protected]>

* 1. Update yaml doc.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated default config to have 32 hidden steps.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated doc.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed typo.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed unreachable code bug.

Signed-off-by: Micha Livne <[email protected]>

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed wrong sign for reconstruction per sample (instead of per token).

Signed-off-by: Micha Livne <[email protected]>
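
The sign convention is the subtle part of this fix: the reconstruction term is a log-likelihood to be maximized, so the loss must negate it identically under both normalizations. A minimal sketch under that assumption; the function and argument names are illustrative, not the actual code:

```python
import torch


def reconstruction_loss(token_log_probs: torch.Tensor, token_mask: torch.Tensor,
                        per_token: bool = True) -> torch.Tensor:
    """Sign-consistent reconstruction loss (illustrative only).

    token_log_probs: [batch, time] log p(target token | context)
    token_mask:      [batch, time] 1.0 for real tokens, 0.0 for padding
    """
    masked = token_log_probs * token_mask
    if per_token:
        recon = masked.sum() / token_mask.sum().clamp(min=1.0)  # mean log-likelihood per token
    else:
        recon = masked.sum(dim=1).mean()                        # mean log-likelihood per sample
    return -recon  # negate once, the same way for both normalizations
```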

* 1. Debugging.

Signed-off-by: Micha Livne <[email protected]>

* 1. Fixed style.

Signed-off-by: Micha Livne <[email protected]>

* 1. Updated comments.

Signed-off-by: Micha Livne <[email protected]>

Co-authored-by: Oleksii Kuchaiev <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: Oleksii Kuchaiev <[email protected]>
Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]>
Co-authored-by: Sandeep Subramanian <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Jason <[email protected]>
Co-authored-by: Yang Zhang <[email protected]>
Co-authored-by: Boris Fomitchev <[email protected]>
Co-authored-by: Jagadeesh Balam <[email protected]>
Co-authored-by: Micha Livne <[email protected]>
Co-authored-by: Samuel Kriman <[email protected]>
Co-authored-by: Oktai Tatanov <[email protected]>
Signed-off-by: Paarth Neekhara <[email protected]>
jfsantos pushed a commit to jfsantos/NeMo that referenced this pull request Nov 19, 2021