Fix Python type hints according to Python Docs #5370

artbataev · 2022-11-09T12:29:50Z

What does this PR do?

Fix Python type annotations according to the documentation (correct syntax, missing imports, fix incompatible return types).

Collection: [Note which collection this PR will affect]

Changelog

remove duplicated type annotations
fix tuple annotation in function return types according to python docs
add missing imports (only when module is used, but not imported)
fix return types in places where the actual return type is obvious but incompatible with the annotation (e.g. use Optional[str] instead of str when the function can return None)
avoid quotes in type annotations where possible: quotes are only needed for forward declarations; elsewhere they can lead to inconsistent or incorrect types.
- Also, if quotes are used and the necessary type is not imported, IDEs and linters can't infer the type correctly. If the type is imported, some tools (including the LGTM bot) report the import as unused.
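
A minimal sketch of the tuple and Optional items from the changelog above. The function names are hypothetical and are not taken from the NeMo code base; they only illustrate the kind of annotation being corrected.

```python
from typing import Optional, Tuple


# Before (hypothetical): "(int, str)" is a tuple object, not a type.
# Python evaluates it without complaint, but type checkers reject it.
def split_before(text: str) -> (int, str):
    return len(text), text.upper()


# After: typing.Tuple names the element types explicitly.
def split_after(text: str) -> Tuple[int, str]:
    return len(text), text.upper()


# A function that can return None needs Optional[str], not str.
def find_extension(filename: str) -> Optional[str]:
    if "." not in filename:
        return None
    return filename.rsplit(".", 1)[1]
```

Both versions of the split function behave identically at runtime; only the information available to IDEs and type checkers differs.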

Usage

No executed code is affected (except for the added missing imports).
The changes only affect the type hints shown by PyCharm and the output of type checkers (e.g. a syntax error reported by mypy).
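
To illustrate why annotation-only changes do not alter executed code, here is a small sketch (the function `describe` is hypothetical): Python stores annotations as metadata and does not enforce them at runtime.

```python
def describe(flag: bool) -> str:
    # The annotation promises str, but the interpreter does not check it.
    if flag:
        return "enabled"
    return None  # a type checker reports this line; the interpreter runs it


# The annotation is only metadata attached to the function object.
declared_return_type = describe.__annotations__["return"]
actual_result = describe(False)
```

Here `declared_return_type` is `str` while `actual_result` is `None`, which is exactly the kind of mismatch a type checker is meant to catch.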

Example

Before (incorrect return type – unclear automatic signature in PyCharm):

After (corrected type – PyCharm correctly adds hints):

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you add or update any necessary documentation?
Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
- Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

New Feature
Bugfix
Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

Related to # (issue)

Signed-off-by: Vladimir Bataev <[email protected]>

lgtm-com · 2022-11-09T12:58:14Z

This pull request introduces 1 alert when merging c15fb14 into 265056e - view on LGTM.com

new alerts:

1 for Unused import

This reverts commit ea433ef. Signed-off-by: Vladimir Bataev <[email protected]>

lgtm-com · 2022-11-09T14:44:00Z

This pull request fixes 1 alert when merging da3dc14 into 265056e - view on LGTM.com

fixed alerts:

1 for Unused import

titu1994

Revert every change to the str-based types. They are there specifically to avoid an import that may cause a circular import.

titu1994 · 2022-11-09T17:47:44Z

nemo/collections/asr/models/asr_model.py

@@ -53,7 +54,7 @@ def multi_test_epoch_end(self, outputs, dataloader_idx: int = 0):
        return {'test_loss': val_loss_mean, 'log': tensorboard_logs}

    @classmethod
-    def list_available_models(cls) -> 'List[PretrainedModelInfo]':
+    def list_available_models(cls) -> List[PretrainedModelInfo]:
        """


Revert. We only use a quoted (str) annotation here to avoid a circular dependency.

Reverted quotes
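
For context, a common way to keep a quoted annotation visible to type checkers without importing the type at runtime is the `typing.TYPE_CHECKING` guard. The sketch below is a general Python pattern, not a description of how NeMo is organized; the module name `some_package.common` is hypothetical and does not need to exist for the code to run.

```python
from typing import TYPE_CHECKING, List

if TYPE_CHECKING:
    # Seen only by static type checkers; never executed at runtime,
    # so it cannot trigger a circular import. Hypothetical module path.
    from some_package.common import PretrainedModelInfo


class Model:
    @classmethod
    def list_available_models(cls) -> 'List[PretrainedModelInfo]':
        # The quoted annotation stays a plain string at runtime.
        return []


annotation = Model.list_available_models.__func__.__annotations__["return"]
```

At runtime `annotation` is just the string `'List[PretrainedModelInfo]'`, so the name never has to be importable when the module is loaded.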

titu1994 · 2022-11-09T17:48:06Z

nemo/collections/asr/models/ctc_bpe_models.py

@@ -371,7 +371,7 @@ def change_decoding_strategy(self, decoding_cfg: DictConfig):
        logging.info(f"Changed decoding strategy to \n{OmegaConf.to_yaml(self.cfg.decoding)}")

    @classmethod
-    def list_available_models(cls) -> Optional[PretrainedModelInfo]:
+    def list_available_models(cls) -> List[PretrainedModelInfo]:


Revert all of these

These changes were made because the type of the list_available_models method was inconsistent:

in the base class (nemo.core.classes.Model) it was Optional[PretrainedModelInfo]

in some classes nothing is returned (fine for this type)

in some classes (e.g. here) a list is returned, so the actual type is List[PretrainedModelInfo]

some classes already used List[PretrainedModelInfo] as the annotation

So I used List[PretrainedModelInfo] where a list is returned, and Optional[List[PretrainedModelInfo]] for the base class (to avoid any incompatibility).

I will revert the quotes everywhere they were used.

Should I really revert all type annotations for list_available_models? Can we keep the type correct and consistent everywhere?

If I understood our Slack conversation correctly, this is OK (there were no quotes here, and a List is actually returned).
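
The arrangement described in this thread can be sketched as follows. This is a simplified stand-in, not the NeMo classes: `PretrainedModelInfo` is reduced to a one-field dataclass, the base method is not abstract, and `ASRModel` and `model_names` are hypothetical names. It shows that a subclass may narrow `Optional[List[...]]` to `List[...]` while callers of the base type still handle `None`.

```python
from dataclasses import dataclass
from typing import List, Optional, Type


@dataclass
class PretrainedModelInfo:  # simplified stand-in for the real class
    name: str


class Model:
    @classmethod
    def list_available_models(cls) -> Optional[List[PretrainedModelInfo]]:
        # Base behaviour: nothing to list.
        return None


class ASRModel(Model):
    @classmethod
    def list_available_models(cls) -> List[PretrainedModelInfo]:
        # Narrower return type: a list is always returned.
        return [PretrainedModelInfo(name="example_model")]


def model_names(model_cls: Type[Model]) -> List[str]:
    # Code written against the base type must handle the None case.
    models = model_cls.list_available_models()
    return [m.name for m in models] if models is not None else []
```

Narrowing a return type in an override is accepted by type checkers because every value the subclass returns is still valid for the base annotation.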

titu1994 · 2022-11-09T17:48:20Z

nemo/collections/asr/models/rnnt_models.py

@@ -996,7 +996,7 @@ def decoder_joint(self):
        return RNNTDecoderJoint(self.decoder, self.joint)

    @classmethod
-    def list_available_models(cls) -> Optional[PretrainedModelInfo]:
+    def list_available_models(cls) -> List[PretrainedModelInfo]:


Same here and above

If I understood our Slack conversation correctly, this is OK (there were no quotes here, and a List is actually returned).

titu1994 · 2022-11-09T17:49:15Z

nemo/collections/nlp/models/zero_shot_intent_recognition/zero_shot_intent_model.py

@@ -262,7 +262,7 @@ def predict(
        return result

    @classmethod
-    def list_available_models(cls) -> Optional[PretrainedModelInfo]:
+    def list_available_models(cls) -> List[PretrainedModelInfo]:
        """


Same as above

titu1994 · 2022-11-09T17:49:29Z

nemo/collections/nlp/modules/common/megatron/fused_bias_dropout_add.py

@@ -44,7 +44,6 @@ def bias_dropout_add(x, bias, residual, prob, training):
 def bias_dropout_add_fused_train_(
    x: torch.Tensor, bias: torch.Tensor, residual: torch.Tensor, prob: float
 ) -> torch.Tensor:
-    # type: (Tensor, Tensor, Tensor, float) -> Tensor


Revert. It is needed for tracing
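
As background on why such a comment is not necessarily redundant: a `# type:` comment is separate data that tools reading the source can pick up independently of the inline annotations. The sketch below uses only the standard `ast` module and does not reproduce TorchScript's behaviour; the function `add` and the helper `signature_type_comment` are hypothetical.

```python
import ast

WITH_COMMENT = '''
def add(x: float, y: float) -> float:
    # type: (float, float) -> float
    return x + y
'''

WITHOUT_COMMENT = '''
def add(x: float, y: float) -> float:
    return x + y
'''


def signature_type_comment(source: str):
    # type_comments=True asks the parser to keep "# type:" comments.
    tree = ast.parse(source, type_comments=True)
    func = tree.body[0]
    return func.type_comment


kept = signature_type_comment(WITH_COMMENT)
dropped = signature_type_comment(WITHOUT_COMMENT)
```

A tool that relies on the comment sees a signature string in the first case and `None` in the second, so deleting the comment can change that tool's input even though the annotations are unchanged.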

titu1994 · 2022-11-09T17:49:54Z

nemo/collections/tts/models/fastpitch.py

@@ -558,7 +558,7 @@ def setup_test_data(self, cfg):
        pass

    @classmethod
-    def list_available_models(cls) -> 'List[PretrainedModelInfo]':
+    def list_available_models(cls) -> List[PretrainedModelInfo]:


Reverted quotes

nemo/collections/tts/models/two_stages.py

titu1994 · 2022-11-09T17:50:17Z

nemo/core/classes/common.py

@@ -657,7 +658,7 @@ class Model(Typing, Serialization, FileIO):

    @classmethod
    @abstractmethod
-    def list_available_models(cls) -> Optional[PretrainedModelInfo]:
+    def list_available_models(cls) -> Optional[List[PretrainedModelInfo]]:


If I understood our Slack conversation correctly, this is OK.

There were no quotes here.

Optional[List[PretrainedModelInfo]] is used so that List[PretrainedModelInfo] indicates the desired type, while Optional maintains compatibility with implementations that return nothing.

titu1994 · 2022-11-09T17:50:41Z

nemo/core/classes/modelPT.py

@@ -239,7 +239,7 @@ def save_to(self, save_path: str):
            save_path: Path to .nemo file where model instance should be saved
        """

-        def maybe_make_save_dir(path: 'pathlib.Path'):
+        def maybe_make_save_dir(path: Path):


Reverted this
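
The concern behind the original change (a quoted annotation whose module is not imported cannot be resolved by tools) can be sketched with `typing.get_type_hints`. The helper `can_resolve` is hypothetical, and the namespace is passed explicitly so the result does not depend on what happens to be imported in the surrounding module.

```python
import pathlib
from typing import Any, Callable, Dict, get_type_hints


def maybe_make_save_dir(path: 'pathlib.Path') -> None:
    # The quoted annotation is stored as a string and resolved lazily.
    pass


def can_resolve(func: Callable[..., Any], namespace: Dict[str, Any]) -> bool:
    # Try to evaluate the string annotations against the given namespace.
    try:
        get_type_hints(func, globalns=namespace)
    except NameError:
        return False
    return True


without_import = can_resolve(maybe_make_save_dir, {})
with_import = can_resolve(maybe_make_save_dir, {"pathlib": pathlib})
```

Resolution fails when `pathlib` is missing from the namespace and succeeds once it is present, which mirrors the trade-off discussed in the PR between unresolved quoted types and imports that linters flag as unused.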

titu1994 · 2022-11-09T17:51:29Z

nemo/utils/export_utils.py

@@ -213,7 +213,7 @@ def run_ort_and_compare(sess, ort_input, output_example, check_tolerance=0.01):
    from apex.transformer.tensor_parallel.layers import RowParallelLinear, ColumnParallelLinear
    from apex.transformer.functional.fused_softmax import FusedScaleMaskSoftmax

-    def replace_FusedLayerNorm(n: nn.Module) -> Optional[nn.BatchNorm2d]:
+    def replace_FusedLayerNorm(n: nn.Module) -> Optional[nn.LayerNorm]:


Revert for now. It's a mixed use case, and the type is incorrect anyway.

lgtm-com · 2022-11-09T19:06:14Z

This pull request fixes 1 alert when merging b0c8d7b into c5c46ba - view on LGTM.com

fixed alerts:

1 for Unused import

lgtm-com · 2022-11-09T19:29:17Z

This pull request fixes 1 alert when merging 2c12623 into c5c46ba - view on LGTM.com

fixed alerts:

1 for Unused import

titu1994

Looks good. I still dunno what this will help with. MyPy can't possibly work on the entire Nemo toolkit but I guess it's a start

SeanNaren

Thanks for this @artbataev, happy with the base function having Optional[PretrainedModelInfo] as the base Type and the inherited classes changing this to match their return.

MyPy can't possibly work on the entire Nemo toolkit but I guess it's a start

Would be nice to run MyPy over the sub-portions of the code to start, would help to keep things clean :)

artbataev · 2022-11-10T11:54:44Z

@SeanNaren, @titu1994 MyPy can work with the project; you just have to ignore some files and make some small changes. I'm currently using MyPy.
Later I may be able to provide a minimal working prototype.

lgtm-com · 2022-11-10T12:04:20Z

This pull request fixes 1 alert when merging b712ab8 into 9838b3b - view on LGTM.com

fixed alerts:

1 for Unused import

* Remove duplicated type annotations
* Fix tuple annotations in function return types
* Add necessary imports
* Add necessary imports
* Fix types in obvious places
* Fix types in obvious places
* Fix unused import (avoid quotes in type annotations)
* Revert "Fix unused import (avoid quotes in type annotations)" (this reverts commit ea433ef)
* Remove problematic import
* Fix list_available_models method type
* Revert some changes
* Revert quotes in list_available_models

Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: 1-800-bad-code <[email protected]>

Signed-off-by: smajumdar <[email protected]> * add precommit hood to automatic sort entries in requirements. (#5333) Signed-off-by: Xuesong Yang <[email protected]> * [TTS] update organization of model checkpoints and their pointers. (#5327) * [TTS] update orgnization of model checkpoints and their pointers. Signed-off-by: Xuesong Yang <[email protected]> * move model name column to the 2nd col and correct model names as predefined_model_name. Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> * Add speaker clustering arguments to forward function (#5306) * Move arguments to forward function Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolved type issue Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Taejin Park <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * [STT] Add Ru ASR Conformer-CTC and Conformer-Transducer (#5340) (#5341) * [STT] Add stt_ru_conformer_ctc_large Signed-off-by: Sasha Meister <[email protected]> * [STT] Add stt_ru_conformer_transducer_large Add stt_ru_conformer_transducer_large Signed-off-by: Sasha Meister <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Sasha Meister <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Sasha Meister <[email protected]> Co-authored-by: Sasha Meister <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * [TTS] bugfix for the script of generating mels. 
(#5344) Signed-off-by: Xuesong Yang <[email protected]> * Fixing de-autocast (#5319) * Fixing de-autocast Signed-off-by: Boris Fomitchev <[email protected]> * Cleanup Signed-off-by: Boris Fomitchev <[email protected]> * Refining export with max_dim/batch Signed-off-by: Boris Fomitchev <[email protected]> * Moving cast utils to its own module Signed-off-by: Boris Fomitchev <[email protected]> Signed-off-by: Boris Fomitchev <[email protected]> * Pcla tutorial fixes (#5313) (#5347) * fixes Signed-off-by: Matvei Novikov <[email protected]> * fixes Signed-off-by: Matvei Novikov <[email protected]> * moved `create_text_and_labels` to token_classification_utils.py Signed-off-by: Matvei Novikov <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> Co-authored-by: Matvei Novikov <[email protected]> * bug (#5348) Co-authored-by: Dima Rekesh <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * [Bugfix] Added rm -f / wget- nc command to avoid bash error in multispeaker sim notebook (#5292) * Added rm -f command to avoid error message Signed-off-by: Taejin Park <[email protected]> * removed unnecessary changes Signed-off-by: Taejin Park <[email protected]> Signed-off-by: Taejin Park <[email protected]> * [DOC] added ipython dependency to support IPython.sphinxext extension (#5345) * [DOC] added ipython dependency to support IPython.sphinxext extension Signed-off-by: Xuesong Yang <[email protected]> * revert ipython extension in the doc and replace ipython block with shell-session. 
Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> * Fix dialogue tutorial bug (#5297) (#5303) * set add_pooling_layer=False for huggingface bert model * remove add_pooling_layer=False and set find_unused_parameters=True * set num_prompt_tokens to 0 for huggingface Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Eric Harper <[email protected]> * Fix issue with HF Model upload tutorial (#5359) (#5360) * Add Gradio App to ASR Docs (#5270) Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> (cherry picked from commit e4b6a387e3b3d9cdf511f7b9bbb5e94925e48cc2) * Fix issue with normalized config for dataset name Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Bug fix (removing old compute consumed samples) (#5355) Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> * removed uninstall nemo_cv and nemo_simple_gan and relax numba version… (#5332) * Update reinstall.sh and requirements. * removed nemo_cv and nemo_simple_gan in reinstall.sh. * relaxed numba version limits. * added tensorboard requirement to avoid any incpmpatible issue. 
Signed-off-by: Xuesong Yang <[email protected]> * revert changes for numba Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Xuesong Yang <[email protected]> * Pipeline paralleism in Bert (#5293) * Global batch size support for validation * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Global batch size support for bert * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bert batch support * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bert batch size support * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * O2 support for bert * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update megatron_bert_pretraining.py Signed-off-by: Shanmugam Ramasamy <[email protected]> * Update megatron_bert_model.py Signed-off-by: Shanmugam Ramasamy <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update megatron_bert_config.yaml Signed-off-by: Shanmugam Ramasamy <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update megatron_bert_model.py Signed-off-by: Shanmugam Ramasamy <[email protected]> * Bug fix * Bug fix * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Bug fix * Bug fix * Bug fix * Update megatron_bert_config.yaml Signed-off-by: Shanmugam Ramasamy <[email protected]> * PPBert * PPBert * PPBert * PPBert * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update megatron_bert_config.yaml Signed-off-by: Shanmugam Ramasamy <[email protected]> * bug fix * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see 
https://pre-commit.ci * bug fix * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bug fix * bug fix * bug fix * bug fix Signed-off-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Oleksii Kuchaiev <[email protected]> * tutorial fixes (#5354) (#5361) Signed-off-by: Matvei Novikov <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> Co-authored-by: Matvei Novikov <[email protected]> * Enable mlflow logger (#4893) * Enable mlflow logger Signed-off-by: whrichd <[email protected]> * fix style Signed-off-by: whrichd <[email protected]> * Add doc lines. Signed-off-by: whrichd <[email protected]> * change default value Signed-off-by: whrichd <[email protected]> * fix doc Signed-off-by: whrichd <[email protected]> * addressed comments, added dataclass Signed-off-by: whrichd <[email protected]> * fix style Signed-off-by: whrichd <[email protected]> * fix doc Signed-off-by: whrichd <[email protected]> Signed-off-by: whrichd <[email protected]> * Add SDP documentation (#5274) (#5376) * Add details to SDP README.md Signed-off-by: Elena Rastorgueva <[email protected]> * Add docstring to WriteManifest processor Signed-off-by: Elena Rastorgueva <[email protected]> * Add docstring to CreateInitialManifestMLS Signed-off-by: Elena Rastorgueva <[email protected]> * Add ModifyManifestTextProcessor docstring Signed-off-by: Elena Rastorgueva <[email protected]> * Add ASRInference docstring Signed-off-by: Elena Rastorgueva <[email protected]> * Add base_processor docstrings Signed-off-by: Elena Rastorgueva <[email protected]> * Add minimal SDP docs page Signed-off-by: Elena Rastorgueva <[email protected]> * Update tools/speech_dataset_processor/README.md Co-authored-by: Igor Gitman <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> * Write simple 
README for SDP and move complex explanations to docs Signed-off-by: Elena Rastorgueva <[email protected]> * Remove incorrect type hints Signed-off-by: Elena Rastorgueva <[email protected]> * Make config example less confusing Signed-off-by: Elena Rastorgueva <[email protected]> * Fix typo Signed-off-by: Elena Rastorgueva <[email protected]> * Clarify that YAML file is config file in README Signed-off-by: Elena Rastorgueva <[email protected]> * Remove unused imports Signed-off-by: Elena Rastorgueva <[email protected]> * Remove SDP docs for now Signed-off-by: Elena Rastorgueva <[email protected]> * Remove links to docs in SDP README Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Co-authored-by: Igor Gitman <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Igor Gitman <[email protected]> * Rename Speech Dataset Processor to Speech Data Processor (#5378) (#5381) Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> * fix for num worker 0 causing issues in losses after 1 epoch (#5379) (#5384) Co-authored-by: Adi Renduchintala <[email protected]> * [TTS] Add Spanish model documentation (#5390) Signed-off-by: Ryan <[email protected]> * [TTS] Add Spanish FastPitch training configs (#5383) * [TTS] Add Spanish FastPitch training configs * [TTS] Add single speaker Spanish configs Signed-off-by: Ryan <[email protected]> * Fix Python type hints according to Python Docs (#5370) * Remove duplicated type annotations Signed-off-by: Vladimir Bataev <[email protected]> * Fix tuple annotations in function return types Signed-off-by: Vladimir Bataev <[email protected]> * Add 
necessary imports Signed-off-by: Vladimir Bataev <[email protected]> * Add necessary imports Signed-off-by: Vladimir Bataev <[email protected]> * Fix types in obvious places Signed-off-by: Vladimir Bataev <[email protected]> * Fix types in obvious places Signed-off-by: Vladimir Bataev <[email protected]> * Fix unused import (avoid quotes in type annotations) Signed-off-by: Vladimir Bataev <[email protected]> * Revert "Fix unused import (avoid quotes in type annotations)" This reverts commit ea433efcd9916abf8944879e791484a0a1437f83. Signed-off-by: Vladimir Bataev <[email protected]> * Remove problematic import Signed-off-by: Vladimir Bataev <[email protected]> * Fix list_available_models method type Signed-off-by: Vladimir Bataev <[email protected]> * Revert some changes Signed-off-by: Vladimir Bataev <[email protected]> * Revert quotes in list_available_models Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> * Force MHA QKV onto fp32 (#5391) (#5395) Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: smajumdar <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> * Add cpWER for evaluation of ASR with diarization (#5279) * Add cpWER calculation feature Signed-off-by: Taejin Park <[email protected]> * added notebook Signed-off-by: Taejin Park <[email protected]> * updated notebook and diarization_utils Signed-off-by: Taejin Park <[email protected]> * Minor update on tutorial notebook Signed-off-by: Taejin Park <[email protected]> * Style fix Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update on missing docstrings Signed-off-by: Taejin Park <[email protected]> * Fixed an unfinished docstring Signed-off-by: Taejin Park <[email protected]> * Removed unused variables Signed-off-by: Taejin Park <[email protected]> * Fixed dict input to list 
input Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Style fix Signed-off-by: Taejin Park <[email protected]> * fixed LGTM issues Signed-off-by: Taejin Park <[email protected]> * Fixed error in cpWER cal Signed-off-by: Taejin Park <[email protected]> * fixed docstrings Signed-off-by: Taejin Park <[email protected]> * fixed docstrings Signed-off-by: Taejin Park <[email protected]> * Fix some of the typing issues, lower case names Signed-off-by: SeanNaren <[email protected]> * Replaced bruteforce with LSA alg for cpWER Signed-off-by: Taejin Park <[email protected]> * Reflected PR comments Signed-off-by: Taejin Park <[email protected]> * Cleaned notebook Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated notebook Signed-off-by: Taejin Park <[email protected]> * Fixed LGTM warnings Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added test_diar_metrics.py Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed typos Signed-off-by: Taejin Park <[email protected]> * Fixed wrong type annotations Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Added bruteforce mode and its unit-test Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * LGTM issues fixed Signed-off-by: Taejin Park <[email protected]> * reolve LGTM issues Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see 
https://pre-commit.ci * unified speaker key in trans_dict Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Removed unused variable and imports Signed-off-by: Taejin Park <[email protected]> * Update nemo/collections/asr/parts/utils/diarization_utils.py Co-authored-by: Sean Naren <[email protected]> Signed-off-by: Taejin Park <[email protected]> * Update nemo/collections/asr/parts/utils/diarization_utils.py Co-authored-by: Sean Naren <[email protected]> Signed-off-by: Taejin Park <[email protected]> * moved all the diarization eval to der.py Signed-off-by: Taejin Park <[email protected]> * Update tests/collections/asr/test_diar_metrics.py Co-authored-by: Sean Naren <[email protected]> Signed-off-by: Taejin Park <[email protected]> * der.py update on tests Signed-off-by: Taejin Park <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused imports and style fix Signed-off-by: Taejin Park <[email protected]> * style fix Signed-off-by: Taejin Park <[email protected]> * unused import Signed-off-by: Taejin Park <[email protected]> * reflected review comments Signed-off-by: Taejin Park <[email protected]> * Fixed an import bug in tutorial notebook Signed-off-by: Taejin Park <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: SeanNaren <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: SeanNaren <[email protected]> Co-authored-by: Nithin Rao <[email protected]> * Added cast Signed-off-by: Boris Fomitchev <[email protected]> Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Vahid <[email protected]> Signed-off-by: Ante Jukić <[email protected]> Signed-off-by: Taejin Park <[email protected]> Signed-off-by: He Huang (Steve) <[email protected]> Signed-off-by: stevehuang52 <[email 
protected]> Signed-off-by: Xuesong Yang <[email protected]> Signed-off-by: Jocelyn Huang <[email protected]> Signed-off-by: arendu <[email protected]> Signed-off-by: nithinraok <[email protected]> Signed-off-by: smajumdar <[email protected]> Signed-off-by: Shantanu Acharya <[email protected]> Signed-off-by: SeanNaren <[email protected]> Signed-off-by: subhankar-ghosh <[email protected]> Signed-off-by: Oleksii Volkovskyi <[email protected]> Signed-off-by: Dima Rekesh <[email protected]> Signed-off-by: Yuekai Zhang <[email protected]> Signed-off-by: Matvei Novikov <[email protected]> Signed-off-by: Shanmugam Ramasamy <[email protected]> Signed-off-by: Jason <[email protected]> Signed-off-by: Boris Fomitchev <[email protected]> Signed-off-by: Igor Gitman <[email protected]> Signed-off-by: Micha Livne <[email protected]> Signed-off-by: Ryan <[email protected]> Signed-off-by: MaximumEntropy <[email protected]> Signed-off-by: fayejf <[email protected]> Signed-off-by: Sasha Meister <[email protected]> Signed-off-by: whrichd <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Elena Rastorgueva <[email protected]> Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: David <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Vahid Noroozi <[email protected]> Co-authored-by: anteju <[email protected]> Co-authored-by: Taejin Park <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Xuesong Yang <[email protected]> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Jocelyn <[email protected]> Co-authored-by: Adi Renduchintala <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: 
Shantanu Acharya <[email protected]> Co-authored-by: Sean Naren <[email protected]> Co-authored-by: Subhankar Ghosh <[email protected]> Co-authored-by: Oleksii Volkovskyi <[email protected]> Co-authored-by: Dima Rekesh <[email protected]> Co-authored-by: Dima Rekesh <[email protected]> Co-authored-by: Yuekai Zhang <[email protected]> Co-authored-by: Matvei Novikov <[email protected]> Co-authored-by: Zhilin Wang <[email protected]> Co-authored-by: Shanmugam Ramasamy <[email protected]> Co-authored-by: Jason <[email protected]> Co-authored-by: Igor Gitman <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Ryan Langman <[email protected]> Co-authored-by: fayejf <[email protected]> Co-authored-by: anmolgupt <[email protected]> Co-authored-by: Anmol Gupta <[email protected]> Co-authored-by: Sasha Meister <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Riqiang Wang <[email protected]> Co-authored-by: Elena Rastorgueva <[email protected]> Co-authored-by: Vladimir Bataev <[email protected]> Co-authored-by: David <[email protected]>

* Remove duplicated type annotations
* Fix tuple annotations in function return types
* Add necessary imports
* Add necessary imports
* Fix types in obvious places
* Fix types in obvious places
* Fix unused import (avoid quotes in type annotations)
* Revert "Fix unused import (avoid quotes in type annotations)"
  This reverts commit ea433ef.
* Remove problematic import
* Fix list_available_models method type
* Revert some changes
* Revert quotes in list_available_models

Signed-off-by: Vladimir Bataev <[email protected]>
Signed-off-by: Hainan Xu <[email protected]>
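For illustration only, here is a minimal sketch of the kinds of corrections these commit titles describe: a tuple return annotation written with `Tuple[...]`, an `Optional[...]` return type where `None` can be returned, and a quoted annotation kept only for a forward reference. All function and class names below are hypothetical and are not taken from the NeMo code base.

```python
from typing import Optional, Tuple


# Hypothetical helper (not from NeMo).
# `-> (int, int)` would be a tuple of classes rather than a type,
# so the return annotation is spelled Tuple[int, int] instead.
def split_lengths(total: int, ratio: float) -> Tuple[int, int]:
    first = int(total * ratio)
    return first, total - first


# Hypothetical helper (not from NeMo).
# The function can return None, so `-> str` would be incompatible;
# Optional[str] states the real contract.
def find_extension(filename: str) -> Optional[str]:
    if "." not in filename:
        return None
    return filename.rsplit(".", 1)[1]


# Hypothetical class (not from NeMo).
# Quotes are needed here only because Node is referenced inside its own
# definition (a forward reference); elsewhere the bare name would do.
class Node:
    def __init__(self, value: int, parent: Optional["Node"] = None) -> None:
        self.value = value
        self.parent = parent
```

A type checker such as mypy should accept these signatures, while the uncorrected forms mentioned in the comments would typically be flagged.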

artbataev added 6 commits November 9, 2022 16:35

Remove duplicated type annotations

88a5c9d

Signed-off-by: Vladimir Bataev <[email protected]>

Fix tuple annotations in function return types

21e7c68

Signed-off-by: Vladimir Bataev <[email protected]>

Add necessary imports

df0e731

Signed-off-by: Vladimir Bataev <[email protected]>

Add necessary imports

925d38d

Signed-off-by: Vladimir Bataev <[email protected]>

Fix types in obvious places

475d80f

Signed-off-by: Vladimir Bataev <[email protected]>

Fix types in obvious places

9dc399e

Signed-off-by: Vladimir Bataev <[email protected]>

artbataev force-pushed the fix_python_typing branch 2 times, most recently from 200aa92 to 9dc399e, November 9, 2022 12:39

artbataev marked this pull request as ready for review November 9, 2022 12:41

Merge branch 'main' into fix_python_typing

c15fb14

Fix unused import (avoid quotes in type annotations)

ea433ef

Signed-off-by: Vladimir Bataev <[email protected]>

artbataev force-pushed the fix_python_typing branch from 5ea0663 to ea433ef, November 9, 2022 13:10

artbataev added 2 commits November 9, 2022 17:45

Revert "Fix unused import (avoid quotes in type annotations)"

727a481

This reverts commit ea433ef. Signed-off-by: Vladimir Bataev <[email protected]>

Remove problematic import

9e2193c

Signed-off-by: Vladimir Bataev <[email protected]>

artbataev force-pushed the fix_python_typing branch from 5ca9459 to 9e2193c, November 9, 2022 13:45

Fix list_available_models method type

da3dc14

Signed-off-by: Vladimir Bataev <[email protected]>

titu1994 requested changes Nov 9, 2022

Revert some changes

b0c8d7b

Signed-off-by: Vladimir Bataev <[email protected]>

Revert quotes in list_available_models

2c12623

Signed-off-by: Vladimir Bataev <[email protected]>

artbataev requested a review from titu1994 November 9, 2022 22:10

titu1994 approved these changes Nov 10, 2022

SeanNaren approved these changes Nov 10, 2022

Merge branch 'main' into fix_python_typing

b712ab8

titu1994 merged commit d4ce66d into NVIDIA:main Nov 10, 2022
