
Test issue #1 #1

Closed
moconnor725 opened this issue Aug 14, 2019 · 0 comments

No description provided.

okuchaiev pushed a commit that referenced this issue Jun 3, 2020
yzhang123 pushed a commit that referenced this issue Apr 19, 2022
Fixing style of binh234 PR for Vietnamese ITN
borisfom added a commit to borisfom/NeMo that referenced this issue Mar 6, 2023
Signed-off-by: Boris Fomitchev <[email protected]>
VahidooX added a commit that referenced this issue Mar 14, 2023
* cache-aware streaming export

Test onnx streaming conformer ctc WER

Constant att cache width with len param

Remove some extra functions in cache_aware runner

transpose cache so that batch is first for trt

Signed-off-by: Greg Clark <[email protected]>
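
The "transpose cache so that batch is first" step can be sketched in a few lines. This is a hypothetical illustration using plain Python lists in place of torch tensors (TensorRT engines typically expect the batch dimension first for dynamic batching):

```python
def transpose_to_batch_first(cache):
    """Swap the first two axes of a nested-list "tensor".

    Hypothetical sketch: a streaming attention cache laid out as
    (cache_len, batch, dim) is rearranged to (batch, cache_len, dim),
    the layout a TensorRT-friendly export would expect.
    """
    cache_len, batch = len(cache), len(cache[0])
    return [[cache[t][b] for t in range(cache_len)] for b in range(batch)]

# cache_len=2, batch=3, dim=1
cache = [[[0], [1], [2]],
         [[3], [4], [5]]]
batch_first = transpose_to_batch_first(cache)
# batch_first[b][t] == cache[t][b]
```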

* fix export for full-context conformer

* WIP trying to improve onnx perf

Signed-off-by: Greg Clark <[email protected]>

* Adding test scripts

Signed-off-by: Greg Clark <[email protected]>

* More perf testing script

Signed-off-by: Greg Clark <[email protected]>

* Updates for jit torch_tensorrt tracing

Signed-off-by: Greg Clark <[email protected]>

* Fixed trace warnings

Signed-off-by: Boris Fomitchev <[email protected]>

* Rearranging tests

Signed-off-by: Boris Fomitchev <[email protected]>

* Fixing non-caching case

Signed-off-by: Boris Fomitchev <[email protected]>

* testing

Signed-off-by: Boris Fomitchev <[email protected]>

* Fixed channel cache length issue

Signed-off-by: Boris Fomitchev <[email protected]>

* stash

Signed-off-by: Boris Fomitchev <[email protected]>

* Reverting non-essential changes

Signed-off-by: Boris Fomitchev <[email protected]>

* Offset=None case

Signed-off-by: Boris Fomitchev <[email protected]>

* Remove test scripts

Signed-off-by: Greg Clark <[email protected]>

* Clean up speech_to_text_cache_aware_streaming_infer

Signed-off-by: Greg Clark <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert pad -> constant_pad_nd

Signed-off-by: Greg Clark <[email protected]>
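
The "pad -> constant_pad_nd" line refers to swapping `torch.nn.functional.pad` for `torch.constant_pad_nd` during export; both compute the same constant padding. A minimal pure-Python sketch of the 1-D case (shapes and names are illustrative only):

```python
def constant_pad_1d(xs, left, right, value=0):
    """Pad a 1-D sequence with a constant on both ends.

    Sketch of what F.pad(x, (left, right), value=v) and
    torch.constant_pad_nd compute in the 1-D case; the two calls were
    swapped back and forth here for ONNX-export friendliness.
    """
    return [value] * left + list(xs) + [value] * right

padded = constant_pad_1d([1, 2, 3], left=2, right=1)
# padded == [0, 0, 1, 2, 3, 0]
```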

* conformer-encoder set window_size from streaming_cfg

Signed-off-by: Greg Clark <[email protected]>

* Fixes for working export(), using more constants

Signed-off-by: Boris Fomitchev <[email protected]>

* Optional rand init for cache

Signed-off-by: Greg Clark <[email protected]>
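
A hedged sketch of what an "optional rand init for cache" flag might look like (names and shapes are hypothetical; plain Python stands in for torch). Random cache contents can exercise export paths that an all-zero cache would let the exporter constant-fold away:

```python
import random

def init_cache(cache_len, dim, rand_init=False, seed=0):
    """Build an initial streaming cache, zeros by default.

    Hypothetical sketch of an optional random-init flag: random values
    can surface export/runtime bugs that all-zero caches mask.
    """
    if rand_init:
        rng = random.Random(seed)
        return [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                for _ in range(cache_len)]
    return [[0.0] * dim for _ in range(cache_len)]

zeros = init_cache(4, 2)
randy = init_cache(4, 2, rand_init=True)
```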

* Folding update_cache with constants

Signed-off-by: Boris Fomitchev <[email protected]>

* More folding

Signed-off-by: Boris Fomitchev <[email protected]>

* Reducing diff #1

Signed-off-by: Boris Fomitchev <[email protected]>

* Reducing diff #2

Signed-off-by: Boris Fomitchev <[email protected]>

* Reducing diff #3

Signed-off-by: Boris Fomitchev <[email protected]>

* Fixed unit tests, more reverts

Signed-off-by: Boris Fomitchev <[email protected]>

* Export fixes

Signed-off-by: Boris Fomitchev <[email protected]>

* Reverted slice changes that ruined ONNX perf

Signed-off-by: Boris Fomitchev <[email protected]>

* Adding back keep_all_outputs and drop_extra_preencoded

Signed-off-by: Greg Clark <[email protected]>

* Fix export

Signed-off-by: Greg Clark <[email protected]>

---------

Signed-off-by: Greg Clark <[email protected]>
Signed-off-by: Boris Fomitchev <[email protected]>
Co-authored-by: Boris Fomitchev <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Vahid Noroozi <[email protected]>
hsiehjackson pushed a commit that referenced this issue Jun 2, 2023

Signed-off-by: hsiehjackson <[email protected]>
zhehuaichen added a commit that referenced this issue Oct 9, 2023

* add initial impl of ModularizedSpeechGPTModel and integration test

* fix typo in the test name (#1)

approve the nit change

* clean an initial version of the example config; make sure it works by test (#2)

approve as no need to review

* add the test for training_step and fix the code correspondingly (test passed now) (#3)

* add test for validation_step (#4)

* move audio and text embedding concat into prepare_llm_input so a test can guard the LLM input
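
The prepare_llm_input step concatenates audio-encoder embeddings with text token embeddings before the LLM. A minimal hypothetical sketch, with list concatenation standing in for torch.cat along the sequence axis:

```python
def prepare_llm_input(audio_emb, text_emb):
    """Concatenate audio and text embeddings along the sequence axis.

    Hypothetical sketch of the frozen-AM + LLM setup: the audio
    encoder's (projected) embeddings are prepended to the text token
    embeddings, and the combined sequence is fed to the LLM.
    """
    return audio_emb + text_emb  # list concat stands in for torch.cat

audio = [[0.1, 0.2], [0.3, 0.4]]   # 2 audio frames, dim 2
text = [[0.5, 0.6]]                # 1 text token, dim 2
llm_input = prepare_llm_input(audio, text)
```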

* Merge heh and zhehuai's initial version of frozen am+llm (#5)

* Merge heh and zhehuai's initial version of frozen am+llm

The previous differences are summarized here:
https://docs.google.com/document/d/1zNI4hC6vJtUfcHbrUSPaMuYWRBQdN_36H0P2NiBiuPY/edit

This PR includes:
1. Finished merging the model, dataset, and config code.
2. Previous tests (prepare_llm_input, training_step, validation_step) are still enabled and pass.
3. The example training script with LS960 has been run to make sure the training pipeline works.

The major remaining work is listed here:
https://docs.google.com/document/d/1o0AM7v4gcTQkPZjE0Vl9TTX4vYnGTrbXEFGWh0UhGlk/edit#bookmark=id.pzvdadt5oxyw

---------

Co-authored-by: He Huang (Steve) <[email protected]>

* fix a nit init bug that broke a test (#6)

Signed-off-by: zhehuaichen <[email protected]>

* Clean up implementation for SALM paper and sync to NEMO v1.20.0 (#18)

* wip

Signed-off-by: zhehuaichen <[email protected]>

* fix data

Signed-off-by: zhehuaichen <[email protected]>

* fix consumed_samples

Signed-off-by: zhehuaichen <[email protected]>

* fix the training restart problem by storing adapter+perception model and
init them from the ckpt

Signed-off-by: zhehuaichen <[email protected]>
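
The restart fix stores only the trainable adapter and perception weights and re-initializes them from the checkpoint. A hedged sketch of the prefix-filtering idea (key names are hypothetical):

```python
def extract_submodule_state(state_dict, prefixes=("adapter.", "perception.")):
    """Keep only the trainable-submodule entries of a checkpoint.

    Hypothetical sketch: save just the adapter and perception weights,
    then load them into a freshly built model whose frozen LLM/AM
    weights come from their original checkpoints.
    """
    return {k: v for k, v in state_dict.items() if k.startswith(prefixes)}

ckpt = {
    "llm.layers.0.weight": [0.1],
    "adapter.proj.weight": [0.2],
    "perception.encoder.weight": [0.3],
}
restored = extract_submodule_state(ckpt)
```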

* refix state dict

Signed-off-by: zhehuaichen <[email protected]>

* support wer and inf

Signed-off-by: zhehuaichen <[email protected]>

* nan guard

Signed-off-by: zhehuaichen <[email protected]>
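
A "nan guard" typically drops or zeroes a non-finite loss so one bad batch cannot derail training. A minimal sketch; the actual NeMo logic may differ:

```python
import math

def guarded_loss(loss, fallback=0.0):
    """Skip a step when the loss is NaN or infinite.

    Hypothetical sketch of a nan guard: replace a non-finite loss with
    a zero-gradient fallback and report whether the step was skipped.
    """
    if not math.isfinite(loss):
        return fallback, True   # (loss_to_use, was_skipped)
    return loss, False

ok, skipped = guarded_loss(1.25)
bad, bad_skipped = guarded_loss(float("nan"))
```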

* reimpl inf and bug fix

Signed-off-by: zhehuaichen <[email protected]>

* multi loader

Signed-off-by: zhehuaichen <[email protected]>

* unfreeze lm

Signed-off-by: zhehuaichen <[email protected]>

* flag for load am

Signed-off-by: zhehuaichen <[email protected]>

* tokenizer

Signed-off-by: zhehuaichen <[email protected]>

* overwrite vocab size

Signed-off-by: zhehuaichen <[email protected]>

* support bpe dropout

Signed-off-by: zhehuaichen <[email protected]>

* add tarred datasets

Signed-off-by: stevehuang52 <[email protected]>

* fix sample_alpha

Signed-off-by: stevehuang52 <[email protected]>

* fix bpe dropout bugs in the mismatched context in tokenization

Signed-off-by: zhehuaichen <[email protected]>

* add bleu metric

Signed-off-by: stevehuang52 <[email protected]>

* update metrics

Signed-off-by: stevehuang52 <[email protected]>

* support inference and fix a bug in wer calculation

Signed-off-by: zhehuaichen <[email protected]>

* fix bucketing dataset

Signed-off-by: stevehuang52 <[email protected]>

* fix bleu implementation

Signed-off-by: zhehuaichen <[email protected]>

* support question set file per dataset/data loader in preparation for
multitask understanding; also fix bleu implementation

Signed-off-by: zhehuaichen <[email protected]>

* support simple random context for word boosting

Signed-off-by: zhehuaichen <[email protected]>
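
"Simple random context for word boosting" presumably mixes words that actually occur in the transcript with distractors, so the model learns to copy a context word only when it is spoken. A hypothetical sketch; the function name, ratio knob, and sampling scheme are assumptions:

```python
import random

def sample_boost_context(target_words, vocab, n_ctx=4, positive_ratio=0.5, seed=0):
    """Draw a random boosting context for one utterance.

    Hypothetical sketch: positives come from the transcript, negatives
    from the rest of the vocabulary; the positive ratio trades recall
    against precision.
    """
    rng = random.Random(seed)
    n_pos = round(n_ctx * positive_ratio)
    pos = rng.sample(target_words, min(n_pos, len(target_words)))
    distractors = [w for w in vocab if w not in target_words]
    neg = rng.sample(distractors, min(n_ctx - len(pos), len(distractors)))
    ctx = pos + neg
    rng.shuffle(ctx)
    return ctx

ctx = sample_boost_context(["nvidia", "nemo"],
                           ["nvidia", "nemo", "apple", "pear", "kiwi"])
```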

* use sacrebleu.corpus_bleu to be consistent with the rest

Signed-off-by: zhehuaichen <[email protected]>
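
sacrebleu's corpus_bleu pools n-gram statistics over the whole corpus before computing precision, which generally differs from averaging per-sentence scores. This simplified unigram-precision sketch (not the full BLEU formula) shows the gap:

```python
from collections import Counter

def unigram_stats(hyp, ref):
    """Clipped unigram matches and hypothesis length for one pair."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    matches = sum(min(c, r[w]) for w, c in h.items())
    return matches, sum(h.values())

pairs = [("a b", "a b"),          # 2 matches / 2 words
         ("x y z w", "a b c d")]  # 0 matches / 4 words
stats = [unigram_stats(h, r) for h, r in pairs]

# corpus-level: pool counts first, then divide (sacrebleu-style)
corpus_prec = sum(m for m, _ in stats) / sum(n for _, n in stats)  # 2/6
# sentence-level: divide first, then average
avg_prec = sum(m / n for m, n in stats) / len(stats)               # 0.5
```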

* make audio_file optional in the data loader

Signed-off-by: zhehuaichen <[email protected]>

* add a tool to materialize mt and text data

Signed-off-by: zhehuaichen <[email protected]>

* compatible with tar dataset

Signed-off-by: zhehuaichen <[email protected]>

* temp fix for metric and speed up materialization

Signed-off-by: zhehuaichen <[email protected]>

* make num of context configurable

Signed-off-by: zhehuaichen <[email protected]>

* val_check_interval fix; make manifest dumping consistent with speech models

Signed-off-by: zhehuaichen <[email protected]>

* random_context_positive_ratio configurable to control precision

Signed-off-by: zhehuaichen <[email protected]>

* bug fix: freeze_llm flag is not passed to the model cfg

Signed-off-by: zhehuaichen <[email protected]>

* overwrite tensor_model_parallel_size

Signed-off-by: zhehuaichen <[email protected]>

* support both stt and ssl models for loading audio encoder

Signed-off-by: zhehuaichen <[email protected]>

* fix the inference config so as to use sampling; allow inference config update in training

Signed-off-by: zhehuaichen <[email protected]>

* refactor and clean up code for the preprocessing collections, dataset interface, and model inference; rename some classes to be consistent with the SALM paper.
Also make sure tests pass

Signed-off-by: zhehuaichen <[email protected]>

* Undo changes in megatron_gpt_peft_models.py and move them to speechllm_models.py; make sure the correctness by test_speechllm_models.py::TestModularizedAudioGPTModel::test_predict_step

Signed-off-by: zhehuaichen <[email protected]>

* update default inference config and test golden value accordingly

Signed-off-by: zhehuaichen <[email protected]>

* integration test and minor fix

Signed-off-by: zhehuaichen <[email protected]>

* nit bug fix on manifest_filepath introduced by code cleanup

Signed-off-by: zhehuaichen <[email protected]>

* update workspace/ files; consider moving to examples later

Signed-off-by: zhehuaichen <[email protected]>

* further remove unnecessary stuff in the inference implementation

Signed-off-by: zhehuaichen <[email protected]>

* revert the update in default end_string to be compatible with legacy models

Signed-off-by: zhehuaichen <[email protected]>

---------

Signed-off-by: zhehuaichen <[email protected]>
Signed-off-by: stevehuang52 <[email protected]>
Co-authored-by: stevehuang52 <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>

* rename 'ModularizedAudioGPTModel' to 'ModularAudioGPTLoRAModel'; move speechllm stuff under nemo/collections/multimodal/speechllm

Signed-off-by: zhehuaichen <[email protected]>

* update copyright; remove workspace/scripts and workspace/tools folders since the main branch has LLaMA support

Signed-off-by: zhehuaichen <[email protected]>

---------

Signed-off-by: zhehuaichen <[email protected]>
Signed-off-by: stevehuang52 <[email protected]>
Co-authored-by: Zhehuai Chen <[email protected]>
Co-authored-by: He Huang (Steve) <[email protected]>
Co-authored-by: stevehuang52 <[email protected]>
zhehuaichen added a commit that referenced this issue Oct 13, 2023
pzelasko referenced this issue in pzelasko/NeMo Nov 29, 2023
approve the nit change
pyf98 pushed a commit to pyf98/NeMo that referenced this issue May 21, 2024
dchichkov pushed a commit to dchichkov/NeMo that referenced this issue Jun 3, 2024
Adds wrapper for siglip to allow tiling
dcurran90 pushed a commit to dcurran90/NeMo that referenced this issue Oct 15, 2024
Minor update to lab.py to improve help information (NVIDIA#1)

Signed-off-by: lixuemin2016 <[email protected]>
Signed-off-by: Kai Xu <[email protected]>
Co-authored-by: Kai Xu <[email protected]>
dcurran90 pushed a commit to dcurran90/NeMo that referenced this issue Oct 15, 2024