
Update script for ngram rnnt and hat beam search decoding #6370

Merged 28 commits on Apr 21, 2023. Changes shown below are from 20 of the 28 commits.

Commits:
- 5af82cf: add rnnt ngram beamsearch script (andrusenkoau, Mar 29, 2023)
- 6a3fe03: Merge branch 'main' of https://github.com/andrusenkoau/NeMo into merg… (andrusenkoau, Apr 4, 2023)
- b552e46: add return encoding embedding option (andrusenkoau, Apr 4, 2023)
- aab62bc: update script (andrusenkoau, Apr 4, 2023)
- 36992dc: add rnnt and hat ngram decoding script (andrusenkoau, Apr 5, 2023)
- a791367: add some parameters (andrusenkoau, Apr 5, 2023)
- 430d1b0: Merge branch 'main' of https://github.com/andrusenkoau/NeMo into ngra… (andrusenkoau, Apr 5, 2023)
- 238fb25: [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Apr 5, 2023)
- ac8c221: Merge branch 'main' of https://github.com/andrusenkoau/NeMo into ngra… (andrusenkoau, Apr 7, 2023)
- 992ffcb: add return_encoder_embeddings parameter to RNNTDecodingConfig (andrusenkoau, Apr 7, 2023)
- 8bd09b0: replace return_encoder_embeddings parameter (andrusenkoau, Apr 7, 2023)
- 95fc4df: generalization of scipt behavior (andrusenkoau, Apr 7, 2023)
- 8d5e926: resolve conflicts (andrusenkoau, Apr 7, 2023)
- 2db619a: [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Apr 7, 2023)
- 24d34a1: Merge branch 'main' into ngram_rnnt (andrusenkoau, Apr 11, 2023)
- 2e04f17: remove return_encoder_embeddings parameter (andrusenkoau, Apr 13, 2023)
- 7afa9ef: remove return_encoder_embeddings parameter (andrusenkoau, Apr 13, 2023)
- c2f0f6c: add manual encoder_embeddings calculation (andrusenkoau, Apr 13, 2023)
- 49e64d6: Merge branch 'main' into ngram_rnnt (andrusenkoau, Apr 13, 2023)
- d07f41f: [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Apr 13, 2023)
- 780769e: Merge branch 'main' into ngram_rnnt (andrusenkoau, Apr 14, 2023)
- 80f4f61: Merge branch 'main' into ngram_rnnt (andrusenkoau, Apr 17, 2023)
- 9267044: Merge branch 'main' into ngram_rnnt (titu1994, Apr 18, 2023)
- fcf2421: Merge branch 'main' into ngram_rnnt (andrusenkoau, Apr 19, 2023)
- 64d17e4: fix beam_width value to 8 (andrusenkoau, Apr 19, 2023)
- df9b90e: fix rescoring description (andrusenkoau, Apr 19, 2023)
- 503160e: Merge branch 'main' into ngram_rnnt (andrusenkoau, Apr 20, 2023)
- a621d2b: Merge branch 'main' into ngram_rnnt (titu1994, Apr 20, 2023)
26 changes: 25 additions & 1 deletion docs/source/asr/asr_language_modeling.rst
@@ -7,7 +7,8 @@ Language models have shown to help the accuracy of ASR models. NeMo supports the
* :ref:`ngram_modeling`
* :ref:`neural_rescoring`

It is possible to use both approaches on the same ASR model.
It is possible to use both approaches on the same CTC ASR model.
RNNT and Hybrid Autoregressive Transducer (HAT) models support only N-gram language modeling.
Collaborator:

Why can't the user use rescoring with RNNT models?
The output of the beam search is a list of candidates, which is independent of the type of the decoder. Then we pass this file to the rescorer, so shouldn't the neural rescorer support all ASR models?

Collaborator (Author):

Ohh, you are absolutely right. I forgot about rescoring of n-best results with an NNLM. I meant that RNNT/HAT models do not support NNLM shallow fusion.

Collaborator (Author):

I restored the old description.
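The distinction drawn in this thread (n-best rescoring is decoder-agnostic, while NNLM shallow fusion is not supported for RNNT/HAT) can be made concrete with a toy example. The sketch below re-ranks beam-search candidates with an external LM score; the hypotheses, scores, and helper name are illustrative and are not NeMo's API:

```python
# Toy n-best rescoring: re-rank beam-search candidates with an external LM.
# Candidates and scores are made up for illustration; this is not NeMo's API.

def rescore_nbest(nbest, lm_score_fn, lm_weight):
    """Sort (text, asr_score) pairs by combined ASR + weighted LM log-score."""
    rescored = [
        (text, asr_score + lm_weight * lm_score_fn(text))
        for text, asr_score in nbest
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)

# Pretend beam-search output: the decoder type does not matter here,
# only the (text, log-score) pairs do.
nbest = [("the cat sat", -4.0), ("the cat sad", -3.8)]

# Pretend external LM log-scores: the fluent candidate is preferred.
lm_scores = {"the cat sat": -1.0, "the cat sad": -5.0}

best_text, best_score = rescore_nbest(nbest, lm_scores.get, lm_weight=0.5)[0]
print(best_text)  # "the cat sat": -4.0 + 0.5 * -1.0 = -4.5 beats -6.3
```

Because the rescorer only ever sees (text, score) pairs, the same re-ranking applies to n-best lists from CTC, RNNT, or HAT decoders, which is the reviewer's point.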



.. _ngram_modeling:
@@ -281,6 +282,29 @@ For instance, the following set of parameters would result in 2*1*2=4 beam search
beam_beta=[1.0,0.5]


Beam search ngram decoding for Transducer models (RNNT and HAT)
===============================================================

A similar script for evaluating an RNNT/HAT model with beam search decoding and N-gram language models can be found at
`scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py <https://github.com/NVIDIA/NeMo/blob/stable/scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_transducer.py>`_

.. code-block:: bash

    python eval_beamsearch_ngram_transducer.py nemo_model_file=<path to the .nemo file of the model> \
        input_manifest=<path to the evaluation JSON manifest file> \
        kenlm_model_file=<path to the binary KenLM model> \
        beam_width=[<list of the beam widths, separated with commas>] \
        beam_alpha=[<list of the beam alphas, separated with commas>] \
        preds_output_folder=<optional folder to store the predictions> \
        probs_cache_file=null \
        decoding_strategy=<greedy_batch or maes decoding> \
        maes_prefix_alpha=[<list of the maes prefix alphas, separated with commas>] \
        maes_expansion_gamma=[<list of the maes expansion gammas, separated with commas>] \
        hat_subtract_ilm=<in case of a HAT model: subtract the internal LM or not (True/False)> \
        hat_ilm_weight=[<in case of a HAT model: list of the HAT internal LM weights, separated with commas>]

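For intuition about how ``beam_alpha``, ``hat_subtract_ilm``, and ``hat_ilm_weight`` enter the per-token scoring, here is a minimal sketch. It assumes the usual shallow-fusion form plus HAT-style internal-LM subtraction; the function name and the numeric log-probabilities are illustrative, not the script's actual implementation:

```python
# Toy per-token score fusion for transducer beam search with an n-gram LM.
# With subtract_ilm=True the HAT internal-LM estimate is removed before the
# external LM is added. All values here are made-up log-probabilities.

def fused_score(am_logp, ext_lm_logp, beam_alpha,
                ilm_logp=0.0, ilm_weight=0.0, subtract_ilm=False):
    """Combine acoustic, external-LM, and (optional) internal-LM log-scores."""
    score = am_logp + beam_alpha * ext_lm_logp
    if subtract_ilm:
        score -= ilm_weight * ilm_logp
    return score

# Plain shallow fusion (RNNT-style): acoustic score plus weighted LM score.
print(fused_score(-2.0, -1.5, beam_alpha=0.5))  # -2.75

# HAT-style fusion: additionally subtract the weighted internal-LM estimate,
# i.e. -2.75 - 0.2 * (-1.0) = -2.55.
print(fused_score(-2.0, -1.5, beam_alpha=0.5,
                  ilm_logp=-1.0, ilm_weight=0.2, subtract_ilm=True))
```

Setting ``ilm_weight=0`` (or ``subtract_ilm=False``) recovers plain shallow fusion, which is why the ``hat_*`` options only matter for HAT checkpoints.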


.. _neural_rescoring:

1 change: 1 addition & 0 deletions nemo/collections/asr/models/rnnt_models.py
@@ -242,6 +242,7 @@ def transcribe(
"""
if paths2audio_files is None or len(paths2audio_files) == 0:
return {}

# We will store transcriptions here
hypotheses = []
all_hypotheses = []