Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ngram #6063

Merged
merged 11 commits into from
Mar 1, 2023
Merged

Ngram #6063

merged 11 commits into from
Mar 1, 2023

Conversation

karpnv
Copy link
Collaborator

@karpnv karpnv commented Feb 20, 2023

What does this PR do ?

Add a one line overview of what this PR aims to accomplish.

Collection: ASR

Changelog

  • Add preprocess() and separate_punctuation() functions for texts with punctuation and capitalization (PC)
  • Run examples/asr/transcribe_speech.py with beamsearch decoder (beam_strategy=beam)
  • Save N-best hypothesis into the "beams" field of manifest file
    Flashlight decoder is not supported yet

Usage

  • speech_to_text_eval.py
python3 ./examples/asr/speech_to_text_eval.py   \
model_path=/some/path/ctc_conformer_bpe.nemo   \
dataset_manifest=some/path/test_pc.json  \
output_filename=some/path/test_pc_ctc1.json   \
ctc_decoding.strategy=beam \
ctc_decoding.beam.kenlm_path=/some/path/a1.kenlm \
ctc_decoding.beam.return_best_hypothesis=false \
do_lowercase=true \
rm_punctuation=true
  • eval_beamsearch_ngram.py
python3 ./scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py  \
nemo_model_file=/some/path/ctc_conformer_bpe.nemo   \
input_manifest=some/path/test_pc.json  \
kenlm_model_file=/some/path/a1.kenlm \
decoding_mode=beamsearch_ngram \
decoding_strategy=beam \
do_lowercase=true \
rm_punctuation=true 

PR Type:

  • New Feature

Nikolay Karpov added 3 commits February 20, 2023 10:10
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
@github-actions github-actions bot added the ASR label Feb 20, 2023
Copy link
Collaborator

@titu1994 titu1994 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments

examples/asr/speech_to_text_eval.py Outdated Show resolved Hide resolved
examples/asr/speech_to_text_eval.py Outdated Show resolved Hide resolved
examples/asr/transcribe_speech.py Outdated Show resolved Hide resolved
nemo/collections/asr/parts/utils/transcribe_utils.py Outdated Show resolved Hide resolved
nemo/collections/asr/parts/utils/transcribe_utils.py Outdated Show resolved Hide resolved
karpnv and others added 4 commits February 21, 2023 02:33
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Copy link
Collaborator

@titu1994 titu1994 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's looking great ! I will ask @fayejf for another review to see if everything is ok from her side

titu1994
titu1994 previously approved these changes Feb 22, 2023
@titu1994 titu1994 requested a review from fayejf February 22, 2023 20:07
Copy link
Collaborator

@fayejf fayejf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great work. small comments. And could we specify the value how many best hypothesis we would like? say top 3?

@@ -144,6 +167,8 @@ def main(cfg: EvaluationConfig):
else:
logging.info(f'Got {metric_name} of {metric_value}')

logging.info(f'Dataset WER/CER ' + str(round(100 * wer, 2)) + "%/" + str(round(100 * cer, 2)) + "%")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to have both wer and cer here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better to see both metrics in one shot

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then how about merge line 170 with above if else condition and reduce redundant logging

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@karpnv
Copy link
Collaborator Author

karpnv commented Mar 1, 2023

great work. small comments. And could we specify the value how many best hypothesis we would like? say top 3?

Yes, we can set ctc_decoding.beam.beam_size=3, it will affect returning number of hypothesis.

Copy link
Collaborator

@fayejf fayejf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM

@titu1994 titu1994 merged commit fc30243 into NVIDIA:main Mar 1, 2023
titu1994 pushed a commit to titu1994/NeMo that referenced this pull request Mar 24, 2023
* do_lowercase, rm_punctuation

Signed-off-by: Nikolay Karpov <[email protected]>

* support beam_strategy = beam

Signed-off-by: Nikolay Karpov <[email protected]>

* black

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix config and^Cunctuation capitalization

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm math

Signed-off-by: Nikolay Karpov <[email protected]>

* else TypeError

Signed-off-by: Nikolay Karpov <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: fayejf <[email protected]>
@@ -289,9 +290,26 @@ def write_transcription(
else:
pred_text_attr_name = 'pred_text'

if isinstance(transcriptions[0], rnnt_utils.Hypothesis): # List[rnnt_utils.Hypothesis]
best_hyps = transcriptions
assert cfg.ctc_decoding.beam.return_best_hypothesis, "Works only with return_best_hypothesis=true"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realized this should be rnnt instead ctc here. Can we fix it, please? @karpnv

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I'm not sure about hybrid ctc rnnt model when we call write_transcription inside transcribe_speech.py for that model. Any thought? @VahidooX @titu1994

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually write_transcription should handle both ctc and rnnt decoding.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part works for both ctc_decoding and rnnt_decoding. Just remove assert will fix it

hsiehjackson pushed a commit to hsiehjackson/NeMo that referenced this pull request Jun 2, 2023
* do_lowercase, rm_punctuation

Signed-off-by: Nikolay Karpov <[email protected]>

* support beam_strategy = beam

Signed-off-by: Nikolay Karpov <[email protected]>

* black

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix config and^Cunctuation capitalization

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm math

Signed-off-by: Nikolay Karpov <[email protected]>

* else TypeError

Signed-off-by: Nikolay Karpov <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: fayejf <[email protected]>
Signed-off-by: hsiehjackson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants