-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ngram #6063
Ngram #6063
Conversation
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments
Signed-off-by: Nikolay Karpov <[email protected]>
for more information, see https://pre-commit.ci
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's looking great ! I will ask @fayejf for another review to see if everything is ok from her side
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great work. small comments. And could we specify the value how many best hypothesis we would like? say top 3?
@@ -144,6 +167,8 @@ def main(cfg: EvaluationConfig): | |||
else: | |||
logging.info(f'Got {metric_name} of {metric_value}') | |||
|
|||
logging.info(f'Dataset WER/CER ' + str(round(100 * wer, 2)) + "%/" + str(round(100 * cer, 2)) + "%") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to have both wer and cer here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is better to see both metrics in one shot
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then how about merge line 170 with above if else condition and reduce redundant logging
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Yes, we can set ctc_decoding.beam.beam_size=3, it will affect returning number of hypothesis. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! LGTM
* do_lowercase, rm_punctuation Signed-off-by: Nikolay Karpov <[email protected]> * support beam_strategy = beam Signed-off-by: Nikolay Karpov <[email protected]> * black Signed-off-by: Nikolay Karpov <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix config and^Cunctuation capitalization Signed-off-by: Nikolay Karpov <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm math Signed-off-by: Nikolay Karpov <[email protected]> * else TypeError Signed-off-by: Nikolay Karpov <[email protected]> --------- Signed-off-by: Nikolay Karpov <[email protected]> Signed-off-by: Nikolay Karpov <[email protected]> Co-authored-by: Nikolay Karpov <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: fayejf <[email protected]>
@@ -289,9 +290,26 @@ def write_transcription( | |||
else: | |||
pred_text_attr_name = 'pred_text' | |||
|
|||
if isinstance(transcriptions[0], rnnt_utils.Hypothesis): # List[rnnt_utils.Hypothesis] | |||
best_hyps = transcriptions | |||
assert cfg.ctc_decoding.beam.return_best_hypothesis, "Works only with return_best_hypothesis=true" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just realized this should be rnnt instead ctc here. Can we fix it, please? @karpnv
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually write_transcription
should handle both ctc and rnnt decoding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part works for both ctc_decoding and rnnt_decoding. Just remove assert will fix it
* do_lowercase, rm_punctuation Signed-off-by: Nikolay Karpov <[email protected]> * support beam_strategy = beam Signed-off-by: Nikolay Karpov <[email protected]> * black Signed-off-by: Nikolay Karpov <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix config and^Cunctuation capitalization Signed-off-by: Nikolay Karpov <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm math Signed-off-by: Nikolay Karpov <[email protected]> * else TypeError Signed-off-by: Nikolay Karpov <[email protected]> --------- Signed-off-by: Nikolay Karpov <[email protected]> Signed-off-by: Nikolay Karpov <[email protected]> Co-authored-by: Nikolay Karpov <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: fayejf <[email protected]> Signed-off-by: hsiehjackson <[email protected]>
What does this PR do ?
Add a one line overview of what this PR aims to accomplish.
Collection: ASR
Changelog
Flashlight decoder is not supported yet
Usage
PR Type: