
Print best and worst results in a WER report. #2724

Merged (12 commits into mozilla:master) on Feb 7, 2020

Conversation

@DanBmh (Contributor) commented Feb 5, 2020

This also corrects the error that the results with the highest WER were printed instead of those with the lowest WER.
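
For readers skimming the thread, here is a minimal sketch of the bug class being fixed; the `Sample` structure and the example values are assumptions for illustration, not the actual DeepSpeech code:

```python
from collections import namedtuple

# Hypothetical sample record; the real report works on test-set transcriptions.
Sample = namedtuple("Sample", ["wer", "src", "res"])

samples = [
    Sample(0.00, "hello world", "hello world"),
    Sample(0.50, "good morning", "good mourning sir"),
    Sample(1.00, "see you soon", "sea shoe moon"),
]

# Lower WER is better, so an ascending sort puts the best results first.
samples.sort(key=lambda s: s.wer)

best = samples[:2]    # lowest WER: best transcriptions
worst = samples[-2:]  # highest WER: worst transcriptions

# The error: slicing from the wrong end (or mislabeling the slice)
# prints the highest-WER results under a "best" heading.
```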

@community-tc-integration

No Taskcluster jobs started for this pull request. The `allowPullRequests` configuration for this repository (in `.taskcluster.yml` on the default branch) does not allow starting tasks for this pull request.

@lissyx (Collaborator) commented Feb 5, 2020

@DanBmh Can I ask why you think it's useful to print the best ones?

@DanBmh (Contributor, Author) commented Feb 5, 2020

For some time I didn't know that the examples are the worst predictions from the test set. I thought they were chosen randomly and wondered why they were always so bad, so after a while I started ignoring them. Once I found the flag description (which, by the way, says they are the best results, since lower is better) I understood they are the worst results. That makes more sense than printing random ones.

So I think printing both the best and the worst will give new users an intuitive way to see that we output the worst results. Another benefit is that even with a badly performing network you can still see that it learnt something :)

Maybe the best idea is to also print median results, so that you get a more realistic estimate of the prediction quality? Like this (see the sketch after the list):
1
2
[...]
5
6
[...]
9
10
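
A possible shape for that best/median/worst selection, sketched with a hypothetical helper name; the merged code may differ, and the `samples` objects are only assumed to carry a `.wer` field:

```python
def pick_report_samples(samples, count):
    """Return (best, median, worst) slices of samples, sorted by WER.

    Hypothetical sketch: on short lists the three slices may overlap.
    """
    samples = sorted(samples, key=lambda s: s.wer)  # best (lowest WER) first
    mid = len(samples) // 2
    half = count // 2
    best = samples[:count]
    median = samples[max(0, mid - half):mid + half + count % 2]
    worst = samples[-count:]
    return best, median, worst
```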

@victornoriega
I think that when you're modeling and trying a lot of configurations, you want to know the flaws of your model, but also the best it can show and the cases in which it can excel. I also think this should be an additional flag to DeepSpeech.py, not something enabled by default.
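
Sketching that suggestion: DeepSpeech's util/flags.py defines its options through absl, so an opt-in report size could look roughly like this (the flag name, default, and help text here are assumptions, not necessarily what was merged):

```python
from absl import flags

FLAGS = flags.FLAGS

# Hypothetical flag in the style of util/flags.py.
flags.DEFINE_integer(
    'report_count', 10,
    'number of samples to print in the WER report (worst, median and best)')
```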

@lissyx (Collaborator) commented Feb 6, 2020

> Maybe the best idea is to also print median results, so that you get a more realistic estimate of the prediction quality?

It makes sense, but I have to admit this is not something that ever crossed our minds.

@lissyx (Collaborator) commented Feb 6, 2020

@DanBmh You might need to factorize / apply that to evaluate.py as well as evaluate_tflite.py.
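
One way such a factoring could look: a shared printer in util/evaluate_tools.py that both scripts import. The function name and the sample fields below are assumptions for illustration:

```python
# Hypothetical shared helper, e.g. in util/evaluate_tools.py.
def print_report(samples, count, title):
    print('-' * 80)
    print(title)
    for sample in samples[:count]:
        print('WER: {:.2f}, CER: {:.2f}'.format(sample.wer, sample.cer))
        print(' - src: "{}"'.format(sample.src))
        print(' - res: "{}"'.format(sample.res))

# Both evaluate.py and evaluate_tflite.py could then call, for example:
# print_report(sorted_samples[:count], count, 'Best samples:')
# print_report(sorted_samples[-count:], count, 'Worst samples:')
```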

@DanBmh (Contributor, Author) commented Feb 6, 2020

evaluate_tflite.py does not print any samples. Shall I add it?

@reuben (Contributor) commented Feb 6, 2020

Thanks for the PR!

@lissyx (Collaborator) commented Feb 6, 2020

> evaluate_tflite.py does not print any samples. Shall I add it?

I'd say no then.

@lissyx (Collaborator) left a review:

LGTM, thanks!

@lissyx (Collaborator) commented Feb 6, 2020

I've triggered some tasks to have TaskCluster running on that.

@lissyx requested a review from @reuben on Feb 6, 2020, 14:07
@lissyx merged commit 33efd9b into mozilla:master on Feb 7, 2020
@lissyx (Collaborator) commented Feb 7, 2020

Thanks @DanBmh!

@lock (bot) commented Mar 10, 2020

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

The lock bot locked and limited conversation to collaborators on Mar 10, 2020.