Print best and worst results in a WER report. #2724
No Taskcluster jobs started for this pull request. The `allowPullRequests` configuration for this repository (in `.taskcluster.yml` on the default branch) does not allow starting tasks for this pull request. |
@DanBmh Can I ask why you think it's useful to print the best ones? |
I didn't know for some time that the examples are the worst predictions from the test set. I thought they were chosen randomly and I was wondering why they were always so bad, so I ignored the examples after a while. After I found the flag description (which, by the way, says they are the best results, as lower is better), I understood that they are the worst results. This makes more sense than just printing random ones, too. So I think printing both the best and the worst will give new users an intuitive way to see that we output the worst results. Maybe the best idea is to print median results too, so that you get a more realistic estimate of the prediction quality? Like this: |
I think that when you're modeling and trying a lot of configurations, you want to know the flaws of your model, but also the best it can do and the cases in which it excels. I also think this should be an additional flag to DeepSpeech.py, not something enabled by default. |
It makes sense, but I have to admit this is not something that ever crossed our mind. |
@DanBmh You might need to factorize / apply that to |
evaluate_tflite.py does not print any samples. Shall I add it? |
Thanks for the PR! |
I'd say no then. |
LGTM, thanks!
I've triggered some PR to have TaskCluster running on that. |
Thanks @DanBmh ! |
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
This also corrects the error that the results with the highest WER, instead of the lowest WER, are printed.
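
For readers following the thread, the selection discussed above (best, median, and worst samples ranked by WER) could be sketched roughly as below. This is only an illustration, not the code from this PR: the function name `select_report_samples`, the `count` parameter, and the sample dictionaries with a `"wer"` key are all hypothetical and do not mirror DeepSpeech's actual data structures or flags.

```python
from typing import Dict, List


def select_report_samples(samples: List[dict], count: int) -> Dict[str, List[dict]]:
    """Pick the best, median, and worst samples by WER (lower is better).

    `samples` is assumed to be a list of dicts that each carry a "wer" key;
    this layout is hypothetical and chosen only for the illustration.
    """
    # Sort ascending so the lowest-WER (best) samples come first.
    ordered = sorted(samples, key=lambda s: s["wer"])
    count = max(0, min(count, len(ordered)))
    # Start of a window of `count` samples centred on the middle of the ranking.
    mid_start = (len(ordered) - count) // 2
    return {
        "best": ordered[:count],
        "median": ordered[mid_start:mid_start + count],
        "worst": ordered[len(ordered) - count:],
    }


if __name__ == "__main__":
    # Made-up example values, purely to show the call.
    demo = [{"wer": w, "id": i} for i, w in enumerate([0.9, 0.0, 0.5, 1.0, 0.1])]
    for group, picked in select_report_samples(demo, 1).items():
        print(group, picked)
```

Labelling each group in the printed report, as the demo loop does, addresses the confusion described earlier in the thread about whether the printed examples are random, the best, or the worst.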