add compute_sentence_probs_arpa.py #2538
Merged
danpovey merged 5 commits into kaldi-asr:master from DongjiGao:sentence_prob on Jul 17, 2018
…bability of input file given arpa language model
Contributor
looks good and I've tested it.
Contributor
Has it been tested in Python 2 and Python 3, and compared with SRILM output?
On Sat, Jul 7, 2018 at 3:20 PM, Xiaohui Zhang wrote:
looks good and I've tested it.
Contributor
Author
Yes. The only difference between Python 2 and Python 3 is the precision:
Contributor
ready for merging?
danpovey
reviewed
Jul 11, 2018
Contributor
This should be 'log probability'.
And wouldn't it be more convenient for it to default to outputting log base e (as Kaldi uses)? Maybe we can add an option to output log base 10 if it's needed in the future.
Contributor
Author
The log base in ARPA is 10, so I used it directly. I will fix it and add a log base option.
Contributor
Author
ready to merge?
danpovey
reviewed
Jul 16, 2018
Contributor
It would be better to evaluate math.log(10, args.log_base) outside the loop.
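As a rough illustration of this review comment (not the code from the PR, which is not shown here): the factor `math.log(10, log_base)` depends only on the target base, so it can be computed once and reused for every score. The function name `convert_log10_scores` and its `log_base` parameter are hypothetical; only `math.log` is a real library call.

```python
import math


def convert_log10_scores(log10_scores, log_base=math.e):
    """Convert base-10 log probabilities to log probabilities in log_base.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    # log_b(p) = log_10(p) * log_b(10). The factor is the same for every
    # score, so it is evaluated once, outside the loop.
    scale = math.log(10, log_base)
    return [score * scale for score in log10_scores]


converted = convert_log10_scores([-1.0, -2.5])
```

With the default `log_base=math.e`, a base-10 score of -1.0 becomes roughly -2.3026, the natural log of 0.1.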
dpriver pushed a commit to dpriver/kaldi that referenced this pull request on Sep 13, 2018:
…e probs given arpa (kaldi-asr#2538)
Skaiste pushed a commit to Skaiste/idlak that referenced this pull request on Sep 26, 2018:
…e probs given arpa (kaldi-asr#2538)
Adds compute_sentence_probs_arpa.py, which computes the log probability of each sentence in an input file under an ARPA language model and writes the log probs to an output file.
The script has been checked against SRILM on 10k sentences.
It also supports stdin and stdout.
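For readers unfamiliar with how a sentence is scored under a backoff n-gram model, here is a minimal sketch. It is not the script from this PR: the toy model `TOY_LM`, its made-up values, and the function names are all assumptions for illustration. Real ARPA files are parsed from text, and handling of out-of-vocabulary words may differ.

```python
import math

# Toy bigram model in ARPA-like form (hypothetical, values made up):
# each n-gram maps to (log10 probability, log10 backoff weight).
TOY_LM = {
    ("<s>",): (-99.0, -0.30),
    ("a",): (-0.70, -0.20),
    ("b",): (-0.52, 0.0),
    ("</s>",): (-0.40, 0.0),
    ("<s>", "a"): (-0.22, 0.0),
    ("a", "b"): (-0.15, 0.0),
}


def ngram_log10_prob(lm, ngram):
    """Return log10 P(last word | context) using standard backoff."""
    if ngram in lm:
        return lm[ngram][0]
    if len(ngram) == 1:
        # Assumption: out-of-vocabulary words are treated as an error here.
        raise KeyError("out-of-vocabulary word: {}".format(ngram[0]))
    context = ngram[:-1]
    # A context missing from the model has backoff weight 0 in log space.
    backoff = lm[context][1] if context in lm else 0.0
    return backoff + ngram_log10_prob(lm, ngram[1:])


def sentence_log_prob(lm, words, order=2, log_base=math.e):
    """Sum per-word log probs over <s> w1 ... wn </s>, then change base."""
    tokens = ["<s>"] + list(words) + ["</s>"]
    total = 0.0
    for i in range(1, len(tokens)):
        start = max(0, i - order + 1)
        total += ngram_log10_prob(lm, tuple(tokens[start:i + 1]))
    # The base conversion factor is applied once, after the loop.
    return total * math.log(10, log_base)
```

In this toy model, the sentence "a b" uses the listed bigrams for `a` and `b`, then backs off to the unigram for `</s>` because the bigram `b </s>` is absent.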