add compute_sentence_probs_arpa.py#2538

Merged
danpovey merged 5 commits into kaldi-asr:master from DongjiGao:sentence_prob
Jul 17, 2018
Conversation

@DongjiGao
Contributor

This adds compute_sentence_probs_arpa.py, which computes the log probability of each sentence in an input file given an ARPA language model and writes the log probabilities to an output file.
The script has been checked against SRILM on 10k sentences.
It also supports reading from stdin and writing to stdout.
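
For readers unfamiliar with how a sentence is scored against a backoff n-gram model, here is a minimal sketch. It is not the code from this PR: the toy bigram model, its numbers, and the names `LOGPROBS`, `BACKOFFS`, `ngram_logprob` and `sentence_logprob` are all hypothetical, and mapping out-of-vocabulary words to `<unk>` is an assumption rather than a statement about what the script or SRILM does.

```python
# Toy bigram model using ARPA conventions: log10 probabilities plus
# log10 backoff weights. All values are made up for illustration.
LOGPROBS = {
    ("<s>",): -99.0,
    ("</s>",): -1.0,
    ("<unk>",): -2.0,
    ("a",): -0.5,
    ("b",): -0.7,
    ("<s>", "a"): -0.3,
    ("a", "b"): -0.4,
}
BACKOFFS = {("<s>",): -0.2, ("a",): -0.1, ("b",): -0.3}
ORDER = 2  # bigram model


def ngram_logprob(history, word):
    """Return log10 P(word | history), backing off to shorter histories."""
    ngram = history + (word,)
    if ngram in LOGPROBS:
        return LOGPROBS[ngram]
    if not history:
        # Assumption: out-of-vocabulary words are scored as <unk>.
        return LOGPROBS[("<unk>",)]
    # A missing backoff weight counts as log10(1) = 0.
    return BACKOFFS.get(history, 0.0) + ngram_logprob(history[1:], word)


def sentence_logprob(sentence):
    """Sum log10 probabilities of each word and the end-of-sentence token."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for i in range(1, len(words)):
        history = tuple(words[max(0, i - ORDER + 1):i])
        total += ngram_logprob(history, words[i])
    return total
```

With this toy model, `sentence_logprob("a b")` uses the bigrams `<s> a` and `a b` directly and backs off to the unigram for `</s>`.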

…bability of input file given arpa language model
@xiaohui-zhang
Contributor

looks good and I've tested it.

@danpovey
Contributor

danpovey commented Jul 7, 2018 via email

@DongjiGao
Contributor Author

DongjiGao commented Jul 7, 2018

Yes. The only difference between Python 2 and Python 3 is the printed precision,
e.g. -36.599961699999994 in Python 3 and -36.5999617 in Python 2.
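
The two numbers above are consistent with the same float being converted to text differently: Python 2's `str()` keeps 12 significant digits, while Python 3 prints the shortest string that round-trips. A minimal sketch of how an explicit format gives the same text under both versions follows. The variable names are hypothetical and this is not taken from the script.

```python
x = -36.599961699999994

# Mimics Python 2's str(float), which formats with 12 significant digits.
py2_style = format(x, ".12g")

# An explicit fixed precision prints identically on Python 2 and Python 3.
fixed = "{:.7f}".format(x)

print(py2_style)
print(fixed)
```

Choosing an explicit format is a design option only if identical output across interpreters matters. The underlying values are the same either way.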

@xiaohui-zhang
Contributor

xiaohui-zhang commented Jul 11, 2018

ready for merging?

Contributor

This should be 'log probability'.
And wouldn't it be more convenient for it to default to outputting log base e (like Kaldi uses)? Maybe we can add an option to output log base 10 if it's needed in the future.

Contributor Author

The log base in the ARPA format is 10, so I used it directly. I will fix it and add a log base option.
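
For reference, converting a base-10 log probability to another base b is a single multiplication by a constant. The identity below is standard change of base. The symbols p and b are notation introduced here, not names from the script.

```latex
\log_b p \;=\; \frac{\log_{10} p}{\log_{10} b} \;=\; \log_{10} p \cdot \log_b 10,
\qquad \text{so for } b = e: \quad \ln p \;=\; \log_{10} p \cdot \ln 10 \;\approx\; 2.302585 \cdot \log_{10} p .
```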

@DongjiGao
Contributor Author

ready to merge?

Contributor

it would be better to evaluate math.log(10, args.log_base) outside the loop.
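
A minimal sketch of what this suggestion amounts to, assuming the per-sentence scores are already available as base-10 logs. The function name `write_sentence_logprobs` and its parameters are hypothetical. Only `math.log(10, log_base)` mirrors the expression quoted in the comment.

```python
import io
import math


def write_sentence_logprobs(log10_probs, out, log_base=math.e):
    """Write one converted log probability per line to `out`."""
    # Change-of-base factor: computed once here, not once per sentence.
    scale = math.log(10, log_base)
    for log10_prob in log10_probs:
        out.write("{}\n".format(log10_prob * scale))


# Usage: with log_base=10 the factor is 1.0, so scores pass through unchanged.
buf = io.StringIO()
write_sentence_logprobs([-1.0, -2.0], buf, log_base=10)
print(buf.getvalue(), end="")
```

The factor depends only on the chosen base, so hoisting it does not change the results. It just avoids recomputing a constant for every sentence.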

@danpovey danpovey merged commit 79883f3 into kaldi-asr:master Jul 17, 2018
dpriver pushed a commit to dpriver/kaldi that referenced this pull request Sep 13, 2018
Skaiste pushed a commit to Skaiste/idlak that referenced this pull request Sep 26, 2018