Merged
1 change: 1 addition & 0 deletions egs/wsj/s5/steps/get_ctm.sh
@@ -77,6 +77,7 @@ if [ $stage -le 0 ]; then
set -o pipefail '&&' mkdir -p $dir/score_LMWT/ '&&' \
lattice-1best --lm-scale=LMWT "ark:gunzip -c $lats|" ark:- \| \
lattice-align-words-lexicon $lang/phones/align_lexicon.int $model ark:- ark:- \| \
+ lattice-1best ark:- ark:- \| \
nbest-to-ctm --frame-shift=$frame_shift --print-silence=$print_silence ark:- - \| \
utils/int2sym.pl -f 5 $lang/words.txt \| \
$filter_cmd '>' $dir/score_LMWT/$name.ctm || exit 1;
1 change: 1 addition & 0 deletions egs/wsj/s5/steps/get_train_ctm.sh
@@ -76,6 +76,7 @@ if [ $stage -le 0 ]; then
"ark:utils/sym2int.pl --map-oov $oov -f 2- $lang/words.txt < $sdata/JOB/text |" \
'' '' ark:- \| \
lattice-align-words-lexicon $lang/phones/align_lexicon.int $model ark:- ark:- \| \
+ lattice-1best ark:- ark:- \| \
nbest-to-ctm --frame-shift=$frame_shift --print-silence=$print_silence ark:- - \| \
utils/int2sym.pl -f 5 $lang/words.txt \| \
gzip -c '>' $dir/ctm.JOB.gz || exit 1
10 changes: 9 additions & 1 deletion src/latbin/nbest-to-ctm.cc
@@ -32,7 +32,12 @@ int main(int argc, char *argv[]) {
"and must be in CompactLattice form where the transition-ids on the arcs\n"
"have been aligned with the word boundaries... typically the input will\n"
"be a lattice that has been piped through lattice-1best and then\n"
"lattice-align-words. It outputs ctm format (with integers in place of words),\n"
"lattice-align-words. On the other hand, whenever we directly pipe\n"
"the output of lattice-align-words-lexicon into nbest-to-ctm,\n"
"we need to put the command `lattice-1best ark:- ark:-` between them,\n"
"because even for linear lattices, lattice-align-words-lexicon can\n"
"in certain cases produce non-linear outputs (due to disambiguity\n"
"in the lexicon). It outputs ctm format (with integers in place of words),\n"
"assuming the frame length is 0.01 seconds by default (change this with the\n"
"--frame-length option). Note: the output is in the form\n"
"<utterance-id> 1 <begin-time> <end-time> <word-id>\n"
@@ -42,6 +47,9 @@ int main(int argc, char *argv[]) {
"Usage: nbest-to-ctm [options] <aligned-linear-lattice-rspecifier> <ctm-wxfilename>\n"
"e.g.: lattice-1best --acoustic-weight=0.08333 ark:1.lats | \\\n"
" lattice-align-words data/lang/phones/word_boundary.int exp/dir/final.mdl ark:- ark:- | \\\n"
" nbest-to-ctm ark:- 1.ctm\n"
"e.g.: lattice-align-words-lexicon data/lang/phones/align_lexicon.int exp/dir/final.mdl ark:1.lats ark:- | \\\n"
" lattice-1best ark:- ark:- | \\\n"
" nbest-to-ctm ark:- 1.ctm\n";

ParseOptions po(usage);