[src] Incremental Lattice Determinization for Low-Latency WFST Decoder #3317
Closed
49 commits
240f0e4
Merge pull request #1 from kaldi-asr/master
chenzhehuai 60f2bcf
Merge pull request #2 from kaldi-asr/master
chenzhehuai c7eb4c5
Merge pull request #4 from kaldi-asr/master
chenzhehuai 25706e7
Merge pull request #5 from kaldi-asr/master
chenzhehuai ac49815
Merge pull request #6 from kaldi-asr/master
chenzhehuai 6d5e966
Merge pull request #8 from kaldi-asr/master
chenzhehuai 751d8bf
Merge pull request #10 from kaldi-asr/master
chenzhehuai d8ff7ee
make fst templates inline to eliminate linking errors in other places
chenzhehuai ada7ea7
Merge pull request #17 from kaldi-asr/master
chenzhehuai 6f366c1
Merge pull request #27 from kaldi-asr/master
chenzhehuai 1f50f06
WIP
be6fba2
worse wer & ower
6f92369
clean code
8080697
this commit is for sanity check
b302f12
code clean
7c0f7d7
each time we determinize the piece of lattice, instead of going all …
bb4e68f
bug fix:
86595f9
test in libri speech
080e5b4
clean; without class
d8907a4
add class LatticeIncrementalDeterminizer
228d8a2
add config_.determinize_max_active & redeterminize=false
c350305
update best config; add re-determinization from frame 0 if AppendLatt…
3844389
1. add time profiling for baseline lattice-faster-decoder for compari…
90f3ea7
code refine
9111173
update final weight by extra_cost-alpha, see sheet "ver 3"
6af8f62
WIP
40cf7ff
fix bugs and add sanity check
c5f0a8e
enable det
467abd8
clean code
612d398
[experimental] new det algorithm (#31)
chenzhehuai 92ce13c
adding redet frames
b4ed30c
add eps removal; 1oco
5651200
bug fix when --epsilon-removal=1 --redeterminize-max-frames=10
7401fe4
code refine
e5cef12
bug fix
ecae786
code refine
35a7abc
We need to be careful about the case where the start state of the
39d4181
code refine
8e1648d
Do the following modification. Results can be referred to sheet "ver …
9a0873e
make terms consistent with the paper
4448c1f
code refine according to Hainan's comments
0a4c9bb
add final-prune-after-determinize
b4416f5
more comments
a624b3e
bug fix
6438a3b
refine
15cdab7
add online decoder
9566370
refine
10a597a
refine
c7043c7
fix the bug to introduce one more eps arc sometime
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,183 @@ | ||
| // bin/latgen-incremental-mapped.cc | ||
|
|
||
| // Copyright 2019 Zhehuai Chen | ||
|
|
||
| // See ../../COPYING for clarification regarding multiple authors | ||
| // | ||
| // Licensed under the Apache License, Version 2.0 (the "License"); | ||
| // you may not use this file except in compliance with the License. | ||
| // You may obtain a copy of the License at | ||
| // | ||
| // http://www.apache.org/licenses/LICENSE-2.0 | ||
| // | ||
| // THIS CODE IS PROVIDED *AS IS* BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| // KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION ANY IMPLIED | ||
| // WARRANTIES OR CONDITIONS OF TITLE, FITNESS FOR A PARTICULAR PURPOSE, | ||
| // MERCHANTABLITY OR NON-INFRINGEMENT. | ||
| // See the Apache 2 License for the specific language governing permissions and | ||
| // limitations under the License. | ||
|
|
||
| #include "base/kaldi-common.h" | ||
| #include "util/common-utils.h" | ||
| #include "tree/context-dep.h" | ||
| #include "hmm/transition-model.h" | ||
| #include "fstext/fstext-lib.h" | ||
| #include "decoder/decoder-wrappers.h" | ||
| #include "decoder/decodable-matrix.h" | ||
| #include "base/timer.h" | ||
|
|
||
| int main(int argc, char *argv[]) { | ||
| try { | ||
| using namespace kaldi; | ||
| typedef kaldi::int32 int32; | ||
| using fst::SymbolTable; | ||
| using fst::Fst; | ||
| using fst::StdArc; | ||
|
|
||
| const char *usage = | ||
| "Generate lattices, reading log-likelihoods as matrices\n" | ||
| " (model is needed only for the integer mappings in its transition-model)\n" | ||
| "The lattice determinization algorithm here can operate\n" | ||
| "incrementally.\n" | ||
| "Usage: latgen-incremental-mapped [options] trans-model-in " | ||
| "(fst-in|fsts-rspecifier) loglikes-rspecifier" | ||
| " lattice-wspecifier [ words-wspecifier [alignments-wspecifier] ]\n"; | ||
| ParseOptions po(usage); | ||
| Timer timer; | ||
| bool allow_partial = false; | ||
| BaseFloat acoustic_scale = 0.1; | ||
| LatticeIncrementalDecoderConfig config; | ||
|
|
||
| std::string word_syms_filename; | ||
| config.Register(&po); | ||
| po.Register("acoustic-scale", &acoustic_scale, | ||
| "Scaling factor for acoustic likelihoods"); | ||
|
|
||
| po.Register("word-symbol-table", &word_syms_filename, | ||
| "Symbol table for words [for debug output]"); | ||
| po.Register("allow-partial", &allow_partial, | ||
| "If true, produce output even if end state was not reached."); | ||
|
|
||
| po.Read(argc, argv); | ||
|
|
||
| if (po.NumArgs() < 4 || po.NumArgs() > 6) { | ||
| po.PrintUsage(); | ||
| exit(1); | ||
| } | ||
|
|
||
| std::string model_in_filename = po.GetArg(1), fst_in_str = po.GetArg(2), | ||
| feature_rspecifier = po.GetArg(3), lattice_wspecifier = po.GetArg(4), | ||
| words_wspecifier = po.GetOptArg(5), | ||
| alignment_wspecifier = po.GetOptArg(6); | ||
|
|
||
| TransitionModel trans_model; | ||
| ReadKaldiObject(model_in_filename, &trans_model); | ||
|
|
||
| bool determinize = true; | ||
| CompactLatticeWriter compact_lattice_writer; | ||
| LatticeWriter lattice_writer; | ||
| if (!(determinize ? compact_lattice_writer.Open(lattice_wspecifier) | ||
| : lattice_writer.Open(lattice_wspecifier))) | ||
| KALDI_ERR << "Could not open table for writing lattices: " | ||
| << lattice_wspecifier; | ||
|
|
||
| Int32VectorWriter words_writer(words_wspecifier); | ||
|
|
||
| Int32VectorWriter alignment_writer(alignment_wspecifier); | ||
|
|
||
| fst::SymbolTable *word_syms = NULL; | ||
| if (word_syms_filename != "") | ||
| if (!(word_syms = fst::SymbolTable::ReadText(word_syms_filename))) | ||
| KALDI_ERR << "Could not read symbol table from file " << word_syms_filename; | ||
|
|
||
| double tot_like = 0.0; | ||
| kaldi::int64 frame_count = 0; | ||
| int num_success = 0, num_fail = 0; | ||
|
|
||
| if (ClassifyRspecifier(fst_in_str, NULL, NULL) == kNoRspecifier) { | ||
| SequentialBaseFloatMatrixReader loglike_reader(feature_rspecifier); | ||
| // Input FST is just one FST, not a table of FSTs. | ||
| Fst<StdArc> *decode_fst = fst::ReadFstKaldiGeneric(fst_in_str); | ||
| timer.Reset(); | ||
|
|
||
| { | ||
| LatticeIncrementalDecoder decoder(*decode_fst, trans_model, config); | ||
|
|
||
| for (; !loglike_reader.Done(); loglike_reader.Next()) { | ||
| std::string utt = loglike_reader.Key(); | ||
| Matrix<BaseFloat> loglikes(loglike_reader.Value()); | ||
| loglike_reader.FreeCurrent(); | ||
| if (loglikes.NumRows() == 0) { | ||
| KALDI_WARN << "Zero-length utterance: " << utt; | ||
| num_fail++; | ||
| continue; | ||
| } | ||
|
|
||
| DecodableMatrixScaledMapped decodable(trans_model, loglikes, | ||
| acoustic_scale); | ||
|
|
||
| double like; | ||
| if (DecodeUtteranceLatticeIncremental( | ||
| decoder, decodable, trans_model, word_syms, utt, acoustic_scale, | ||
| determinize, allow_partial, &alignment_writer, &words_writer, | ||
| &compact_lattice_writer, &lattice_writer, &like)) { | ||
| tot_like += like; | ||
| frame_count += loglikes.NumRows(); | ||
| num_success++; | ||
| } else { | ||
| num_fail++; | ||
| } | ||
| } | ||
| } | ||
| delete decode_fst; // delete this only after decoder goes out of scope. | ||
| } else { // We have different FSTs for different utterances. | ||
| SequentialTableReader<fst::VectorFstHolder> fst_reader(fst_in_str); | ||
| RandomAccessBaseFloatMatrixReader loglike_reader(feature_rspecifier); | ||
| for (; !fst_reader.Done(); fst_reader.Next()) { | ||
| std::string utt = fst_reader.Key(); | ||
| if (!loglike_reader.HasKey(utt)) { | ||
| KALDI_WARN << "Not decoding utterance " << utt | ||
| << " because no loglikes available."; | ||
| num_fail++; | ||
| continue; | ||
| } | ||
| const Matrix<BaseFloat> &loglikes = loglike_reader.Value(utt); | ||
| if (loglikes.NumRows() == 0) { | ||
| KALDI_WARN << "Zero-length utterance: " << utt; | ||
| num_fail++; | ||
| continue; | ||
| } | ||
| LatticeIncrementalDecoder decoder(fst_reader.Value(), trans_model, config); | ||
| DecodableMatrixScaledMapped decodable(trans_model, loglikes, acoustic_scale); | ||
| double like; | ||
| if (DecodeUtteranceLatticeIncremental( | ||
| decoder, decodable, trans_model, word_syms, utt, acoustic_scale, | ||
| determinize, allow_partial, &alignment_writer, &words_writer, | ||
| &compact_lattice_writer, &lattice_writer, &like)) { | ||
| tot_like += like; | ||
| frame_count += loglikes.NumRows(); | ||
| num_success++; | ||
| } else { | ||
| num_fail++; | ||
| } | ||
| } | ||
| } | ||
|
|
||
| double elapsed = timer.Elapsed(); | ||
| KALDI_LOG << "Time taken " << elapsed | ||
| << "s: real-time factor assuming 100 frames/sec is " | ||
| << (elapsed * 100.0 / frame_count); | ||
| KALDI_LOG << "Done " << num_success << " utterances, failed for " << num_fail; | ||
| KALDI_LOG << "Overall log-likelihood per frame is " << (tot_like / frame_count) | ||
| << " over " << frame_count << " frames."; | ||
|
|
||
| delete word_syms; | ||
| if (num_success != 0) | ||
| return 0; | ||
| else | ||
| return 1; | ||
| } catch (const std::exception &e) { | ||
| std::cerr << e.what(); | ||
| return -1; | ||
| } | ||
| } | ||
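
The summary logged at the end of `main` divides by `frame_count` twice: once for the real-time factor (`elapsed * 100.0 / frame_count`) and once for the per-frame log-likelihood (`tot_like / frame_count`). If every utterance fails, `frame_count` is 0 and both values come out as inf or NaN. A minimal sketch of the same arithmetic with an explicit zero-frame guard follows. The helper names `RealTimeFactor` and `LogLikePerFrame` are hypothetical and not part of Kaldi or of this PR; the default of 100 frames per second is taken from the log message above.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of Kaldi): real-time factor of a decoding run.
// The audio duration is frame_count / frames_per_sec seconds, so
// RTF = elapsed_sec / (frame_count / frames_per_sec).
// Returns NaN when no frames were decoded so the caller can skip the log line.
double RealTimeFactor(double elapsed_sec, std::int64_t frame_count,
                      double frames_per_sec = 100.0) {
  assert(frames_per_sec > 0.0);
  if (frame_count <= 0)
    return std::numeric_limits<double>::quiet_NaN();
  return elapsed_sec * frames_per_sec / static_cast<double>(frame_count);
}

// Hypothetical helper: average log-likelihood per frame, NaN if no frames.
double LogLikePerFrame(double tot_like, std::int64_t frame_count) {
  if (frame_count <= 0)
    return std::numeric_limits<double>::quiet_NaN();
  return tot_like / static_cast<double>(frame_count);
}
```

For example, `RealTimeFactor(5.0, 1000)` models 5 seconds of decoding time spent on 1000 frames (10 seconds of audio at 100 frames per second). A caller could check the result with `std::isnan` before logging.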