-
Notifications
You must be signed in to change notification settings - Fork 5.4k
Modify TransitionModel for more compact chain-model graphs #1105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@naxingyu, any progress on this? Any intermediate work to push? |
|
Yes, I'll commit something today. X. On 2016/10/17 3:38, Daniel Povey wrote:
|
|
@danpovey It's still a heavy draft, lots of todos, tuple-related modification, testing, documentation, etc., but the basic idea is structured. The 2nd version of GetPdfInfo is implemented using brute force enumeration. Please check if I get the intension correct or not. |
danpovey
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice from the travis build that hmm-utils.cc failed.
src/hmm/hmm-topology.cc
Outdated
| entries_[i].resize(thist_sz); | ||
| for (int32 j = 0 ; j < thist_sz; j++) { | ||
| ReadBasicType(is, binary, &(entries_[i][j].pdf_class)); | ||
| ReadBasicType(is, binary, &(entries_[i][j].self_loop_pdf_class)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remember the reading code needs to be back compatible to previously written models.
src/hmm/hmm-topology.cc
Outdated
| KALDI_ERR << "HmmTopology::Check(), duplicate transition found."; | ||
| if (dst_state == k) { // self_loop... | ||
| KALDI_ASSERT(entries_[i][j].pdf_class != kNoPdf && "Nonemitting states cannot have self-loops."); | ||
| KALDI_ASSERT(entries_[i][j].self_loop_pdf_class != kNoPdf && "Nonemitting states cannot have self-loops."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line length.
| /// but may be different to enable us to hardwire sharing of state, and may be | ||
| /// equal to \ref kNoPdf == -1 in order to specify nonemitting states (unusual). | ||
| int32 pdf_class; | ||
| int32 self_loop_pdf_class; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should probably have a comment for this.
src/hmm/hmm-topology.h
Outdated
|
|
||
| explicit HmmState(int32 p): pdf_class(p) { } | ||
| explicit HmmState(int32 p): pdf_class(p),self_loop_pdf_class(p) { } | ||
| explicit HmmState(int32 p, int32 self_p) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better to rename these to pdf_class and self_loop_pdf_class [self-constructor still works even though it's the same as the variable names.]
src/hmm/hmm-topology.h
Outdated
| } | ||
|
|
||
| HmmState(): pdf_class(-1) { } | ||
| HmmState(): pdf_class(-1),self_loop_pdf_class(-1) { } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space after comma, and put 2 spaces before HmmState().
src/hmm/transition-model.cc
Outdated
| to_hmm_state_list[std::make_pair(phone, pdf_class)].push_back(j); | ||
|
|
||
| if (SelfLoopEqualsForward()) { | ||
| // this branch deals with when self_loop_pdf_class is always the same as the pdf_class |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you could comment that this is the case for normal models, but not for chain models.
src/hmm/transition-model.cc
Outdated
| for (int32 j = 0; j < static_cast<int32>(pdf_info[i].size()); j++) { | ||
| int32 pdf_class = pdf_class_pairs[i][j].first, | ||
| self_loop_pdf_class = pdf_class_pairs[i][j].second; | ||
| const std::vector<int32> &state_vec = to_hmm_state_list[phone][std::make_pair(pdf_class, self_loop_pdf_class)]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line length
src/hmm/transition-model.cc
Outdated
| } | ||
| } else { | ||
| std::vector<std::vector<std::vector<std::pair<int32, int32> > > > pdf_info; | ||
| std::vector<std::vector<std::pair<int32, int32> > > pdf_class_pairs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need a comment saying what pdf_info represents.-- and of course you have to figure out pdf_class_pairs, and also explain what it is.
2511125 to
aaeb2e4
Compare
|
Thanks! |
aaeb2e4 to
4a3a250
Compare
|
While validating the gmm models, I found that this line only works in Perl 5.12 and later. Modifying the while loop to will work for most version of Perl. |
|
@danpovey comments are resolved. On 2016/10/19 2:31, Daniel Povey wrote:
|
|
Regarding readdir, do you have time to make a separate PR to fix that? Or is the fix in this PR? |
|
I'm running the rm recipe. Modified version created exactly same model, X. 在 2016/10/21 10:29, Daniel Povey 写道:
|
|
As long as the WER is in the same ballpark it's OK. None of this is On Tue, Oct 25, 2016 at 2:54 PM, Xingyu Na notifications@github.com wrote:
|
|
I'll make a separate PR for readdir. |
|
Probably it didn't work on travis because you left 'inline' in the declaration. |
703cb90 to
bbfef0a
Compare
|
On 2016/10/26 10:47, Daniel Povey wrote:
|
|
I'm stuck in building chain tree for a few days. I started with such a topo. The mono.{mdl,tree} are correctly initialized with tuple-type transitions. And the converted alignments seems good. However, the output tree does not contain |
|
The place where it's going wrong might be in acc-tree-stats. The problem might be in the function TransitionIdToPdfClass(). Dan On Tue, Nov 1, 2016 at 5:00 AM, Xingyu Na notifications@github.com wrote:
|
|
Nice. A condition of IsSelfLoop() before getting pdf_class works. :) X. On 2016/11/2 7:15, Daniel Povey wrote:
|
|
Now that I get pdf_ids and self_loop_pdf_ids separately for the central X. On 2016/11/2 7:15, Daniel Povey wrote:
|
|
Suppose, for phone aa_B, I have 14 forward pdfs and 12 self-loop pdfs. In the old format, there will be about 26 transition states. In the new format, I need to find possible combinations in the contexts of this phone from the 168 candidates. A naive progressive algorithm is Do you think that this could shrink the number of pdf pairs from 168 to nearly half of the old format, ie 13? I'm trying though... |
|
The example above ended with 53 transition states, compared with 26 transition states using the old topology. And that will double the size of the HMM fst. Is there something wrong with the algorithm? |
|
A larger number of transition-states is expected. It may not impact the Regarding your algorithm-- I have a couple of issues with it: it treats the I would use a recursive algorithm: something like the following // 'context' is the context-window of phones, of On Wed, Nov 2, 2016 at 5:14 AM, Xingyu Na notifications@github.com wrote:
|
danpovey
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments on this work in progress commit.
| leftmost_questions_truncate=10 | ||
| tree_stats_opts= | ||
| cluster_phones_opts= | ||
| cluster_phones_opts="--pdf-class-list=0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was it necessary for some reason to change this?
This will have the effect that for finding the phone classes, it just uses the 1st transition of the phone, rather than the self-loop. it's not clear to me that this would be better.
Actually, I can see why this would have been necessary when you had that bug and the tree was not being built right, but this is probably not needed now.
| cp $alidir/cmvn_opts $dir 2>/dev/null # cmn/cmvn option. | ||
| cp $alidir/delta_opts $dir 2>/dev/null # delta option. | ||
|
|
||
| utils/lang/check_phones_compatible.sh $lang/phones.txt $alidir/phones.txt || exit 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason you deleted these lines? Perhaps something went wrong in merge?
| print("<State> 1 <PdfClass> 1 <Transition> 1 0.5 <Transition> 2 0.5 </State>") | ||
| print("<State> 2 </State>") | ||
| print("<State> 0 <PdfClass> 0 <SelfLoopPdfClass> 1 <Transition> 0 0.5 <Transition> 1 0.5 </State>") | ||
| #print("<State> 1 <PdfClass> 1 <Transition> 1 0.5 <Transition> 2 0.5 </State>") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can remove this commented-out line. But you might want to put a comment at the top saying that this script was modified around such-and-such a date, when the code was extended to support having a different pdf-class on the self loop.
| } else { | ||
| WriteIntegerVector(os, binary, phones_); | ||
| WriteIntegerVector(os, binary, phone2idx_); | ||
| if (!is_hmm) WriteBasicType(os, binary, static_cast<int32>(-1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment that this -1 is a signal that the object has the new, extended format with SelfLoopPdfClass.
src/hmm/transition-model.cc
Outdated
| if (pdf_class != kNoPdf) { | ||
| to_hmm_state_list[std::make_pair(phone, pdf_class)].push_back(j); | ||
|
|
||
| if (IsHmm()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is too long.. you should probably break it out into a couple of different functions at least. If you don't need to export something to the header you can declare the function static.
| @@ -178,6 +178,50 @@ void ContextDependency::Read (std::istream &is, bool binary) { | |||
| to_pdf_ = to_pdf; | |||
| } | |||
|
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put a newline after ( and move things to the left so that you don't go so much above 80 characters.
src/tree/context-dep.cc
Outdated
| std::vector<EventAnswerType> self_loop_pdfs; // self-loop pdfs that can be at this pos as this phone. | ||
| to_pdf_->MultiMap(vec, &self_loop_pdfs); | ||
| SortAndUniq(&self_loop_pdfs); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not use the term 'brute force' for this.. what I meant by brute force was, finding the exact list of pairs by enumerating all triphones (or n-phones, in general). Let's call this 'overgeneration' of pairs.
|
@naxingyu, any progress with this? I'm looking forward to finding out how much smaller the graph is (and hopefully there will be a reduction in decoding time as well, and maybe even small WER improvements due to better pruning). |
|
I'm testing the recursive algorithm and found that there might be a problem with it. The algorithm might not find all possible pdf pairs, e.g. the pairs that only appear at phone sequence borders. This will cause a convert-ali failure. For instance, the alignment of this utterance in the RM training set does not end with sil. The final phone window is (76, 159, 0). The pdf pair is (216, 373). It only appear in this biphone context. However, the recursive algorithm only generate the pairs from all triphones, (76, 159, 1) to (96, 159, 194). |
…ges can be reverted after TransitionIdToPdf is reworked and made inline.
2baaf04 to
7788327
Compare
|
@danpovey rebase works. ready now. |
| // length N, with -1 for those positions where phones | ||
| // that are currently unknown, treated as wildcards; at least | ||
| // the central phone [position P] must be a real phone, i.e. | ||
| // not -1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused now about how you solved the issue about word-boundaries.
The comment says that -1 is used for positions where we have not yet expanded the phone.
However, the actual code seems to be using 0 for the "not-expanded"/wildcard phone.
But for the left and right context when we hit the edge of the file, the phone should be set to zero. Not including zero as an option for positions other than position P_ would cause a problem. But I don't see how this can happen, given that you are (in the code) using 0 for the wildcard.
src/tree/context-dep.cc
Outdated
| int32 pdf_class = pdf_class_pairs[phone][j].first, | ||
| self_loop_pdf_class = pdf_class_pairs[phone][j].second; | ||
| for (size_t win_start = 0; win_start < phone_window.size(); win_start++) { | ||
| if (win_start != P_) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrong indentation... and I think you should use -1 for the wildcard positions here, so you can use 0 for the context. And then you should probably modify EnumeratePairs() so that when it loops over the phones, it also includes 0 [meaning: hitting the edge of the file] in the context. I'm surprised this hasn't caused a crash... I thought you had been hit by this issue before and fixed it, so now I wonder what the fix was.
|
I forgot to update this comment... Yes I chose to use 0 for wildcards so X. On 2016/11/21 12:26, Daniel Povey wrote:
|
|
No, the code is not right... it will lead to a crash for some trees. You On Sun, Nov 20, 2016 at 11:35 PM, Xingyu Na notifications@github.com
|
src/tree/context-dep.cc
Outdated
| std::vector<EventAnswerType> forward_pdfs, self_loop_pdfs; | ||
|
|
||
| int32 forward_pdf, self_loop_pdf; | ||
| if (Compute(phone_window, forward_pdf_class, &forward_pdf) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see what you meant, with this code here, that's what prevented it from crashing, and it could be right.
But still, I think it would be better if you did it as I said-- that would be clearer, and may prevent unnecessary recursion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... and no need to re-test fully. It should lead to an identical transition-model, and you can do a 'diff' after rerunning the stage when you get the transition-model from the tree and the topology.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I remember why it was written in this way. When 0 in included in the event vecter, MultiMap doesn't work correct. I try again and it is the same problem.
|
Got it. On 2016/11/21 12:52, Daniel Povey wrote:
|
|
Also, after that, as one final check before I merge, can you please run the RM setup but removing the left-biphone-related options? I want to make sure it doesn't crash for triphone. No need to check anything in related to that-- just run it and make sure there are no crashes. |
|
What happens specifically? On Mon, Nov 21, 2016 at 12:42 AM, Xingyu Na notifications@github.com
|
|
... BTW, you shouldn't include 0 for the P'th position, only for the other On Mon, Nov 21, 2016 at 12:49 AM, Daniel Povey dpovey@gmail.com wrote:
|
|
No I didn't include 0 for the P'th position. When I use a event vector On 2016/11/21 13:51, Daniel Povey wrote:
|
|
The RM setup is triphone. I've tested left-biphone system on swbd, as X. On 2016/11/21 13:27, Daniel Povey wrote:
|
365fea3 to
620e516
Compare
|
Merging. Thanks!! |
Place holder for addressing #1031 . WIP log: