-
Notifications
You must be signed in to change notification settings - Fork 5.4k
new tool fsts-concat, similar to fsts-union... #2562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I don't think I want to merge this with its current design.
in my opinion, which would require different code. |
|
Well, I was thinking of what you just proposed, but then I decided to use the original API that was used for 'fsts-union' (I understand the difference between union and concatenation, union is 'order invariant', concatenation isn't)... And then I realized it might be less flexible than the "current" solution in PR. Imagine a situation in which one would like to concatenate different number of FSTs for individual keys... This is easily doable with the scps, but not with the arks. The "current" version in this PR is easy to use by preparing 'scp' file from any script language. Typically, some fsts will "change" on per-key basis and some will be "fixed". To check this in, is it absolutely necessary to have some calling script? |
|
I think the version as you have it will create a lot of scripting headaches
in the most likely use-cases, if we commit it as-is, that's why I don't
plan to merge it with that command-line interface.
It's not 100% necessary to have a calling script.
…On Mon, Jul 23, 2018 at 3:02 AM, Karel Vesely ***@***.***> wrote:
Well, I was thinking of what you just proposed, but then I decided to use
the original API that was used for 'fsts-union' (I understand the
difference between union and concatenation)...
And then I realized it might be less flexible than the "current" solution.
Imagine a situation in which one would like to concatenate different number
of FSTs for individual keys... This is easily doable with the scps, but not
with the arks...
The "current" version is easy to use by preparing the 'scp' from any
script language. Typically, some fsts will be "change" on per-key basis and
some will be "fixed".
To check this in, is it absolutely necessary to have some calling script?
(I am developing this for a company, and I am not sure how this
intellectual property things work...)
K.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#2562 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ADJVuw-fFbFvWlCiQsiRONvE8F63lJPdks5uJZ9MgaJpZM4VZi-x>
.
|
|
Okay, in this case, I will change the interface, and I'll send an update when it's done... Thank you, Karel. |
2b88902 to
d353043
Compare
|
The API was changed as Dan suggested. (and the commit is rebased on top of the 'main master'...) |
danpovey
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks. some very cosmetic comments.
src/fstbin/fsts-concat.cc
Outdated
| bool skip_key = false; | ||
| for (int32 i=0; i<fst_readers.size(); i++) { | ||
| if (!fst_readers[i]->HasKey(key)) { | ||
| KALDI_WARN << "Skippng '" << key << "'" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo skipping
src/fstbin/fsts-concat.cc
Outdated
| for (int32 i=0; i<fst_readers.size(); i++) { | ||
| if (!fst_readers[i]->HasKey(key)) { | ||
| KALDI_WARN << "Skippng '" << key << "'" | ||
| << " due to missing the fst in " << i+2 << ". <rspecifier> : " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add parens around i+2, just to avoid doubt about precedence; and maybe replace . with 'th.
src/fstbin/fsts-concat.cc
Outdated
|
|
||
| // check that the key exists in all 'fst_readers', | ||
| bool skip_key = false; | ||
| for (int32 i=0; i<fst_readers.size(); i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some compilers warn about signed/unsigned comparisons... I think it would be helpful to have a variable
int32 fst_readers_size = (int32)fst_readers.size();
src/fstbin/fsts-concat.cc
Outdated
| n_done++; | ||
| } | ||
|
|
||
| // cleanup, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please start comments with capitals and end with period, not comma. It's google style guide... sorry to be picky.
src/fstbin/fsts-concat.cc
Outdated
| std::vector<RandomAccessTableReader<VectorFstHolder>*> fst_readers; | ||
| TableWriter<VectorFstHolder> fst_writer(fsts_wspecifier); | ||
|
|
||
| for (int32 i=2; i<po.NumArgs(); i++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add space around operators = and < (also later on in this file).
d353043 to
63f0e57
Compare
- concatenates all vector-fsts with the same 'key' from the sorted 'rspecifier' in the order as they appear in I/O, - handy for concatenating HCLG graphs: 'phn-loop . some_word . phn-loop',
63f0e57 to
b11d6a8
Compare
|
okay, everything is incorporated... (and rebased...) |
'rspecifier' in the order as they appear in I/O,