The mismatch between split and split-lazy, specifically counting from 0 or 1 #1152
Comments
Sorry for that. I think I initially wanted to be compatible with Kaldi's convention of 1-based splits for data directories, and later forgot about that when introducing the lazy variant. We should fix that. I would prefer to adopt 0-based counting everywhere, as it seems more consistent with the rest of the library and with Python in general. WDYT @yfyeung @csukuangfj @desh2608 @danpovey?
Thanks. I agree with adhering to a 0-based counting system. Best regards.
Submitting jobs on an SGE cluster requires indices starting at 1, which I suppose is why 1-based indexing is used in Kaldi. I would suggest adding a "start_index" option to those functions, which defaults to 0.
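To make the "start_index" suggestion concrete, here is a minimal sketch of how such an option could behave. The helper name `split_with_start_index` and its signature are hypothetical and are not part of lhotse; the only assumption taken from the thread is that the default is 0 and that SGE-style jobs would pass 1.

```python
from typing import Dict, List, Sequence, TypeVar

T = TypeVar("T")


def split_with_start_index(
    items: Sequence[T], num_splits: int, start_index: int = 0
) -> Dict[int, List[T]]:
    """Hypothetical helper (not lhotse API): split `items` into `num_splits`
    contiguous parts and label them start_index, start_index + 1, ..."""
    if num_splits < 1:
        raise ValueError("num_splits must be >= 1")
    # Spread the remainder over the first `extra` splits so sizes differ by at most 1.
    base, extra = divmod(len(items), num_splits)
    splits: Dict[int, List[T]] = {}
    begin = 0
    for i in range(num_splits):
        size = base + (1 if i < extra else 0)
        splits[start_index + i] = list(items[begin:begin + size])
        begin += size
    return splits


# Default: 0-based labels, consistent with Python.
zero_based = split_with_start_index(list(range(5)), num_splits=2)
# SGE/Kaldi-style: 1-based labels, same contents.
one_based = split_with_start_index(list(range(5)), num_splits=2, start_index=1)
```

With this shape, only the labels change between the two conventions; the contents of each split stay the same, so a recipe that switches conventions only has to adjust the indices it iterates over.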
The above is also why we added the
Thanks, should be fixed now.
I note that `lhotse split` generates splits that count from 1, while `lhotse split-lazy` generates splits that count from 0.
In some icefall recipes, such as gigaspeech, we initially used `lhotse split` and later switched to `lhotse split-lazy`. However, we didn't account for this mismatch, which resulted in some bugs.