
Add util to compute recordings durations before data perturbation #2414

Merged: danpovey merged 2 commits into kaldi-asr:master from Ore-an:reco2dur on May 14, 2018



Ore-an commented May 11, 2018

As per #2388. I'd like some help testing this more extensively, especially in the case where recording-ids are the same as utt-ids, since a bug there could have a big impact.

fi

if [ -s $data/utt2dur ] && \
[ $(cat $data/utt2spk | wc -l) -eq $(cat $data/utt2dur | wc -l) ] && \

You could use wc -l <$data/utt2spk to avoid the 'cat'.
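A minimal sketch of this suggestion, assuming bash. The helper names count_lines and same_line_count are hypothetical and are not part of the PR. Redirecting the file into wc -l yields the same count as piping it through cat, with one fewer process:

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; they do not appear in the PR.

# Count the lines of a file by redirecting it into wc, so no 'cat' is spawned.
# Reading from stdin also keeps the file name out of wc's output.
count_lines() {
  wc -l < "$1"
}

# Succeed only if the two files have the same number of lines.
same_line_count() {
  [ "$(count_lines "$1")" -eq "$(count_lines "$2")" ]
}
```

Under these assumptions, the line-count comparison in the diff could be written as same_line_count $data/utt2spk $data/utt2dur, though the inline form with wc -l <file works just as well.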

danpovey commented

Thanks! I commented on the original thread; I hope the guy who reported the problem can help test it.

danpovey commented

Thanks. Have you at least tested that this runs? I may merge without waiting for that guy to test.


Ore-an commented May 14, 2018

Yes, I've tested it on a dataset with a segments file and everything works as it should: durations match those reported by soxi, validate_data_dir reports any lines missing from the file, and fix_data_dir filters out whatever is not in wav.scp.
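For readers who want to reproduce the two checks by hand, here is a rough sketch, assuming bash and awk and the usual "id value" line layout of wav.scp and reco2dur. The function names are hypothetical, and this is not the code used by validate_data_dir or fix_data_dir; it only mimics the behaviour described above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; NOT the logic of Kaldi's
# validate_data_dir or fix_data_dir scripts.

# Print recording-ids listed in wav.scp that have no entry in reco2dur.
missing_in_reco2dur() {
  local wav_scp=$1 reco2dur=$2
  # First pass (reco2dur): remember every id. Second pass (wav.scp): print unseen ids.
  awk 'FILENAME == ARGV[1] { seen[$1] = 1; next }
       !($1 in seen) { print $1 }' "$reco2dur" "$wav_scp"
}

# Print only the reco2dur lines whose recording-id appears in wav.scp.
filter_reco2dur() {
  local wav_scp=$1 reco2dur=$2
  # First pass (wav.scp): remember every id. Second pass (reco2dur): keep matching lines.
  awk 'FILENAME == ARGV[1] { keep[$1] = 1; next }
       $1 in keep' "$wav_scp" "$reco2dur"
}
```

Calling missing_in_reco2dur $data/wav.scp $data/reco2dur should print nothing when every recording has a duration, which is a quick sanity check before relying on the file.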

danpovey merged commit ff0da26 into kaldi-asr:master on May 14, 2018

danpovey commented

Thanks for doing this! Merged.

danpovey added a commit to danpovey/kaldi that referenced this pull request May 21, 2018
dpriver pushed a commit to dpriver/kaldi that referenced this pull request Sep 13, 2018
Skaiste pushed a commit to Skaiste/idlak that referenced this pull request Sep 26, 2018