
@aevernon
Contributor

The latest version of sw-ms98-dict.text from http://www.openslr.org/resources/5/switchboard_word_alignments.tar.gz contains the following header on line 1:

file: $SWB/data/dictionary/sw-ms98-dict.text

This ends up in lexicon0.txt, which causes utils/validate_dict_dir.pl to
fail with the following error:

--> ERROR: phone "$swb/data/dictionary/sw-ms98-dict.text" is not in {, non}silence.txt (line 10399)

This in turn causes utils/prepare_lang.sh to fail.

Update dict.patch so that it removes the file header from sw-ms98-dict.text.
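
For illustration only, the effect of the fix can be sketched in Python. This is not the contents of dict.patch, which is applied to the dictionary file itself; the function name `strip_file_header` and the sample lexicon entry are hypothetical, and the sketch assumes the header is always the first line and starts with `file:`.

```python
def strip_file_header(lines):
    """Return the lines without a leading 'file: ...' header, if one is present.

    Hypothetical helper: assumes the header can only appear on line 1.
    """
    lines = list(lines)
    if lines and lines[0].startswith("file:"):
        return lines[1:]
    return lines


# Made-up sample input: the header from line 1 followed by one lexicon entry.
sample = [
    "file: $SWB/data/dictionary/sw-ms98-dict.text\n",
    "hello hh ax l ow\n",
]
cleaned = strip_file_header(sample)
```

With the header gone, no line in the lexicon presents the path as a phone, which is what validate_dict_dir.pl was rejecting.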

@danpovey
Contributor

That line has always been there; I think the regression was caused by some fixes to awk scripts that we made for mawk compatibility. But your fix works, so merging.

@danpovey danpovey merged commit e51a56a into kaldi-asr:master Oct 31, 2016