@dresen dresen commented Dec 2, 2016

This PR makes the modifications to sprakbanken that were requested for sprakbanken_swe, makes the Python scripts work with Python 2.7.x, simplifies the recipe, and gives better results. Because I have changed the data preprocessing, a new lexicon needs to be uploaded to openslr, but I cannot attach it to the PR.

…prakbanken_swe and removed deprecated commands from run.sh
…h python 2 and 3 on the request of @jtrmal (I think they are slower now because we use more regexes). Changed the preprocessing so case is not normalised and altered default behaviour to delete sentence-final '.' rather than convert to a token because it is more often the case that they are not spoken aloud.
…ased systems. Changed the scoring scripts in local/ to be similar to WSJ to get better analyses and changed the local/wer* scripts to fit this recipe.
… but particular Danish characters. Corrected error in previous commit that changes openfst version tools/Makefile
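
The preprocessing change described in the commit messages above (case is left as-is, and a sentence-final '.' is deleted rather than turned into a token) can be sketched roughly as follows. This is not the recipe's code, which uses Python scripts; the function name `strip_final_period` and the sed expression are illustrative assumptions only.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from the recipe.
# Deletes a single sentence-final '.' (and any whitespace around it) from
# each input line. Case is not changed, and periods inside the line
# (e.g. in "12.30") are left alone.
strip_final_period() {
  sed -E 's/[[:space:]]*\.[[:space:]]*$//'
}

# Example input: made-up transcript lines, one utterance per line.
printf '%s\n' 'Hej med dig .' 'Klokken er 12.30' 'Det er godt.' | strip_final_period
```

Only the last period on a line is removed, so a line ending in "..." would keep two of its dots; whether that matters depends on the transcripts.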

danpovey commented Dec 2, 2016 via email


dresen commented Dec 3, 2016

The words in the new lexicon are not case normalised. Otherwise, the old and new versions are the same. I had thought to just replace the old lexicon with the new one, but if you would like to keep the old version, I can rename the new one to e.g. lexicon-da-nonorm.tgz


danpovey commented Dec 3, 2016 via email


jtrmal commented Dec 5, 2016 via email


danpovey commented Dec 5, 2016 via email


dresen commented Dec 6, 2016 via email

dictdir=data/local/dict
espeakdir='espeak-1.48.04-source'
mkdir -p $dir
mkdir -p $dictsrc $dictd ir
Contributor

There seems to be a space in the middle of a word (`$dictd ir`).
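
To illustrate why the stray space matters: the shell reads `$dictd ir` as two words, a variable `$dictd` (presumably unset, so it expands to nothing) and a literal `ir`. `mkdir -p` then creates a directory called `ir` and never creates `$dictdir`. A minimal sketch in a scratch directory, where the values assigned to `dictsrc` and `dictdir` are placeholders rather than the recipe's real paths:

```shell
#!/usr/bin/env bash
# Sketch only: reproduces the typo in a temporary directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

dictsrc=dict_src   # placeholder value, not the recipe's path
dictdir=dict_out   # placeholder value, not the recipe's path

# As written in the diff: "$dictd" is unset and expands to nothing,
# so mkdir receives the two arguments "dict_src" and "ir".
mkdir -p $dictsrc $dictd ir
typo_result=$(ls | sort | tr '\n' ' ')

rm -rf ./*

# Intended form, with the variable name intact (quoted for safety).
mkdir -p "$dictsrc" "$dictdir"
fixed_result=$(ls | sort | tr '\n' ' ')

echo "with typo: $typo_result"
echo "fixed:     $fixed_result"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

Running the script with `set -u` would turn the typo into an immediate "unbound variable" error instead of a silently wrong directory.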

@danpovey
Contributor

There is a conflict; can you please merge and resolve it?

@danpovey danpovey merged commit f6b82ad into kaldi-asr:master Dec 15, 2016
@dresen dresen deleted the swedish-changes branch December 15, 2016 07:57
dresen added a commit to dresen/kaldi that referenced this pull request Dec 15, 2016