Update to work with newer models #1

Open
jvkim opened this issue Dec 28, 2022 · 1 comment

Comments

@jvkim

jvkim commented Dec 28, 2022

The Kaldi site has two newer models available, the LibriSpeech and GigaSpeech models, at https://www.kaldi-asr.org/models.html. I've been looking through the code at https://github.com/grwgreg/silviux/blob/main/server/silviux-server/lm-script/silviux.sh, because that is where the model files are downloaded from the Kaldi site, but changing the URL to point at the new models fails because files are missing or named differently. Are these new models at all compatible with this project? Thanks!

@grwgreg
Owner

grwgreg commented Dec 30, 2022

There are two parts to the question of whether the new models are compatible with silviux. The first is whether a built model will work with the GStreamer server, and the second is whether the language model tools and scripts can be used to build new models. Regarding the first part, if the YAML config has the needed properties and the version of Kaldi is up to date, I think it should work. The server is a fork of https://github.com/alumae/kaldi-gstreamer-server, so I'd try to get the model working there first.
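One quick way to test the first part is to confirm that every file the decoder config points at actually exists in the newly downloaded model, since missing or renamed files are the failure reported above. The sketch below is only an illustration: the key names (`model`, `word-syms`, `fst`, `mfcc-config`, `ivector-extraction-config`) are assumed from the sample nnet2 config in kaldi-gstreamer-server and may not match the config silviux uses, and `missing_decoder_files` is a hypothetical helper, not part of either project. It takes the `decoder` section as a plain dict, so loading the YAML is left to whatever parser is at hand.

```python
from pathlib import Path

# Assumed key names, based on the sample nnet2 config shipped with
# kaldi-gstreamer-server; adjust to match the config actually in use.
REQUIRED_DECODER_KEYS = (
    "model",
    "word-syms",
    "fst",
    "mfcc-config",
    "ivector-extraction-config",
)


def missing_decoder_files(decoder_cfg, root="."):
    """Return a list of problems found in the decoder section of a config.

    decoder_cfg: dict holding the `decoder` section of the YAML config.
    root: directory that relative paths in the config are resolved against.
    """
    problems = []
    for key in REQUIRED_DECODER_KEYS:
        value = decoder_cfg.get(key)
        if value is None:
            problems.append(f"{key}: not set in config")
        elif not (Path(root) / value).is_file():
            problems.append(f"{key}: file not found: {value}")
    return problems


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    example = {
        "model": "models/new/final.mdl",
        "word-syms": "models/new/words.txt",
        "fst": "models/new/HCLG.fst",
    }
    for problem in missing_decoder_files(example):
        print(problem)
```

An empty result only says the files are present, not that the installed Kaldi version can load them.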

As for the scripts for building models, the ones currently in the server/lm-script folder were made to work with the ASpIRE chain model's build process. I remember looking at the LibriSpeech model in 2020 and being put off because it used a g2p (grapheme-to-phoneme) program to make the lexicon files. Supporting that would have required changes to the Dockerfile as well as rewriting all the scripts I had working. But there is no reason it wouldn't work if you wanted to install g2p and play around with the scripts. If you look at the run.sh file in https://github.com/kaldi-asr/kaldi/tree/master/egs/gigaspeech/s5 you'll see that some of the "stages" run commands related to the lexicon and dictionary. Basically, those need to be run with the new dictionary and language model files located in the right place to be used as input. Then the final stages (the ones that run utils/mkgraph.sh) have to be run again to make the final model dir that will be used by the server.
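As a rough sketch of that sequence, the snippet below assembles the three commands that typically turn a dictionary dir and an ARPA language model into a decoding graph: `utils/prepare_lang.sh`, `utils/format_lm.sh`, and `utils/mkgraph.sh`. It only builds and prints the command lines; it does not run them. The directory layout, the `<UNK>` OOV symbol, and the `build_graph_commands` helper are assumptions for illustration, and the actual stages in the GigaSpeech run.sh may use different scripts or options. `--self-loop-scale 1.0` is the usual setting for chain models.

```python
import shlex
from pathlib import Path


def build_graph_commands(dict_dir, arpa_lm, model_dir, work_dir, oov="<UNK>"):
    """Build the command lines for rebuilding a decoding graph.

    dict_dir: dir holding lexicon.txt and the phone lists (assumed layout).
    arpa_lm: path to the new ARPA language model.
    model_dir: dir holding the trained acoustic model (final.mdl, tree).
    work_dir: scratch dir for the generated lang dirs.
    oov: dictionary entry used for out-of-vocabulary words (assumed value).
    """
    work = Path(work_dir)
    lang = work / "lang"
    lang_test = work / "lang_test"
    graph = Path(model_dir) / "graph"
    lexicon = Path(dict_dir) / "lexicon.txt"
    return [
        # 1. Build the lang dir (L.fst, words.txt, phones) from the dictionary.
        ["utils/prepare_lang.sh", str(dict_dir), oov,
         str(work / "lang_tmp"), str(lang)],
        # 2. Compile the ARPA language model into G.fst inside lang_test.
        ["utils/format_lm.sh", str(lang), str(arpa_lm),
         str(lexicon), str(lang_test)],
        # 3. Compose the final HCLG graph that the server decodes with.
        ["utils/mkgraph.sh", "--self-loop-scale", "1.0",
         str(lang_test), str(model_dir), str(graph)],
    ]


if __name__ == "__main__":
    # Hypothetical paths; run the printed commands from a Kaldi egs dir
    # so that the utils/ symlink resolves.
    commands = build_graph_commands(
        dict_dir="data/local/dict",
        arpa_lm="data/local/lm/lm.arpa.gz",
        model_dir="exp/chain/tdnn",
        work_dir="data",
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to compare them against the matching stages in run.sh before running anything.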

I'll take a look at the GigaSpeech model later, when I have more time, to see how much work it would be to get everything updated. I'm also a little worried that kaldi-gstreamer-server looks like it hasn't been updated since 2020, but Kaldi is still very active, so maybe the server is just stable and works with the newer models.
