Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Major code cleanup #124

Closed
wants to merge 31 commits into from
Closed

Major code cleanup #124

wants to merge 31 commits into from

Conversation

Holzhaus
Copy link
Member

@Holzhaus Holzhaus commented Aug 4, 2014

This fixes #123 and also fixes #119. It partly resolves #51.

You now should use "client.jasperpath"-module for all path related stuff (instead of hardcoding the strings). There's room for improvement (e.g. put the paths into a config file), but anyhow, this should do it for the moment.

Also, you really need to use the "tempfile"-module, which will create temporary files in the right location (and deletes them afterwards).

Last but not Least: I really don't get all this crazy sys.path-manipulation...what's the deal with that? I removed it for now.

I'll hope you can use these changes.

Fun fact: I did this to make it possible to create a ArchLinux PKGBUILD.

This is a rather big change. I replaced hardcoded filenames for tempfiles, that were written inside the cwd by using the tempfile module.

Hardcoded paths (most of them, I assume) were replaced by the brand new and shiny jasperpath module.

Also, some minor bugs were fixed and manipulation of sys.path has been removed.
@walchko
Copy link

walchko commented Aug 6, 2014

Great job!

PHONETISAURUS_PATH + "/phonetisaurus/dictionary.dic")
sentences_path = jasperpath.config('sentences.txt')
dictionary_path = jasperpath.config('dictionary.dic')
translateFile(sentences_path,dictionary_path)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: can you put a space between the arguments?

@charliermarsh
Copy link

This is great! Thanks a ton for doing this--really appreciate it. Most of my comments are just style nits (sorry to be picky). @shbhrsaha is going to test this on a device in the near future (I don't have one handy at the moment); hopefully we can get this merged in soon.

@Holzhaus
Copy link
Member Author

Holzhaus commented Aug 7, 2014

Currently no need to test this. I'm still making improvements and found some minor bugs in my code. I'll let you know as soon as it's finished ;-)

@Holzhaus
Copy link
Member Author

Right now, I think that some of the above bugs should be handled differently:

g014b2b is part of:
Precompiled train/test distribution for the unaccented CMUdict g014b2b distribution. Performance on this is %75.5 using the basic decoder. We basically just want to use its FST model with phonetisaurus-g2p.

So why not put the file path of the FST model into the config file? We could create some kind of "advanced settings" namespace, so that the user can use his or her own FST model to improve recognition rate, if he wants to.
The same thing goes for hub4wsj_sc_8k.

The config file contains something something like:

advanced_settings:
  fst: '/home/pi/phonetisaurus/g014b2b.fst'
  hmm: '/usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k'

If jasper-client is packed for other distributions, the package mantainer can just change the default paths inside the config path template via sed.

@shbhrsaha
Copy link
Member

Ah, good solution. Yes, and we can specify in the docs for Jasper disk image users to configure their setup accordingly.

I'll be able to test this again this weekend. Will post an update then.

@the01 the01 mentioned this pull request Aug 29, 2014
Removed the HMM_PATH and PHONETISAURUS_MODEL stuff from the jasperpath module. Added default values for them to stt/g2p and made them overwritable by config
@Holzhaus
Copy link
Member Author

Holzhaus commented Sep 1, 2014

@shbhrsaha Problem solved. We just check if we have the advanced_settings in the profile. If so, we overwrite the hardcoded defaults. This should work for the moment (until we have the config module).

@shbhrsaha
Copy link
Member

Great, thanks for putting that in.

I gave it another shot this weekend. The problem that surfaces now is that, on the Pi, the espeak voice is really slow and choppy. In addition, Jasper isn't responding to microphone input. I imagine the first might have to do with overwhelming the Pi's resources-- I'll try profiling tomorrow to try to share more details

As for mic input, @Holzhaus , how has the experience been running this refactored activeListen on your computer? Does it work smoothly? I've ruled out mic levels as a factor because the master version appears to be responding fine on the same settings

@Holzhaus
Copy link
Member Author

Holzhaus commented Sep 2, 2014

Well, that espeak issue might be caused by either the Rpi hardware being too slow to output something via PyAudio instead of aplay (you should check the CPU via top) OR It might somehow have to do with the samplerate not being supported by the card. I did not have the problem because I'm using either PulseAudio or type plug as my default card in /etc/asound.conf (which I believe will upsample the sounds to 44100 Hz). I'll investigate.

Regarding the Mic module: Does passiveListen() work fine and only the activeListen() part after the beep is flawed?

@Holzhaus
Copy link
Member Author

Holzhaus commented Sep 2, 2014

Ok, just tested it again. Even tough it takes ages (>20 attempts) until pocketsphinx recognizes 'Jasper' - if it does, it works fine for me.

@shbhrsaha
Copy link
Member

Got it, thanks for checking. Yes, that passiveListen is the troubling one because I too require many, many attempts to get it to trigger, while master works on the first 2-3 tries on same setup. Possible discrepancy between the two passiveListen implementations, will look into it

@Holzhaus Holzhaus mentioned this pull request Sep 3, 2014
@Holzhaus
Copy link
Member Author

Holzhaus commented Sep 5, 2014

Strange . I thought that bad recognition rate was caused by thr lack of good opensource acoustic models. But if the master branch performs better, something has to be rotten in the state of mic.py.

Possible way to check this might be to keep the wave files to analyze them in audacity.

@charliermarsh
Copy link

Any way we can carve out some of the smaller bugfixes here and merge them in? Perhaps:

  • Shebangs
  • Revised dictionary arguments
  • Anything else that's small and closes a few issues

@Holzhaus
Copy link
Member Author

Holzhaus commented Sep 5, 2014

The Problem is, that sometimes my commits are not just addressing one issue. They should, and now I know why. ;-)

  • You could tell me which commits to git cherry-pick and I'll create a new branch and pull request OR
  • I'll start a new branch from commit 1e13069 (the commit directly before mic.py has been changed) and...
    1. I add some code to detect and move existing config files to the config folder
    2. I add the boot-folder and start.sh again as wrapper files that just call jasper-client. This way, we keep the clean code, but the current user base will still be able to run jasper without changing their crontabs. In v2, we'll just have to delete the boot/-folder and start.sh again.
    3. We can also cherry-pick some later commits, if we want to.

@charliermarsh
Copy link

Yeah, this is getting messy. I suppose if we're just trying to fix this one recognition issue in mic.py then we might as well wait and merge it all together--it'll probably be less work, in the end.

Either way, we should keep start.sh as a wrapper file for backwards compatibility, as this is all part of v1.0.

@charliermarsh
Copy link

@Holzhaus To separate the low-hanging fruit, and if it's not too much work, each of these could be a separate pull request:

  1. Add shebangs (and close shebang is missing? #152)
  2. Fix dictionary arguments (and close Wrong dictionary used in mode music #133)
  3. Transcription in a separate thread / close PyAudio stream (and close PyAudio stream not closed #33)
  4. Add new speakers
  5. jasperpath
  6. ... Everything else?

I'm can work on breaking this up myself, but thought you might want to send them in as you wrote the original code.

@shbhrsaha
Copy link
Member

As an aside: the passiveListen in this implementation actually does seem to trigger at the right times, but with a ~10-second delay. Dropping the master's mic.py into this one makes that delay shorter, but the program is still generally sluggish (probably because of Pi resource limitations).

This suggests the tunable parameters are probably correct in passiveListen, but maybe we're running up against resource constraints. Let's go ahead with Charlie's suggestion to break this PR into those smaller ones that can be merged in more easily, while keeping a close eye on program performance when testing each PR. Might help to narrow down the cause of the sluggishness.

@alexsiri7
Copy link
Contributor

I agree with the separate-pull-requests idea. I think there are some great ones here, but as they are all together, it's harder to review and merge them afterwards. And a big PR needs to be kept up-to-date while things in the codebase change, smaller ones are easier to maintain, and get merged faster.

@Holzhaus Holzhaus deleted the master branch October 13, 2014 18:06
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants