Skip to content

Remove transcription mode #219

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 5, 2014

Conversation

Holzhaus
Copy link
Member

@Holzhaus Holzhaus commented Oct 9, 2014

If you look at jasper.py and client/mic.py, we're using the Mic class like this:

class Mic:
    def__init__(self, speaker, passive_stt_engine, active_stt_engine):

So we pass in two different STT Engine instances:

  1. An STT Engine instance for passive listen
  2. An STT Engine Instance for active listen

But PocketsphinxSTT takes up to 3 lm/dict pairs:

  1. A pair for passive listen
  2. A pair for active listen
    3 A pair for active listen (musicmode)

This is a lot duplication, because in jasper.py, we're creating two separate SST Engine Instances for active listen and passive listen, so that the active listen STT Engine instance will only use the second lm/dict pair and the passive listen STT instance will only use the first pair.
In MusicMode, a third STT instance will be created that only uses the third pair.

Given the case that someone wants to write a new module that also has the ability to start a mode like the MusicMode, he'd either have to hijack the music lm/dict pair, or he'd need to add a new mode to STT Engines and change the PocketsphinxSTTEngine code.

Thus, I'd like to simplify the STT engine dramatically by removing two of the three dict pairs (or rather Vocabulary Instances) from the PocketsphinxSTT engine and the mode parameter in transcribe accordingly.

Also, there's no need to case about custom engine-specific settings, because the get_engine() classmethod takes care of that for you.

This vastly simplifies how STT engines are instantiated. Basically, you only need these steps:

import stt

engine = stt.get_engine_by_slug('sphinx')
# Alternative
engine = stt.get_engine_by_slug('google')

# Convenience method: Instance for passive listen
stt_instance_passive = engine.get_passive_instance() 
# Convenience method: Instance for active listen
stt_instance_active = engine.get_active_instance()

# Generic method (e.g. used by MusicMode (MPDControl.py)
stt_instance_custom = engine.get_instance('my_vocab', ['SOME', 'PHRASES', 'TO', 'RECOGNIZE'])

# Now create a mic:
mic1 = Mic(tts_instance, stt_instance_passive, stt_instance_active)

# A module like musicmode now can simply do this:
mic2 = Mic(mic1.speaker, mic.passive_stt_engine, stt_instance_custom)

So what happens inside get_instance() (which is also called by the convenience methods get_passive_instance()/get_active_instance())?

  • the configuration for this STT engine is retrieved
  • if this STT engine needs a vocabulary ('sphinx' does, but 'google' does not):
    • a vocabulary will be initialized with the name my_vocab.
    • if this vocabulary does not exist yet or if it doesn't match all phrases, it'll be (re)compiled
  • an STT engine instance is created with config (and vocabulary if neccessary) and returned

Because every STT engine only has one vocabulary, the transcribe() method becomes less complex, because it doesn't need the mode argument anymore.

@Holzhaus Holzhaus force-pushed the remove-transcription-mode branch from d39ccbe to 27f2cb5 Compare October 13, 2014 12:06
@Holzhaus Holzhaus force-pushed the remove-transcription-mode branch from 27f2cb5 to 7155c0e Compare October 13, 2014 12:08
@coveralls
Copy link

Coverage Status

Coverage increased (+2.19%) when pulling 7155c0e on Holzhaus:remove-transcription-mode into 32218ec on jasperproject:master.

@Holzhaus
Copy link
Member Author

@crm416 @shbhrsaha Can you review this, please?

@shbhrsaha
Copy link
Member

Looks slick. I'll give this a test this weekend and report back!

update: haven't forgotten about this! will post soon

Holzhaus added a commit to Holzhaus/jasper-client that referenced this pull request Oct 20, 2014
@shbhrsaha
Copy link
Member

This works great! Thank you for putting up with the wait. Should be good for merge.

@Holzhaus Holzhaus force-pushed the remove-transcription-mode branch from 7155c0e to 8bac1e6 Compare November 5, 2014 13:37
@coveralls
Copy link

Coverage Status

Coverage increased (+1.93%) when pulling 8bac1e6 on Holzhaus:remove-transcription-mode into c44b772 on jasperproject:master.

Holzhaus added a commit that referenced this pull request Nov 5, 2014
@Holzhaus Holzhaus merged commit 56f4433 into jasperproject:master Nov 5, 2014
@Holzhaus Holzhaus deleted the remove-transcription-mode branch December 2, 2014 13:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants