
Plugin System #130

Closed
Holzhaus opened this issue Aug 12, 2014 · 8 comments

@Holzhaus
Member

Jasper should use some kind of plugin architecture (e.g. Yapsy), so that it's easy to activate, deactivate, and add modules.

The Plugin interface class could also define some abstract methods (via the @abstractmethod decorator) to tell plugin developers which methods are needed for a working plugin.
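A minimal sketch of what that interface could look like (class and method names here are hypothetical, just for illustration):

```python
from abc import ABC, abstractmethod

class Plugin(ABC):
    """Hypothetical base class that all Jasper plugins would inherit from."""

    @abstractmethod
    def get_phrases(self):
        """Return the phrases this plugin listens for."""

    @abstractmethod
    def handle(self, text, mic):
        """React to a transcribed phrase, e.g. by speaking via `mic`."""

class TimePlugin(Plugin):
    """Example plugin implementing both required methods."""

    def get_phrases(self):
        return ["TIME"]

    def handle(self, text, mic):
        mic.say("It's twelve o'clock.")
```

The nice part: a plugin that forgets to implement one of the abstract methods raises a TypeError at instantiation time, so broken plugins fail loudly instead of silently.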

This would also make it easier for the main code to access available/activated plugins (e.g. in brain.py or vocabcompiler.py), especially if we want to load plugins from different locations (the Jasper base plugins from the installation directory and user-installed plugins in ~/.jasper-client/plugins/).

What do you think?

@charliermarsh

Never seen Yapsy before--thanks for pointing it out! I like this idea. It's unclear to me how much dev work it would require (probably not a ton), but it seems like we could plug this in fairly easily. Maybe the majority of the effort would be in re-documenting everything?

@Holzhaus
Member Author

Well, it depends. To start, we could just add a PluginManager class and a Plugin class that responds to transcribed speech (named SpeechHandlerPlugin or something like that), which would act as the base class that all current plugins inherit from.

Later, other types of plugins could be added, e.g.:

  • EventHandlerPlugins, which react to events (like a specific date/time, incoming emails, incoming Asterisk calls, start/stop of XBMC playback, etc.) by doing something (like mic.say("It's 5 o'clock.") or pausing mic listening during XBMC playback, or whatever)
  • TTSPlugins for espeak, festival, pico, Google, Cepstral, whatever (so that the user can simply write a plugin for their favourite engine to add support to Jasper)
  • STTPlugins like Pocketsphinx, Google STT, Julius/HTK, DragonFly, Windows built-in speech recognition, etc.
  • ....
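The PluginManager/SpeechHandlerPlugin split could be sketched roughly like this (all names and the is_valid()/handle() protocol are assumptions, not decided API):

```python
class PluginManager:
    """Hypothetical registry that dispatches transcribed speech to
    the first plugin that claims the input."""

    def __init__(self):
        self._plugins = []

    def register(self, plugin):
        self._plugins.append(plugin)

    def dispatch(self, text, mic):
        """Hand `text` to the first plugin whose is_valid() accepts it.

        Returns True if some plugin handled the phrase, False otherwise."""
        for plugin in self._plugins:
            if plugin.is_valid(text):
                plugin.handle(text, mic)
                return True
        return False

class ClockPlugin:
    """Example SpeechHandlerPlugin-style plugin (duck-typed for brevity)."""

    def is_valid(self, text):
        return "TIME" in text.upper()

    def handle(self, text, mic):
        mic.say("It's twelve o'clock.")
```

brain.py could then shrink to little more than a dispatch() call, with all module-specific logic living in registered plugins.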

@Holzhaus
Member Author

General thoughts

I'm a big fan of modularity and customizability, so I think Jasper should only be the core software that connects the stuff together. Everything else should be put into plugins (which does not mean that we can't put a lot of useful plugins into the standard distribution).

So, there should be a number of different plugin types:

  1. STT plugins
  2. TTS plugins
  3. InputHandler plugins
  4. EventHandler plugins
  5. Input plugins (?)
  6. Output plugins (?)

Every plugin should have a '[Dependencies]' section in its description file that lists its dependencies (obviously).

This way, we can keep Jasper's core dependencies very small (right now we already need a bunch of PyPI packages that are required by modules the user might not want to use at all).
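A Yapsy-style description file with such a section might look like this ([Core] and [Documentation] are Yapsy's standard sections; the [Dependencies] section and its keys are hypothetical):

```ini
[Core]
Name = Pocketsphinx STT
Module = stt_pocketsphinx

[Documentation]
Description = Offline speech recognition via CMU Pocketsphinx

[Dependencies]
; Hypothetical extension: what must be installed before activation
Python = pocketsphinx
System = openfst, mitlm, m2m-aligner, phonetisaurus
```

The plugin manager could refuse to activate a plugin whose listed dependencies aren't importable/installed, instead of crashing at runtime.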

Example

The Google STT/TTS services don't need anything except a working internet connection and an API key, so if you use them, you don't have to install anything else.
On the other hand, if you want to use Pocketsphinx, you'll need to install pocketsphinx, openfst, mitlm, m2m-aligner, phonetisaurus and g014b2b first, before the dependencies of the pocketsphinx plugin are fulfilled.

Also, this gives us the possibility to make Jasper more platform-independent (if you use Google TTS, Dragon Naturally Speaking or Windows Speech Recognition instead of Pocketsphinx, Jasper could easily run on Windows).

Plugin types

1. Speech-to-Text (STT) plugins

These are the plugins that do the transcription. There can be various ones, e.g.:

  • Pocketsphinx plugin
  • Google STT plugin
  • Julius plugin
  • sphinx4 plugin (?)
  • Nuance Dragon Naturally Speaking plugin (?)
  • Microsoft Windows Speech Recognition plugin (?)
  • ...

They inherit from some kind of abstract class (e.g. "AbstractSTTPlugin") that acts as an interface and requires subclasses to implement a transcribe() method.

2. Text-to-Speech (TTS) plugins

These do the opposite: they take some text and transform it into speech, e.g.:
  • espeak plugin
  • pico plugin
  • Google TTS plugin
  • festival plugin (?)
  • Cepstral plugin (?)
  • ...

They inherit from some kind of abstract class (e.g. "AbstractTTSPlugin") that acts as an interface and requires subclasses to implement a synthesize() method.
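Both interfaces together would be only a few lines; a sketch (the class names match the suggestions above, the method signatures are assumptions):

```python
from abc import ABC, abstractmethod

class AbstractSTTPlugin(ABC):
    """Hypothetical interface for speech-to-text engines."""

    @abstractmethod
    def transcribe(self, audio_data):
        """Return the transcription of raw audio bytes as a string."""

class AbstractTTSPlugin(ABC):
    """Hypothetical interface for text-to-speech engines."""

    @abstractmethod
    def synthesize(self, text):
        """Return synthesized audio bytes for `text`."""

class DummySTT(AbstractSTTPlugin):
    """Toy engine for illustration only: treats the 'audio' as UTF-8 text."""

    def transcribe(self, audio_data):
        return audio_data.decode("utf-8").upper()
```

A Pocketsphinx plugin, a Google plugin, etc. would each subclass these, and the core would only ever call transcribe()/synthesize() without knowing which engine is behind it.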

3. InputHandler plugins

These are plugins like most modules in client/modules. They receive some text, parse it, and then either do something, yield some output, or both.
The output will be delegated to some kind of output handler (e.g. Speaker/TTS, see below).

4. EventHandler plugins

These plugins handle events instead of text (i.e. transcribed speech) input. Events could be:

  • Gmail account received new E-Mails
  • It's twelve o'clock
  • Some command was issued by the JSON-RPC API
  • lircd received a button press from an infrared remote
  • FM receiver connected via GPIO received a signal
  • ....

They either do something instantly (like sending you an SMS) or put some phrase into the input queue so that it can be handled by one of the InputHandler plugins.
Examples:

  • Every day at twelve o'clock, the TimeEventHandler plugin puts the phrase "What time is it?" into the input queue, so that Jasper evaluates the phrase and says "It's twelve o'clock."
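The queue-injection idea above could be sketched like this (the shared queue and the on_event() hook are assumed design, not existing API):

```python
import queue

# Shared queue between event handlers and the speech pipeline
# (assumption: the brain consumes phrases from here).
input_queue = queue.Queue()

class TimeEventHandler:
    """Hypothetical EventHandler plugin: reacts to a clock event by
    injecting a phrase, exactly as if the user had spoken it."""

    def on_event(self, event):
        if event == "noon":
            input_queue.put("What time is it?")

TimeEventHandler().on_event("noon")
```

The appeal of this design is that event handlers don't need to know anything about the InputHandler plugins; they reuse the normal speech path.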

5./6. I/O plugins

I'm not so sure about these (IMHO they can be postponed to a later version or left out completely, if you think that these go beyond the scope of Jasper). The basic idea is that they control how text is fed into Jasper and how it's put out. Basically, Mic will be an input plugin and (Loud)Speaker will be an output plugin, but there's room for more:

  • A text-based plugin (like we are doing with the local_mic module)
  • JSON-RPC/XML-RPC (issue: receive commands from other devices #143)
  • Listening for commands and answering with output via XMPP/Jabber (input & output)
  • Some kind of smartphone app that transcribes on the phone, sends the text to Jasper, receives its output and synthesizes it back to speech on the phone.

To achieve this, we could stop passing input phrases around as strings and instead use InputPhrase objects that have an output_to attribute, which tells Jasper how the output should be handled (e.g. read it aloud via a TTS plugin on the speakers, or return it as text via Jabber).
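A tiny sketch of the InputPhrase idea (the class, the callable-sink convention, and the handler are all illustrative assumptions):

```python
class InputPhrase:
    """Hypothetical replacement for bare strings in the pipeline."""

    def __init__(self, text, output_to):
        self.text = text
        # output_to is any callable sink: a TTS speaker, a Jabber
        # connection, an HTTP response writer, ...
        self.output_to = output_to

def handle_phrase(phrase):
    """Toy handler: the core no longer assumes a loudspeaker; it just
    sends the answer wherever the phrase says it should go."""
    if "time" in phrase.text.lower():
        phrase.output_to("It's twelve o'clock.")

replies = []
handle_phrase(InputPhrase("What time is it?", output_to=replies.append))
```

That way the same InputHandler plugin works unchanged whether the phrase arrived via microphone, XMPP, or an RPC call.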

@Holzhaus
Member Author

Holzhaus commented Sep 1, 2014

An example implementation for the STT stuff can be found here.

[Disclaimer: Couldn't test it.]

@Holzhaus Holzhaus modified the milestone: v2.0 Sep 11, 2014
@Holzhaus Holzhaus removed the v2.0 label Sep 11, 2014
@androbwebb

It would be nice to handle module priorities through the plugin manager, too.

@androbwebb

Also, events should have configurable "Do Not Disturb" hours.

@Holzhaus
Member Author

Superseded by issue #280.

@Holzhaus Holzhaus mentioned this issue Oct 15, 2015
@GustavoMSevero

Guys, do you know ChatterBot? Do you know if there is a way to implement it in Jasper? Or do you know how to teach Jasper new dialogs?
