❗❗❗ This repo is now deprecated. Check the Vonage Developer Blog for more blog posts and tutorials. For more sample Vonage projects, check the Vonage Community GitHub repo.
This is a basic Python Tornado app for handling websocket audio from the Nexmo Voice API.
It is an ideal starting point for interfacing between Nexmo and an AI bot platform.
- Generates an NCCO that connects an incoming call to the server's /socket endpoint, with the caller's CLI passed in a metadata field (see the example NCCO after this list)
- Exposes a websocket server to receive the connection from Nexmo
- Maintains a dict of connection objects, keyed by CLI
- Breaks the stream of incoming audio into discrete buffered objects, using Voice Activity Detection to separate blocks of speech
- Provides a function to play back audio to a caller by CLI
- Provides a handler to print events to the console
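For reference, the connect NCCO returned by the /ncco endpoint looks roughly like the sketch below; the exact websocket URI, content type and the name of the metadata header carrying the CLI are illustrative, so check the NCCO the code actually builds.

# A minimal sketch of a connect NCCO (field values are illustrative;
# the real handler builds this from the incoming call details)
ncco = [
    {
        "action": "connect",
        "endpoint": [
            {
                "type": "websocket",
                "uri": "wss://YOUR-HOSTNAME/socket",
                "content-type": "audio/l16;rate=16000",
                # the CLI travels in a metadata header so the socket
                # handler can identify the connection; the header name
                # here is illustrative
                "headers": {"cli": "447700900000"},
            }
        ],
    }
]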
You'll need Python 2.7, and we recommend you install the code inside a python virtualenv. You may also need header files for Python and OpenSSL, depending on your operating system. The instructions below are for Ubuntu 14.04.
sudo apt-get install -y python-pip python-dev libssl-dev
pip install --upgrade virtualenv
virtualenv venv
source venv/bin/activate
pip install -r requirements.txt
The application reads its configuration from an app.conf file in the root directory. A template file is included in the repo; rename it to app.conf and enter the appropriate values.
The port configuration variable is only required if you don't want to host on port 8000. If you're running ngrok and you're not using port 8000 for anything else, just run ngrok http 8000 to tunnel to your Audiosocket service.
This is where the framework application will store the audio files it records from the websocket; it is only needed for the basic demo.
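As a rough illustration only (the key names and file format below are assumptions; defer to the template bundled in the repo), the values you end up setting are along these lines:

# Illustrative sketch of app.conf - use the repo's template for the
# exact key names and format
host = "your-subdomain.ngrok.io"   # public hostname of the server
port = 8000                        # only needed if not using port 8000
path = "./recordings"              # where the basic demo saves wav files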
Your audiosocket server needs to be available on a publicly hosted server. If you want to run it locally, we recommend running ngrok to create a publicly addressable tunnel to your computer.
Whatever public hostname you have, you should enter it into app.conf. You'll also need to know this hostname for the next step, creating a Nexmo application.
Use the Nexmo command-line tool to create a new application and associate it with your server (substitute YOUR-HOSTNAME with the hostname you've put in your HOST config file):
nexmo app:create "Audiosocket Demo" "https://YOUR-HOSTNAME/ncco" "https://YOUR-HOSTNAME/event"
If it's successful, the nexmo tool will print out the new app ID and a private key. Put these, respectively, in config/APP_ID and config/PRIVATE_KEY.
If you need to, find and buy a number:
# Skip the first 2 steps if you already have a Nexmo number to use.
# Replace GB with your country-code:
nexmo number:search GB --voice
# Find a number you like, then buy it:
nexmo number:buy [NUMBER]
# Associate the number with your app-id:
nexmo link:app [NUMBER] [APPID]
Now you can start the audiosocket service with:
./venv/bin/python server.py
If you want more verbose logging messages, add a -v flag to the startup command to see all DEBUG level messages.
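For example, starting with verbose logging enabled:

./venv/bin/python server.py -v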
This framework is meant to be a starting point for integrating whatever voice processing solution you desire. Within the Processor class there is a function named process; modify this to do whatever you want with the blocks of speech, for example posting them to a transcription API. The current code just saves them to wav files, which is useful for debugging, but you do not have to write to the filesystem if you don't need to store the audio.
Configuration for the processor, for example API keys for third-party services, can be passed in when the processor object is created at line 219. You will see in the demo that a path object is passed in to tell the processor where to save files; again, this can be removed if not required.
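As a minimal sketch of that idea, assuming a hypothetical transcription endpoint, and treating the constructor arguments and the process signature shown here as assumptions (check the Processor class in server.py for the real ones):

import requests  # assumed extra dependency, not part of the framework

class Processor(object):
    def __init__(self, path, transcribe_url=None, api_key=None):
        # path is what the demo passes in today; transcribe_url and
        # api_key are illustrative extra config for a third-party service
        self.path = path
        self.transcribe_url = transcribe_url
        self.api_key = api_key

    def process(self, cli, payload):
        # payload is one VAD-delimited block of speech; instead of
        # writing it to a wav file, post it to a transcription API
        response = requests.post(
            self.transcribe_url,
            data=payload,
            headers={
                "Authorization": "Bearer {0}".format(self.api_key),
                "Content-Type": "audio/l16;rate=16000",
            },
        )
        print("Transcription for {0}: {1}".format(cli, response.text))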
You can also play back audio responses to the caller using the playback function of the Processor. This uses the CLI of the caller as an identifier to ensure the audio is played to the correct connection, so you need to track the CLI in your requests and pass it in as a parameter along with wav or raw audio content in the response.
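For example, a sketch of sending a response back to a caller, where the argument names and order are assumptions to be checked against the playback function in server.py:

# cli was captured when the caller's websocket connection was opened
with open("response.wav", "rb") as f:
    wav_bytes = f.read()

# hand the audio to the framework so it is streamed to that caller's
# websocket connection
processor.playback(cli, wav_bytes)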