Speech to Text Python command

Simple command line tool to create text transcripts out of audio files using IBM Watson Speech to Text.

Install

Installing from PyPI is the easiest way:

$ pip install speech-to-text

Or install the development version:

$ git clone https://github.com/rmotr/speech-to-text
$ cd speech-to-text
$ mkvirtualenv speech-to-text
$ pip install -r requirements.txt

Usage

The first thing you'll need to do is get your Bluemix username and password. This is a tedious process; if you have issues, we've written a blog post that describes how to do it. Once you have your username and password, you can run:

$ speech_to_text -u <MY-USERNAME> -p <MY-PASSWORD> -f html -i <AUDIO-FILE> transcript.html

(You can omit the password option, in which case you'll be prompted to type the password securely.)

The -i option takes the audio file that you want to transcribe, and the tool stores the text in transcript.html in HTML format. To select a different format, see below.
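If you want to drive the command from a script, the invocation above can be assembled as an argument list. This is only a sketch: build_command is a hypothetical helper, not part of the package, and it uses just the options shown in this README (-u, -p, -f, -i and the output file).

```python
def build_command(username, audio_path, output_path, fmt="html", password=None):
    """Assemble the argument list for the speech_to_text command shown above.

    Hypothetical helper for illustration; only the flags documented in this
    README are used.
    """
    cmd = ["speech_to_text", "-u", username]
    if password is not None:
        # Leaving -p out makes the tool prompt for the password instead.
        cmd += ["-p", password]
    cmd += ["-f", fmt, "-i", audio_path, output_path]
    return cmd


# Example: no password given, so the tool would prompt for it when run.
example = build_command("my-username", "talk.wav", "transcript.html")
print(" ".join(example))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the package is installed. Building a list rather than a single string avoids shell quoting problems with file names that contain spaces.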

Formatters

There are currently four built-in formatters: html (default), markdown, json, and original. Pass any of these names to the -f option to select that formatter.
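A wrapper script might check the formatter name before calling the tool. The sketch below assumes nothing beyond the four names listed above; FORMATTERS and resolve_formatter are hypothetical names, and how the tool itself reacts to an unknown name is not covered here.

```python
# The four formatter names listed in this README; html is the default.
FORMATTERS = ("html", "markdown", "json", "original")
DEFAULT_FORMATTER = "html"


def resolve_formatter(name=None):
    """Return a valid formatter name, falling back to the default.

    Hypothetical helper: raises ValueError for names not listed above.
    """
    if name is None:
        return DEFAULT_FORMATTER
    if name not in FORMATTERS:
        raise ValueError(
            "unknown formatter %r, expected one of: %s" % (name, ", ".join(FORMATTERS))
        )
    return name
```

For example, resolve_formatter() gives "html" and resolve_formatter("markdown") gives "markdown", while resolve_formatter("pdf") raises ValueError.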

Examples

Under the examples/ directory you can find a short audio file containing the first 30 seconds of Jacob Kaplan-Moss's keynote at PyCon 2015. The directory also contains the resulting transcripts (in HTML and Markdown format).

Watson Documentation

https://www.ibm.com/watson/developercloud/speech-to-text/api/v1/#recognize_sessionless_nonmp12

Reference

Supported audio file types:

  • audio/flac
  • audio/l16
  • audio/wav
  • audio/ogg;codecs=opus
  • audio/mulaw
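
To check a file before sending it, you could guess its type from the file extension and compare it against the list above. This is a sketch under stated assumptions: the extension mapping is illustrative only (it assumes .ogg files carry Opus audio, and it has no entry for audio/l16 or audio/mulaw, which have no single conventional extension), and guess_content_type is a hypothetical helper rather than something the tool is known to do.

```python
from pathlib import Path

# The supported types exactly as listed above.
SUPPORTED_TYPES = {
    "audio/flac",
    "audio/l16",
    "audio/wav",
    "audio/ogg;codecs=opus",
    "audio/mulaw",
}

# Assumed extension-to-type mapping, for illustration only.
EXTENSION_GUESSES = {
    ".flac": "audio/flac",
    ".wav": "audio/wav",
    ".ogg": "audio/ogg;codecs=opus",  # assumes the Ogg file holds Opus audio
}


def guess_content_type(path):
    """Return a supported type guessed from the extension, or None if unknown."""
    return EXTENSION_GUESSES.get(Path(path).suffix.lower())
```

A result of None means the extension is not in the mapping, not necessarily that the audio itself is unsupported.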