
mseri/kokoro-reader

Kokoro Reader

A simple text-to-speech CLI tool using the Kokoro speech synthesis engine.

Note: The #chatterbox branch of this repository includes an experimental version (chatterbox-reader.py) that uses the streaming version of Chatterbox instead of Kokoro. It is currently very slow and not suitable for practical use. The #moshi branch contains a version using Moshi TTS, which is even more experimental and not recommended for use.

Features

  • High-quality text-to-speech synthesis
  • Multiple voice options with different languages and genders
  • Automatic model download and caching
  • Interactive mode with command history
  • Speed control
  • Support for reading from files, URLs, or standard input

Installation

  1. Install uv for Python package management.
  2. Clone or download this repository.
git clone https://github.com/mseri/kokoro-reader.git
cd kokoro-reader

Usage

Basic Usage

uv run kokoro-reader.py [options]

The required model files will be automatically downloaded to ~/.cache/kokoro-reader on first run.
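
The download-on-first-run behaviour can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the helper name ensure_cached, the injected download callback, and the file name used in the usage note are all hypothetical. Only the cache location ~/.cache/kokoro-reader comes from this README.

```python
from pathlib import Path
from typing import Callable

# Cache location stated in this README.
CACHE_DIR = Path.home() / ".cache" / "kokoro-reader"


def ensure_cached(
    filename: str,
    download: Callable[[Path], None],
    cache_dir: Path = CACHE_DIR,
) -> Path:
    """Return the cached path for `filename`, downloading it only if missing.

    `download` is a hypothetical callback that writes the file to the path
    it is given; the real script may fetch its models differently.
    """
    target = cache_dir / filename
    if not target.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        # Write to a temporary name first so an interrupted download
        # never leaves a truncated file under the final name.
        partial = target.with_name(target.name + ".part")
        download(partial)
        partial.replace(target)
    return target
```

Calling ensure_cached("model.bin", fetch) twice would invoke fetch only the first time; the second call returns the cached path directly.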

Options

Option               Description
-f, --file FILE      Input text file
-u, --url URL        URL to extract text from
-v, --voice VOICE    Voice to use (default: af_bella)
-s, --speed SPEED    Speech speed (default: 0.8)
-l, --lang LANG      Language (default: en-us)
-i, --interactive    Run in interactive mode

Note: If neither -f/--file nor -u/--url is provided, text is read from standard input (stdin).
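
For readers who want to see how these options fit together, here is a minimal argparse sketch. The flags and defaults are taken from the options listed above; the function names build_parser and input_source are hypothetical, and the rule that -f takes precedence over -u when both are given is an assumption, since this README does not say how the script resolves that case.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options and defaults."""
    parser = argparse.ArgumentParser(prog="kokoro-reader.py")
    parser.add_argument("-f", "--file", metavar="FILE", help="Input text file")
    parser.add_argument("-u", "--url", metavar="URL", help="URL to extract text from")
    parser.add_argument("-v", "--voice", default="af_bella", help="Voice to use")
    parser.add_argument("-s", "--speed", type=float, default=0.8, help="Speech speed")
    parser.add_argument("-l", "--lang", default="en-us", help="Language")
    parser.add_argument("-i", "--interactive", action="store_true",
                        help="Run in interactive mode")
    return parser


def input_source(args: argparse.Namespace) -> str:
    """Decide where text comes from: file, URL, or stdin as the fallback.

    Assumption: a file wins over a URL if both are supplied.
    """
    if args.file:
        return "file"
    if args.url:
        return "url"
    return "stdin"
```

With no arguments, the parsed namespace carries the documented defaults and input_source reports "stdin", matching the note above.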

Examples

Read text from a file:

uv run kokoro-reader.py -f mytext.txt

Read text from a URL (extracts main content from web pages):

uv run kokoro-reader.py -u https://example.com/article

Use a specific voice:

uv run kokoro-reader.py -v bf_emma -f mytext.txt

Adjust speech speed:

uv run kokoro-reader.py -s 1.2 -f mytext.txt

Read from stdin:

echo "Hello, world!" | uv run kokoro-reader.py

Interactive Mode

Run in interactive mode:

uv run kokoro-reader.py -i

In interactive mode, you can:

  • Enter text directly (end with /EOT)
  • Read text from files or URLs without exiting the application
  • Use arrow keys to navigate input history
  • Edit input with left/right arrow keys
  • Use commands to change voice, language and speed settings
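
The /EOT convention for direct text entry can be illustrated with a small sketch. This is one plausible reading of the behaviour, not the script's actual implementation: it assumes input is collected line by line until a line ends with /EOT, and the helper name collect_until_eot is hypothetical.

```python
from typing import Iterable

EOT = "/EOT"  # end-of-text marker described in this README


def collect_until_eot(lines: Iterable[str]) -> str:
    """Join input lines until one ends with /EOT, dropping the marker.

    If the marker never appears, everything received is returned.
    """
    collected = []
    for line in lines:
        stripped = line.rstrip()
        if stripped.endswith(EOT):
            # Keep any text that precedes the marker on the same line.
            collected.append(stripped[: -len(EOT)].rstrip())
            break
        collected.append(stripped)
    return "\n".join(collected).strip()
```

For example, feeding it the two lines "Hello," and "world! /EOT" would yield a two-line string with the marker removed.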

Interactive Commands

Command     Description
TEXT        Enter text directly (must end with /EOT)
/f PATH     Read text from file
/u URL      Read text from URL
/v VOICE    Change voice
/v?         Show available voices with grade C or better
/l LANG     Change language
/s SPEED    Change speed
/q          Quit
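
A command dispatcher for the interactive commands listed above might look like the following sketch. The command strings come from this README; the function name parse_command and the action labels it returns are hypothetical, and the real script may structure this differently.

```python
# Commands that take one argument, mapped to hypothetical action labels.
ARGUMENT_COMMANDS = {
    "/f": "file",
    "/u": "url",
    "/v": "voice",
    "/l": "lang",
    "/s": "speed",
}


def parse_command(line: str) -> tuple[str, str]:
    """Split an interactive input line into (action, argument).

    Anything that is not a recognised command is treated as text to read.
    """
    stripped = line.strip()
    if stripped == "/q":
        return ("quit", "")
    if stripped == "/v?":
        return ("list_voices", "")
    head, _, rest = stripped.partition(" ")
    if head in ARGUMENT_COMMANDS:
        return (ARGUMENT_COMMANDS[head], rest.strip())
    return ("text", line)
```

Checking /v? before the generic /v branch matters here: otherwise the voice-listing command would never match, because its first token is /v? rather than /v.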

Available Voices

You can list the recommended voices (grade C or better) with the /v? command in interactive mode, or consult the tables below. The list includes American English, British English, and Italian voices, organized by gender and sorted by quality.

For additional languages and voice options, see the official documentation: https://huggingface.co/hexgrad/Kokoro-82M/blob/main/VOICES.md

Voice Quality Grades

Voices are graded from A (best) to F (worst). Only voices with grade C or better are recommended.

Best Voices by Language

American English (en-us)

Voice Name    Gender    Grade    Description
af_heart      Female    A        Best overall voice quality
af_bella      Female    A-       Default voice, excellent quality
af_nicole     Female    B-       Good quality
am_fenrir     Male      C+       Best male voice for American English

British English (en-gb)

Voice Name     Gender    Grade
bf_emma        Female    B-
bf_isabella    Female    C
bm_fable       Male      C

Italian (it)

Voice Name    Gender    Grade
if_sara       Female    C
im_nicola     Male      C
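
The "grade C or better" rule can be made concrete with a small ranking sketch. The voice grades are copied from the tables above; the helper names grade_rank and recommended are hypothetical, and two choices are assumptions rather than documented behaviour: a plus or minus shifts a grade by one step within its letter, and C- counts as below C.

```python
# Grades copied from the voice tables in this README.
VOICE_GRADES = {
    "af_heart": "A", "af_bella": "A-", "af_nicole": "B-", "am_fenrir": "C+",
    "bf_emma": "B-", "bf_isabella": "C", "bm_fable": "C",
    "if_sara": "C", "im_nicola": "C",
}

LETTERS = "ABCDEF"  # best to worst
MODIFIERS = {"+": -1, "": 0, "-": 1}  # assumption: one step within a letter


def grade_rank(grade: str) -> int:
    """Map a grade such as 'B-' to an integer; lower means better."""
    letter, modifier = grade[0], grade[1:]
    return LETTERS.index(letter) * 3 + MODIFIERS[modifier]


def recommended(voices: dict[str, str], threshold: str = "C") -> list[str]:
    """Return voice names at or above the threshold grade, best first."""
    limit = grade_rank(threshold)
    ranked = sorted(voices.items(), key=lambda item: grade_rank(item[1]))
    return [name for name, grade in ranked if grade_rank(grade) <= limit]
```

Under these assumptions every voice in the tables above passes the threshold, with af_heart and af_bella sorted first.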

Voice Naming Convention

Voice names follow a specific naming convention:

  • First letter: language (a=American, b=British, i=Italian)
  • Second letter: gender (f=female, m=male)
  • Followed by underscore and a name (e.g., af_bella = American Female Bella)
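
The naming convention above can be decoded mechanically, as in this short sketch. The prefix letters are those listed in this README; the function name describe_voice is hypothetical, and voices for other languages documented upstream would need extra entries in the lookup table.

```python
LANGUAGES = {"a": "American", "b": "British", "i": "Italian"}
GENDERS = {"f": "female", "m": "male"}


def describe_voice(voice: str) -> tuple[str, str, str]:
    """Split a voice name like 'af_bella' into (language, gender, name)."""
    prefix, separator, name = voice.partition("_")
    if not separator or len(prefix) != 2 or not name:
        raise ValueError(f"unexpected voice name format: {voice!r}")
    language_code, gender_code = prefix
    if language_code not in LANGUAGES or gender_code not in GENDERS:
        raise ValueError(f"unknown prefix in voice name: {voice!r}")
    return LANGUAGES[language_code], GENDERS[gender_code], name
```

For instance, describe_voice("af_bella") returns ("American", "female", "bella"), matching the example in the list above.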

For complete voice listings and documentation, see https://huggingface.co/hexgrad/Kokoro-82M/blob/main/VOICES.md

Dependencies

  • Python 3.12 or higher
  • kokoro-onnx - Core speech synthesis library
  • sounddevice - Audio output
  • requests - Network requests for downloading models
  • tqdm - Progress bars for downloads
  • prompt_toolkit - Interactive shell interface
  • trafilatura - Web content extraction for URL reading

Alternative Implementations

Experimental Chatterbox Version

The repository includes chatterbox-reader.py, which is an experimental implementation using the streaming version of Chatterbox (https://github.com/davidbrowne17/chatterbox-streaming). This version offers:

  • Different voice generation approach based on Chatterbox-streaming
  • Emotion exaggeration controls
  • Similar interface to the main Kokoro reader

However, be aware that:

  • It is significantly slower than the main Kokoro implementation
  • Currently not suitable for practical daily use
  • Limited voice options (no voice selection parameter)

To try the experimental version:

uv run chatterbox-reader.py [options]

Credits
