Mimic Recording Studio

Mimic Recording Studio
Recording Tips
Providing your recording to Mycroft for training
Contributions
Where to get support and assistance

The Mycroft open source Mimic technologies are Text-to-Speech engines which take a piece of written text and convert it into spoken audio. The latest generation of this technology, Mimic 2, uses machine learning techniques to create a model which can speak a specific language, sounding like the voice on which it was trained.

The Mimic Recording Studio simplifies the collection of training data from individuals. Each individual's recordings can be used to produce a distinct voice for Mimic.

Software Quick Start

Windows self-hosted Quick Start

git clone https://github.com/MycroftAI/mimic-recording-studio.git
cd mimic-recording-studio
start-windows.bat

Linux/Mac self-hosted Quick Start

Install Dependencies

Docker (community edition is fine)
Docker Compose

Why Docker? It makes the studio easy to set up and run across platforms.

Build and Run

git clone https://github.com/MycroftAI/mimic-recording-studio.git
cd mimic-recording-studio
Run docker-compose up to build and run. (Note: you may need to use sudo docker-compose up, depending on your distribution.)

Alternatively, you can build and run in separate steps: docker-compose build, then docker-compose up.
In your browser, go to http://localhost:3000

Note: The first execution of docker-compose up will take a while, because this command also builds the Docker images. Subsequent executions of docker-compose up should start more quickly.

Manual Install, Build and Start

Backend

Dependencies

Python 3.5+
ffmpeg

Build & Run

cd backend/
pip install -r requirements.txt
python run.py

Frontend

Dependencies

node & npm
create-react-app
yarn - optional for faster build, install, and start

Build & Run

cd frontend/
npm install, alternatively yarn install
npm start, alternatively yarn start

Coming soon!

An online hosted version at http://mimic.mycroft.ai, requiring zero setup.

Data

Audio Recordings

WAV files

Audio is saved as WAV files to the backend/audio_file/{uuid}/ directory. The backend automatically trims the beginning and ending silence for all WAV files using ffmpeg.
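
As a rough illustration of how silence trimming with ffmpeg can work, the sketch below builds a command using ffmpeg's silenceremove and areverse filters. This is not necessarily the command the backend runs: the function name, the file names and the -50 dB threshold are assumptions chosen for the example.

```python
from pathlib import Path


def build_trim_command(src, dst, threshold_db=-50):
    """Return an ffmpeg command that trims leading and trailing silence.

    The filter below only trims from the start of the audio, so the audio
    is trimmed, reversed, trimmed again, and reversed back. The second trim
    removes what was originally the trailing silence.
    """
    trim = f"silenceremove=start_periods=1:start_threshold={threshold_db}dB"
    filters = ",".join([trim, "areverse", trim, "areverse"])
    return ["ffmpeg", "-y", "-i", str(Path(src)), "-af", filters, str(Path(dst))]


if __name__ == "__main__":
    # Print the command instead of running it; hand the list to
    # subprocess.run(...) to actually execute ffmpeg.
    print(" ".join(build_trim_command("raw.wav", "trimmed.wav")))
```

A threshold that is too aggressive can clip quiet word endings, so it is worth listening to a few trimmed files before processing a whole session.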

{uuid}-metadata.txt

Metadata is also saved to backend/audio_file/{uuid}/. This file maps each WAV file name to the phrase spoken. This file, along with the WAV files, is what you need to get started on training Mimic 2.
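
To show how the mapping can be consumed, here is a minimal parsing sketch. The exact layout of the metadata file is not described above, so the code assumes one recording per line, with the file name first and the phrase second, separated by a "|" character. The delimiter, the function name and the sample lines are all assumptions; check them against your own {uuid}-metadata.txt.

```python
def parse_metadata(text, delimiter="|"):
    """Map each WAV file name to its phrase.

    Assumes the file name is the first field and the phrase the second;
    any further fields on the line are ignored.
    """
    mapping = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split(delimiter)
        if len(fields) < 2:
            raise ValueError(f"unexpected metadata line: {line!r}")
        mapping[fields[0].strip()] = fields[1].strip()
    return mapping


# Hypothetical sample content, not taken from a real recording session.
sample = "0001.wav|Hello there.|12\n0002.wav|How are you?|12\n"
print(parse_metadata(sample))
```

A phrase that itself contains the delimiter would be split incorrectly by this sketch, so a real loader may need stricter handling.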

Corpus

For now, we provide an English corpus, english_corpus.csv, which can be found in backend/prompt/. To use your own corpus, follow these steps.

Create a csv file in the same format as english_corpus.csv using tabs (\t) as the delimiter.
Make sure there are no empty lines in the corpus.
Add your corpus to the backend/prompt directory.
Change the CORPUS environment variable in docker-compose.yml to your corpus name.
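
Before pointing the backend at a new corpus, it can help to check the first two steps automatically. The sketch below flags empty lines and rows whose number of tab-separated fields differs from the first row. The function name is hypothetical, and the number of columns is not assumed; it is taken from whatever the first row contains.

```python
def check_corpus(text):
    """Return a list of problems found in tab-delimited corpus text."""
    problems = []
    expected_fields = None
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            problems.append(f"line {number}: empty line")
            continue
        count = len(line.split("\t"))
        if expected_fields is None:
            expected_fields = count  # first non-empty row sets the column count
        elif count != expected_fields:
            problems.append(
                f"line {number}: expected {expected_fields} "
                f"tab-separated fields, found {count}"
            )
    return problems


if __name__ == "__main__":
    # Hypothetical two-column rows used only to exercise the checks.
    broken = "Hello there.\t12\n\nHow are you?\n"
    for problem in check_corpus(broken):
        print(problem)
```

To check a real file, read it with open(path, encoding="utf-8").read() and pass the result to check_corpus.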

Corpora in other languages

Mimic Recording Studio can also be used to produce voice recordings for TTS voices in languages other than English. If you are building a corpus in another language, we encourage you to choose phrases which:

occur in natural, everyday speech in the target language
have a variety of string lengths
cover a wide variety of phonemes (basic sounds)

IMPORTANT: For now, you must reset the SQLite database to use a new corpus. If you've recorded on another corpus and would like to keep that data, rename the SQLite database file found in backend/db/. The backend will detect that mimicstudio.db is missing and create a new one for you. You can then continue recording data for your new corpus.
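
The rename can be done by hand, or with a small helper such as the sketch below. The function name and the new file name in the usage comment are hypothetical. It is probably safest to stop the backend before renaming the file, although that is an assumption rather than a documented requirement.

```python
from pathlib import Path


def archive_database(db_dir, new_name, db_name="mimicstudio.db"):
    """Rename the current database so the backend creates a fresh one."""
    source = Path(db_dir) / db_name
    target = Path(db_dir) / new_name
    if not source.exists():
        raise FileNotFoundError(f"{source} does not exist")
    if target.exists():
        # Refuse to overwrite an earlier archive.
        raise FileExistsError(f"{target} already exists")
    source.rename(target)
    return target


# Example call, run from the repository root:
# archive_database("backend/db", "mimicstudio-english.db")
```

To return to the earlier corpus later, reverse the rename so that the archived file is named mimicstudio.db again.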

Technologies

Frontend

The web UI is built using JavaScript and React, with create-react-app as a scaffolding tool. Refer to CRA.md to find out more about how to use create-react-app.

Functions

Record and play audio
Generate audio visualization
Calculate and display metrics

Backend

The web service is built using Python, with Flask as the backend framework, gunicorn as the HTTP web server, and SQLite as the database.

Functions

Process audio
Serve corpus and metrics data
Record info in database
Record data to the file system

Docker

Docker is used to containerize both applications. By default, the frontend uses network port 3000 and the backend uses network port 5000. You can configure these in the docker-compose.yml file.

NOTE: If you are running docker-registry, this runs by default on port 5000, so you will need to change which port you use.
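
To find out whether another service already occupies the default ports before starting the containers, a quick check like the following can be used. It is a sketch that only tests whether something accepts TCP connections on the local machine; the function name is hypothetical.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 3000 and 5000 are the default frontend and backend ports.
    for port in (3000, 5000):
        state = "in use" if port_in_use(port) else "free"
        print(f"port {port}: {state}")
```

If a port is reported as in use, change the corresponding port in docker-compose.yml before running docker-compose up.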

Recording Tips

Creating a voice requires an achievable but significant effort. An individual will need to record 15,000 to 20,000 phrases. To get the best possible Mimic voice, the recordings need to be clean and consistent. To that end, follow these recommendations:

Record in a quiet environment with noise-dampening material. If your ears can hear outside noise, so can the microphone. For best results, even the sound of air conditioning blowing through a vent should be avoided. Bare walls create subtle echoes and reverberation. A sound dampening booth is ideal, but you can also create a homemade recording studio using soft materials such as acoustic foam in a closet. Comforters and mattresses can also be used effectively!
Speak at a consistent volume and speed. Rushing through the phrases will only result in a lower quality voice.
Use a quality microphone. To obtain consistent results, we recommend a headset microphone so your mouth is always the same distance from the mic.
Avoid vocal fatigue. Record a maximum of 4 hours a day, taking a break every half hour.

Providing your recording to Mycroft for training

We welcome your voice donations to Mycroft for use in Text-to-Speech applications. If you would like to provide your voice recordings, you must license them to us under the Creative Commons CC0 Public Domain license so that we can utilise them in TTS voices - which are derivative works. If you're ready to donate your voice recordings, email us at [email protected].

Contributions

PRs are gladly accepted!

Where to get support and assistance

You can get help and support with Mimic Recording Studio at:

The Mycroft Forum
Mycroft Chat
