Training

Prepare Dataset

Example data processing scripts are provided for Emilia and Wenetspeech4TTS. You can also write your own, together with a matching Dataset class in src/f5_tts/model/dataset.py.

1. Datasets used for pretrained models

Download the corresponding dataset first, then fill in its path in the script.

# Prepare the Emilia dataset
python src/f5_tts/train/datasets/prepare_emilia.py

# Prepare the Wenetspeech4TTS dataset
python src/f5_tts/train/datasets/prepare_wenetspeech4tts.py

2. Create custom dataset with metadata.csv

For usage guidance, see #57.

python src/f5_tts/train/datasets/prepare_csv_wavs.py

Training & Finetuning

Once your datasets are prepared, you can start the training process.

1. Training script used for pretrained model

# set up the accelerate config, e.g. multi-GPU DDP, fp16
# the config will be saved to: ~/.cache/huggingface/accelerate/default_config.yaml
accelerate config
accelerate launch src/f5_tts/train/train.py
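
Before launching, it can be handy to confirm that a config file has actually been written. The following is a minimal sketch, not part of the repository: it assumes the default path quoted in the comment above is in use (a custom Hugging Face cache location would move it), and the helper name `accelerate_config_exists` is hypothetical.

```python
from pathlib import Path

# Default location quoted in the comment above. Assumption: the default
# Hugging Face cache directory is in use on this machine.
DEFAULT_ACCELERATE_CONFIG = (
    Path.home() / ".cache" / "huggingface" / "accelerate" / "default_config.yaml"
)


def accelerate_config_exists(path: Path = DEFAULT_ACCELERATE_CONFIG) -> bool:
    """Return True if an accelerate config file is present at `path`."""
    return path.is_file()


if __name__ == "__main__":
    if accelerate_config_exists():
        print(f"Found accelerate config: {DEFAULT_ACCELERATE_CONFIG}")
    else:
        print("No accelerate config found; run `accelerate config` first.")
```

If the check fails, running `accelerate config` once should create the file.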

2. Finetuning practice

Discussion board for Finetuning #57.

For Gradio UI training/finetuning with src/f5_tts/train/finetune_gradio.py, see #143.

3. Wandb Logging

The wandb/ directory will be created under the path from which you run the training/finetuning scripts.

By default, the training script does NOT log to wandb (assuming you have not already logged in manually with wandb login).

To turn on wandb logging, you can either:

Log in manually with wandb login: Learn more here
Log in automatically by setting an environment variable: get an API key at https://wandb.ai/site/ and set the environment variable as follows:

On Mac & Linux:

export WANDB_API_KEY=<YOUR WANDB API KEY>

On Windows:

set WANDB_API_KEY=<YOUR WANDB API KEY>

Moreover, if you cannot access Wandb and want to log metrics offline, you can set the environment variable as follows:

export WANDB_MODE=offline
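
The same variables can also be set per launch from Python, which avoids the shell differences between Mac/Linux and Windows. This is a minimal sketch under stated assumptions: the helper names `build_wandb_env` and `launch_training` are hypothetical and not part of this repository; only the `WANDB_API_KEY` and `WANDB_MODE` variables and the `accelerate launch` command come from the text above.

```python
import os
import subprocess
from typing import Dict, Optional


def build_wandb_env(
    api_key: Optional[str] = None,
    offline: bool = False,
    base_env: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with the wandb variables applied."""
    env = dict(os.environ if base_env is None else base_env)
    if api_key:
        env["WANDB_API_KEY"] = api_key  # enables automatic login
    if offline:
        env["WANDB_MODE"] = "offline"  # log metrics locally only
    return env


def launch_training(env: Dict[str, str]) -> None:
    """Run the training script from the section above with the given environment."""
    subprocess.run(
        ["accelerate", "launch", "src/f5_tts/train/train.py"],
        env=env,
        check=True,
    )
```

For example, `launch_training(build_wandb_env(offline=True))` would start training with offline logging without changing the parent shell's environment.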
