Commit

Train 'AISHELL-3' dataset with multi-speakers
Signed-off-by: Robin Dong <[email protected]>
RobinDong committed Sep 21, 2023
1 parent 345312d commit 8da3699
Showing 3 changed files with 6 additions and 3 deletions.
6 changes: 4 additions & 2 deletions examples/tts/conf/zh/fastpitch_align_22050.yaml
@@ -7,7 +7,7 @@ name: FastPitch
train_dataset: ???
validation_datasets: ???
sup_data_path: ???
- sup_data_types: [ "align_prior_matrix", "pitch" ]
+ sup_data_types: [ "align_prior_matrix", "pitch", "speaker_id"]

# Default values from librosa.pyin
pitch_fmin: 65.40639132514966
@@ -40,10 +40,12 @@ model:
learn_alignment: true
bin_loss_warmup_epochs: 100

- n_speakers: 1
+ n_speakers: 1958
max_token_duration: 75
symbols_embedding_dim: 384
pitch_embedding_kernel_size: 3
+ speaker_emb_condition_prosody: true
+ speaker_emb_condition_aligner: true

pitch_fmin: ${pitch_fmin}
pitch_fmax: ${pitch_fmax}
@@ -2,7 +2,7 @@ name: "ds_for_fastpitch_align"

manifest_filepath: "train_manifest.json"
sup_data_path: "sup_data"
- sup_data_types: [ "align_prior_matrix", "pitch" ]
+ sup_data_types: [ "align_prior_matrix", "pitch", "speaker_id"]
phoneme_dict_path: "scripts/tts_dataset_files/zh/24finals/pinyin_dict_nv_22.10.txt"

dataset:
1 change: 1 addition & 0 deletions scripts/dataset_processing/tts/aishell3/get_data.py
@@ -106,6 +106,7 @@ def __process_transcript(file_path: str):
'duration': float(duration),
'text': text,
'normalized_text': normalized_text,
+ 'speaker': int(speaker[3:]),
}

i += 1
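
The added 'speaker' field and the new n_speakers value are related, and the relationship can be sketched as below. This is only an illustration under stated assumptions: it assumes speaker names carry a three-character prefix followed by digits (for example "SSB0005"), and that the model looks speakers up by their raw integer ID, so every ID has to be smaller than n_speakers. The helper names, the sample paths, and the sample IDs are hypothetical and do not come from the commit.

```python
import json


def speaker_to_id(speaker: str) -> int:
    # Same expression as the line added in get_data.py: skip the first three
    # characters (assumed to be a prefix such as "SSB") and parse the digits.
    return int(speaker[3:])


def max_speaker_id(manifest_lines):
    # Each non-empty manifest line is assumed to be one JSON object
    # that contains an integer 'speaker' field.
    return max(json.loads(line)["speaker"] for line in manifest_lines if line.strip())


# Hypothetical manifest entries; paths and speaker names are made up.
sample_manifest = [
    json.dumps({"audio_filepath": "SSB0005/SSB00050001.wav",
                "speaker": speaker_to_id("SSB0005")}),
    json.dumps({"audio_filepath": "SSB1956/SSB19560001.wav",
                "speaker": speaker_to_id("SSB1956")}),
]

n_speakers = 1958  # value set in fastpitch_align_22050.yaml by this commit

# Raw IDs are sparse, so the check is against the largest ID, not the speaker count.
assert max_speaker_id(sample_manifest) < n_speakers
```

Because the IDs are taken directly from the speaker names rather than remapped to a dense range, n_speakers would need to exceed the largest ID in the manifest rather than equal the number of distinct speakers.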