Train voice having 44Khz sampling rate #604

donlk · 2024-09-15T13:10:24Z

Hi!
I have approximately 1.5 hours of voice audio at 44.1 kHz and would like to train a usable model from it. I don't want to fine-tune an existing checkpoint, as the pre-trained checkpoints are all 22 kHz and sound muddy.
I tried training from scratch with sampling_rate set to 44100. Training reached 2000 epochs, but the inferred audio was far too fast and skipped words.

What should I modify or patch to make this work?

Thanks!

agonzalezd · 2024-09-17T13:19:00Z

I suggest resampling your data to 22050 Hz. You can use ffmpeg to do so.
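
A minimal sketch of that resampling step, assuming ffmpeg is installed and on your PATH. The function names and the directory layout are hypothetical; only the `-i`, `-ar`, and `-y` ffmpeg options are relied on.

```python
import subprocess
from pathlib import Path


def build_resample_cmd(src, dst, rate=22050):
    """Build the ffmpeg command that resamples one file to `rate` Hz."""
    # -y overwrites an existing output file, -ar sets the output sample rate.
    return ["ffmpeg", "-y", "-i", str(src), "-ar", str(rate), str(dst)]


def resample_dir(in_dir, out_dir, rate=22050):
    """Resample every .wav in in_dir into out_dir (hypothetical layout)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for wav in sorted(Path(in_dir).glob("*.wav")):
        # check=True raises CalledProcessError if ffmpeg fails on a file.
        subprocess.run(build_resample_cmd(wav, out / wav.name, rate), check=True)
```

Calling `resample_dir("wavs_44k", "wavs_22k")` would write 22050 Hz copies and leave the originals untouched, so the 44.1 kHz recordings stay available for later experiments.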

donlk · 2024-09-18T23:15:21Z

I would rather avoid that if possible, because of the significant quality loss.
