Replies: 1 comment
Is it splitting on pauses? Edit: I've just read the description in your repo. Don't bother with a reply.
I wanted to share a tool that you might find helpful (or not...).
Consider the following file (from LibriVox):
https://ia801401.us.archive.org/25/items/beckoningfairone_2211_librivox/beckoningfairone_08_onions_128kb.mp3
The file is 30 min 46 sec long.
To train a voice, the samples should be shorter than ~10 sec (cf. Notes on https://github.com/voicepaw/so-vits-svc-fork) and typically longer than one second.
SVC comes with a splitter: svc pre-split. So I put my mp3 file in dataset_raw_raw. Then, svc pre-split spits the output into dataset_raw. The smallest file is 1.2 sec long and contains 0.4 sec of non-silent audio, while the longest is 29 seconds, with quite a lot of files longer than 10 sec. Below, a histogram of the resulting lengths:
Certainly, by fiddling with the parameters you can probably achieve a better result, but the overall shape stays roughly the same (unless you set the threshold so high that you end up with a lot of very small files).
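If you want to check the length distribution of your own clips, something like the sketch below might do. It is only an illustration: it assumes the clips in dataset_raw are WAV files readable by Python's standard wave module, and the function names (clip_duration, duration_histogram, summarize) are made up for this example, not part of svc or split_audio.

```python
import wave
from collections import Counter
from pathlib import Path


def clip_duration(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def duration_histogram(durations, bin_width=1.0):
    """Count clips per bin; each key is the lower edge of a bin in seconds."""
    counts = Counter(int(d // bin_width) * bin_width for d in durations)
    return dict(sorted(counts.items()))


def summarize(folder, min_sec=1.0, max_sec=10.0):
    """Return (durations, number of clips below min_sec, number above max_sec).

    Assumption: the clips are WAV files somewhere under `folder`.
    """
    durations = [clip_duration(p) for p in sorted(Path(folder).rglob("*.wav"))]
    too_short = sum(d < min_sec for d in durations)
    too_long = sum(d > max_sec for d in durations)
    return durations, too_short, too_long
```

Calling summarize("dataset_raw") and passing the returned durations to duration_histogram gives the counts per one-second bin, which is enough to see how many clips fall outside the 1 to 10 sec range.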
So, I made my own audio splitter, split_audio.py, in which you specify the desired average length (default is 5 sec). The default (recommended) usage is:
python split_audio.py --desired_duration <desired average duration [sec]> <path/to/your/long/audio/file>
The resulting distribution looks like this:
Does it help training or not? I don't know. I haven't yet compared split_audio.py and svc pre-split side by side. You can download it from:
https://github.com/sbersier/split_audio
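To give a rough idea of what aiming for a target clip length can look like, here is a minimal sketch. It is not taken from split_audio.py and may differ from what the script actually does: it assumes you already have a list of non-silent (start, end) segments in seconds, and it greedily merges consecutive segments until a chunk reaches the desired duration. The name group_segments is hypothetical.

```python
def group_segments(segments, desired_duration=5.0):
    """Greedily merge consecutive (start, end) segments into chunks.

    A chunk is closed as soon as its span (from the start of its first
    segment to the end of its last one) reaches desired_duration.
    Leftover segments at the end form a final, possibly shorter, chunk.
    Assumption: segments are sorted and non-overlapping, in seconds.
    """
    chunks = []
    chunk_start = None
    for start, end in segments:
        if chunk_start is None:
            chunk_start = start  # open a new chunk at this segment
        if end - chunk_start >= desired_duration:
            chunks.append((chunk_start, end))
            chunk_start = None  # the next segment opens a new chunk
    if chunk_start is not None:
        # Keep whatever is left, even if shorter than desired_duration.
        chunks.append((chunk_start, segments[-1][1]))
    return chunks
```

Because a chunk is only closed once it reaches the target, the chunks come out at or somewhat above desired_duration, so this greedy version tends to overshoot the average. It is meant as an illustration of cutting only at pauses, nothing more.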
(PS: I hope split_audio is not too buggy... If you find bugs, don't hesitate to report them.)