Preprocessing VoxCeleb2 is not working #488

amintavakol · 2020-08-12T21:31:06Z

While running encoder_preprocess on the VoxCeleb2 dataset, I get the following warning and nothing else happens. Not sure why.

raw: Preprocessing data for 5994 speakers.
raw:   0%|                                                                                           | 0/5994 [00:00<?, ?speakers/s]
/home/amin/.local/lib/python3.6/site-packages/librosa/core/audio.py:161: UserWarning: PySoundFile failed. Trying audioread instead.
  warnings.warn('PySoundFile failed. Trying audioread instead.')
/home/amin/.local/lib/python3.6/site-packages/librosa/core/audio.py:161: UserWarning: PySoundFile failed. Trying audioread instead.
  warnings.warn('PySoundFile failed. Trying audioread instead.')


ghost · 2020-08-12T21:44:29Z

Could it be related to this? #76 (comment)

Also you might want to follow #458 since we're also running into issues with preprocessing and training. If you modify code to fix a problem would you please contribute it back as a pull request?

mbdash · 2020-08-12T21:54:02Z

I have messed around with LibriSpeech, VoxCeleb 1 & 2, CommonVoice and VCTK in #458.

I am a bit tired, so I am not sure whether VoxCeleb came as wav; make sure everything is converted to wav.
Also, once I succeed in training a new encoder, I will share everything I can.

amintavakol · 2020-08-14T07:59:16Z

For anyone trying to preprocess VoxCeleb2:
Once you download the dataset, the audio files are in ".m4a" format. You need to convert them to ".wav".
Save the attached snippet as convert.sh (it must have the .sh extension) in the root directory of the data (e.g. <path-to-VoxCeleb2>/raw/dev/aac):
convert.txt

then run
./convert.sh

Also, make sure you have ffmpeg installed on your machine.
You also need to modify the function preprocess_voxceleb2 in encoder/preprocess.py and change the audio file extension it looks for to ".wav".
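
The attached convert.txt is not reproduced in this thread. For readers who would rather do the conversion from Python, here is a minimal sketch of the same idea, not the original script. It assumes ffmpeg is on your PATH; the function names (wav_path_for, build_ffmpeg_command, convert_tree) are made up for illustration, and the mono 16 kHz output is an assumption you may want to change.

```python
import subprocess
from pathlib import Path


def wav_path_for(m4a_path: Path) -> Path:
    """Return the .wav path that sits next to the given .m4a file."""
    return m4a_path.with_suffix(".wav")


def build_ffmpeg_command(src: Path, dst: Path, sample_rate: int = 16000) -> list:
    """Build the ffmpeg argument list for one file (sample_rate is an assumption)."""
    # -y overwrites an existing output file, -ac 1 downmixes to mono,
    # -ar sets the output sample rate, -loglevel error keeps the console quiet.
    return [
        "ffmpeg", "-y", "-loglevel", "error",
        "-i", str(src),
        "-ac", "1", "-ar", str(sample_rate),
        str(dst),
    ]


def convert_tree(root: Path, skip_existing: bool = True) -> int:
    """Convert every .m4a under root to .wav and return how many were converted."""
    converted = 0
    for src in sorted(root.rglob("*.m4a")):
        dst = wav_path_for(src)
        if skip_existing and dst.exists():
            continue  # already converted on an earlier run
        # check=True raises CalledProcessError if ffmpeg fails on a file.
        subprocess.run(build_ffmpeg_command(src, dst), check=True)
        converted += 1
    return converted
```

Calling convert_tree(Path("<path-to-VoxCeleb2>/raw/dev/aac")) writes each .wav next to its .m4a and leaves the originals in place, so you can delete the .m4a files afterwards if disk space is tight.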

I'm training a new encoder on more datasets than just Libri and Vox1/2, and will update everyone in a few days.

ghost · 2020-08-19T08:30:04Z

Presumed resolved based on #497

ghost · 2020-08-19T08:31:24Z

Please share updates about your encoder model in #458; I would be interested to see how it is working. Did you modify any hparams?

ghost closed this as completed Aug 19, 2020
