File contains data in an unknown format. #214

peak1995 · 2019-11-15T07:20:50Z

Dear sir:
I'm reproducing your code and have run into a problem. With the same corpus and code as yours, running "python encoder_preprocess.py [dataroot]" produces the error in the figure above.

msebi · 2019-11-15T17:28:49Z

How did you manage to get pytorch to work? I keep running into:

https://pytorch.org/docs/stable/notes/windows.html#import-error

NikkLeiz · 2019-11-15T20:07:11Z

I have the same problem:

Interactive generation loop
Reference voice: enter an audio filepath of a voice to be cloned (mp3, wav, m4a, flac, ...):
F:\Users\SuperPC\Desktop\Real-Time-Voice-Cloning-master\test.mp3
Traceback (most recent call last):
  File "demo_cli.py", line 130, in <module>
    preprocessed_wav = encoder.preprocess_wav(in_fpath)
  File "F:\Users\SuperPC\Desktop\Real-Time-Voice-Cloning-master\encoder\audio.py", line 28, in preprocess_wav
    wav, source_sr = librosa.load(fpath_or_wav, sr=None)
  File "C:\Users\nikkl\Anaconda3\lib\site-packages\librosa\core\audio.py", line 149, in load
    six.reraise(*sys.exc_info())
  File "C:\Users\nikkl\Anaconda3\lib\site-packages\six.py", line 693, in reraise
    raise value
  File "C:\Users\nikkl\Anaconda3\lib\site-packages\librosa\core\audio.py", line 129, in load
    with sf.SoundFile(path) as sf_desc:
  File "C:\Users\nikkl\Anaconda3\lib\site-packages\soundfile.py", line 627, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "C:\Users\nikkl\Anaconda3\lib\site-packages\soundfile.py", line 1182, in _open
    "Error opening {0!r}: ".format(self.name))
  File "C:\Users\nikkl\Anaconda3\lib\site-packages\soundfile.py", line 1355, in _error_check
    raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
RuntimeError: Error opening 'F:\Users\SuperPC\Desktop\Real-Time-Voice-Cloning-master\test.mp3': File contains data in an unknown format.

NiloyPurkait · 2019-11-16T15:02:33Z

> How did you manage to get pytorch to work? I keep running into:
>
> https://pytorch.org/docs/stable/notes/windows.html#import-error

I managed to solve this by installing CUDA 10.1 (I previously had CUDA 10.0 and VS2017). Hope this helps.

NiloyPurkait · 2019-11-16T15:20:23Z

demo_cli.py only uses the preprocess_wav function, which is imported from the encoder.inference script and defined in the encoder.audio script. It seems preprocessing functions for other file formats are not included yet.

TL;DR: convert your files to .wav format to resolve this.

golecha · 2019-11-18T11:09:31Z

I'm facing the same issue while running "python encoder_preprocess.py [dataroot]", because the data in VoxCeleb2 is stored as .m4a files.
I also tried running "sndfile-play video.m4a", which gives the same error: File contains data in an unknown format.

Titaniumtown · 2019-11-19T15:06:03Z

I know the solution is to convert the files to a .wav file, but it takes up way more space; is there a way to remove that requirement of a .wav file?

Port0r · 2019-11-20T22:51:59Z

Even in audio.py it says:

param fpath_or_wav: either a filepath to an audio file (many extensions are supported, not just .wav), either the waveform as a numpy array of floats.

And yet...
Well, with ffmpeg and pydub one might alter the script, so it converts any mp3 to wav on the fly. It's a dirty work-around and not time efficient, though...
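
The on-the-fly conversion idea above can be sketched roughly as follows. This is a minimal sketch, not code from the repository: it assumes ffmpeg is installed and on PATH, it calls ffmpeg directly through subprocess instead of going through pydub, and the helper names build_ffmpeg_command and convert_to_wav are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst):
    """Return the ffmpeg argument list that converts src into the file dst."""
    # -y overwrites dst if it already exists; the .wav extension of dst
    # tells ffmpeg which output format to write.
    return ["ffmpeg", "-y", "-i", str(src), str(dst)]


def convert_to_wav(src, dst_dir=None):
    """Convert an audio file (mp3, m4a, ...) to WAV and return the new path.

    Hypothetical helper: the WAV is written next to the source file unless
    dst_dir is given.
    """
    src = Path(src)
    dst = (Path(dst_dir) if dst_dir else src.parent) / (src.stem + ".wav")
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
    return dst
```

The returned path could then be passed to the script in place of the original mp3. As noted above, this costs extra time and disk space for every converted file.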

Artem-B · 2019-11-23T21:03:19Z

The problem is that this fallback path in librosa does not work due to the fact that path is pathlib.WindowsPath: https://github.com/librosa/librosa/blob/master/librosa/core/audio.py#L145

Converting path to string allows file opening with audioread and it will use ffmpeg to load the audio.
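
A caller-side version of this idea, which avoids editing librosa itself, might look like the sketch below. The helper name as_loadable is hypothetical; os.fspath is the standard-library call (Python 3.6+) that turns a pathlib object into a plain string.

```python
import os
from pathlib import Path


def as_loadable(fpath_or_wav):
    """Return a plain str for path-like inputs; pass anything else through.

    Hypothetical helper: waveforms (e.g. arrays of floats) are returned
    unchanged, so only file paths are converted.
    """
    if isinstance(fpath_or_wav, (str, os.PathLike)):
        return os.fspath(fpath_or_wav)
    return fpath_or_wav


# Assumed usage, mirroring the call shown in the traceback earlier in the thread:
#   wav, source_sr = librosa.load(as_loadable(fpath_or_wav), sr=None)
print(type(as_loadable(Path("test.mp3"))).__name__)
```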

NealWalters · 2019-11-24T23:23:55Z

Artem-B - Thanks for your post. I fixed line 145 as shown below, now on to the next error.
Fix:

        if isinstance(str(path), six.string_types):
            warnings.warn('PySoundFile failed. Trying audioread instead.')

And now it's giving the warning on the next line:
warnings.warn('PySoundFile failed. Trying audioread instead.')
Caught exception: TypeError("argument of type 'WindowsPath' is not iterable")

So I changed next line as well:
y, sr_native = __audioread_load(str(path), offset, duration, dtype)

NOTE: I made the mistake of selecting a 30 minute mp3 file, so it takes a while for it to convert to mono, then do the resampling. But finally got it working after 6 or more hours of tinkering with the installs.

So adding some prints to audio.py can at least show what's happening:

    if mono:
        print ("to_mono")
        y = to_mono(y)

    if sr is not None:
        print ("starting resample function")
        y = resample(y, sr_native, sr, res_type=res_type)
        print ("resampling done")

mhilmiasyrofi · 2019-12-17T03:27:52Z

Your problem is related to a librosa bug in the latest version. You can refer to this link

I solved it by converting the path to a string:
Python 3.6+: path = os.fspath(path)
Python 3.4+: path = str(path)

saraBadawi · 2019-12-21T18:31:18Z

Update:
I should have put the audio files in the same folder as the script. After doing that, I got this:
Synthesizing the waveform:
{| ████████████████ 104500/105600 | Batch Size: 11 | Gen Rate: 12.1kHz | }
Caught exception: PortAudioError('Error querying device -1',)
Has anyone faced this before?

Original post:
I ran the script, and when I tried to load an .mp3 file I got:
Caught exception: RuntimeError("Error opening '.': File contains data in an unknown format.",)
When I tried to load a .wav file I got:
Caught exception: RuntimeError("Error opening '/home/ubuntu/hello-this-my1576942509.wav': System error.",)
I put the voice files in /home/ubuntu. Is there something else I need to do? I am using only the terminal.

SquallAlex · 2019-12-22T14:19:28Z

I used the GUI on Windows and it worked with wav files. mp3 is supposed to be supported, but I got the same error with it. I'm not sure what to do. Maybe, once you have a working install, create a folder inside it (any name), place some wav, mp3, and m4a files there, and try the GUI?

saraBadawi · 2019-12-22T16:25:31Z

I solved it by adding the --no_sound argument in my terminal: sudo python3 demo_cli.py --no_sound
That gives me demo_output_00.wav.

ghost · 2020-07-10T15:56:21Z

Several problems are mentioned in this issue:

RuntimeError: Error opening 'filename.mp3': File contains data in an unknown format. (resolved by README.md update #414 to instruct users to install ffmpeg to avoid underlying Audioread: NoBackendError)
Need to convert Path objects to str (resolved by Convert pathlib objects to str when used as argument to librosa.load() #371)
Caught exception: PortAudioError('Error querying device -1',) when audio playback is attempted in demo_cli.py (resolved by Catch PortAudioError exception so it is not fatal #417)
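
The idea behind the third item, treating a playback failure as non-fatal, can be illustrated with the sketch below. It is not the code from #417: the helper try_playback is hypothetical, and the playback function and error types are passed in so the sketch runs without audio hardware. With the sounddevice package, the assumed arguments would be sd.play and sd.PortAudioError.

```python
def try_playback(play_fn, wav, sample_rate, playback_errors):
    """Call play_fn(wav, sample_rate); report playback errors instead of raising.

    Returns True if playback started, False if it was skipped.
    """
    try:
        play_fn(wav, sample_rate)
        return True
    except playback_errors as exc:
        # Generation has already finished at this point, so only playback is lost.
        print(f"Caught exception: {exc!r}")
        print("Continuing without audio playback.")
        return False


# Assumed usage with sounddevice (left as a comment so no audio device is needed):
#   import sounddevice as sd
#   try_playback(sd.play, generated_wav, sample_rate, (sd.PortAudioError,))
```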

They have all been resolved in the latest code, so we can finally close this out. Thanks to all here who reported an issue or shared workarounds.

skr3178 · 2022-11-23T22:36:30Z

sudo apt-get install libportaudio2
helped solve my issue

This was referenced Jul 10, 2020

Audioread "no backend" issue #410

Closed

Catch PortAudioError exception so it is not fatal #417

Merged

ghost closed this as completed Jul 10, 2020

pilnyjakub mentioned this issue Sep 23, 2021

Training Tutorial #819

Closed

This issue was closed.
