New ffmpeg backend changes samples when saving WAVE #3281
Comments
I think what is happening is that for WAV, ffmpeg defaults to int16, so the test is picking up a discrepancy, but the discrepancy is at most on the order of the int16 quantization step. It is possible to make the behavior match the previous backends, but I think there was user feedback that int16 is better, since that is what the vast majority of audio systems expect, and many users do not understand other precisions. One reason the existing backends picked the matching precision was to preserve the data as precisely as the model returned it, for the sake of scientific computation. What do you think? @pzelasko @hwangjeff
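As a rough illustration of how small that discrepancy is, here is a pure-Python sketch of an int16 round trip (an illustration only, assuming round-to-nearest quantization; the exact rounding mode ffmpeg uses may differ):

```python
# Worst-case error introduced when a float sample in [-1.0, 1.0] is
# quantized to 16-bit PCM and converted back (illustration only;
# torchaudio/ffmpeg's exact rounding may differ).

def quantize_int16(x: float) -> int:
    """Map a float sample in [-1.0, 1.0] to the nearest int16 value."""
    return max(-32768, min(32767, round(x * 32768)))

def dequantize_int16(q: int) -> float:
    """Map an int16 value back to a float sample."""
    return q / 32768

# Round-tripping through int16 moves each sample by at most half a
# quantization step, i.e. 0.5 / 32768 ~= 1.5e-5.
samples = [0.123456789, -0.987654321, 0.5, 1 / 3]
max_err = max(abs(s - dequantize_int16(quantize_int16(s))) for s in samples)
print(max_err)
```

So a save→load comparison with an absolute tolerance looser than about 1.5e-5 would not have caught this.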
Good insight! I was able to validate that you're right by replacing the noise generation like this:

```python
import torch

INT16MAX = 32768
noise = torch.randint(-INT16MAX, INT16MAX - 1, (1, 32000))
noise = noise / INT16MAX
```

I think it makes sense; int16 is the most common format, and people rarely need the actual float32 precision when saving files. I only found out because some of Lhotse's unit tests for correct save->load behavior failed when moving to ffmpeg, but they used artificial data anyway. In that case you might want to update the documentation here:

audio/torchaudio/_backend/utils.py Lines 533 to 550 in 151ac4d
🐛 Describe the bug
A snippet to reproduce the error is provided below. Adding `backend="sox"` or `backend="soundfile"` to `torchaudio.save` removes the issue.
Output:
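The reproduction snippet and its output are not preserved above. As a stand-in, here is a hypothetical sketch using only the stdlib `wave` module that shows the effect being reported: float samples written as 16-bit PCM do not come back bit-exact, whereas the sox/soundfile backends preserved float32 (the file name and sample values are made up for illustration):

```python
# Hypothetical stand-in for the missing repro: round-trip float samples
# through a 16-bit PCM WAV file and compare, using only the stdlib.
import os
import struct
import tempfile
import wave

samples = [0.1, -0.2, 0.333333, 0.7]  # made-up values, none exactly int16-representable
path = os.path.join(tempfile.mkdtemp(), "noise.wav")  # hypothetical scratch path

# Save: quantize floats to int16 PCM, as the ffmpeg backend does for WAV by default.
with wave.open(path, "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)      # 16-bit
    f.setframerate(16000)
    ints = [round(s * 32768) for s in samples]
    f.writeframes(struct.pack(f"<{len(ints)}h", *ints))

# Load: convert back to float and compare with the original samples.
with wave.open(path, "rb") as f:
    raw = f.readframes(f.getnframes())
loaded = [v / 32768 for v in struct.unpack(f"<{len(raw) // 2}h", raw)]

print([abs(a - b) for a, b in zip(samples, loaded)])  # small but nonzero differences
```

With the sox or soundfile backends, a float32 tensor was written as 32-bit float WAV, so the same comparison came back exact.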
Versions
Collecting environment information...
PyTorch version: 2.0.0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: macOS 13.3.1 (arm64)
GCC version: Could not collect
Clang version: 14.0.3 (clang-1403.0.22.14.1)
CMake version: version 3.25.0
Libc version: N/A
Python version: 3.10.4 (main, Mar 31 2022, 03:37:37) [Clang 12.0.0 ] (64-bit runtime)
Python platform: macOS-13.3.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Apple M1 Max
Versions of relevant libraries:
[pip3] flake8==5.0.4
[pip3] k2==1.23.4.dev20230412+cpu.torch2.0.0
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.23.5
[pip3] torch==2.0.0
[pip3] torchaudio==2.0.0
[pip3] torchvision==0.15.0
[conda] k2 1.23.4.dev20230412+cpu.torch2.0.0 pypi_0 pypi
[conda] numpy 1.23.5 py310hb93e574_0
[conda] numpy-base 1.23.5 py310haf87e8b_0
[conda] pytorch 2.0.0 py3.10_0 pytorch
[conda] torch 1.12.1 pypi_0 pypi
[conda] torchaudio 2.0.0 py310_cpu pytorch
[conda] torchvision 0.15.0 py310_cpu pytorch