Audio backend refactoring and a workaround for FLAC reading from/writing to in-memory buffers #814

pzelasko · 2022-09-20T03:16:11Z

This PR grew a little out of scope. I didn't originally intend the refactoring, but the audio errors had become too confusing even for me. I think I will follow up with a refactoring similar to the one @desh2608 did in #820 to improve code readability. I might also further refactor the "audio backend" logic in Lhotse: it has grown quite complicated because it tries to support multiple torchaudio versions, multiple torchaudio backends, libsndfile, audioread, custom hacks for formats like OPUS and SPHERE, and now also custom hacks for in-memory buffers. I'll need to think about whether the solution can be made a bit more elegant.

BTW, I tried the workaround suggested in pytorch/audio#2662, which uses the new torchaudio ffmpeg streamer, but some (not all) in-memory buffers failed with an error saying that the seek operation is not permitted. I have not yet managed to create a reproducible example. Maybe I'll revisit this in the future.
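To make the goal of clearer audio loading errors concrete, here is a minimal sketch of a backend chain that tries each reader in turn and reports every failure at once. It is not the code from this PR: `read_audio`, `AudioLoadingError`, `read_with_wave`, `read_unsupported`, and `make_wav_bytes` are hypothetical names, and Python's built-in `wave` module only stands in for the real backends (torchaudio, libsndfile, audioread).

```python
import io
import wave
from typing import BinaryIO, Callable, List, Tuple

# All names below are hypothetical and for illustration only; none are Lhotse APIs.
Reader = Callable[[BinaryIO], Tuple[bytes, int]]


class AudioLoadingError(Exception):
    """Raised when every backend failed; the message lists each failure."""


def read_with_wave(stream: BinaryIO) -> Tuple[bytes, int]:
    """Stand-in backend: the stdlib WAV reader. Returns (raw PCM bytes, sampling rate)."""
    with wave.open(stream, "rb") as f:
        return f.readframes(f.getnframes()), f.getframerate()


def read_unsupported(stream: BinaryIO) -> Tuple[bytes, int]:
    """Stand-in for a backend that cannot handle the given input."""
    raise RuntimeError("format not recognized")


def read_audio(stream: BinaryIO, backends: List[Tuple[str, Reader]]) -> Tuple[bytes, int]:
    """Try each (name, reader) pair in order; collect errors instead of hiding them."""
    errors = []
    for name, reader in backends:
        try:
            stream.seek(0)  # a failed attempt may have moved the read position
            return reader(stream)
        except Exception as e:
            errors.append(f"{name}: {type(e).__name__}: {e}")
    raise AudioLoadingError("All audio backends failed:\n" + "\n".join(errors))


def make_wav_bytes(num_frames: int = 160, sampling_rate: int = 16000) -> bytes:
    """Build a short silent mono 16-bit WAV entirely in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(sampling_rate)
        f.writeframes(b"\x00\x00" * num_frames)
    return buf.getvalue()
```

Calling `read_audio(io.BytesIO(make_wav_bytes()), [("unsupported", read_unsupported), ("wave", read_with_wave)])` falls through the first backend and succeeds with the second. When every backend fails, the raised `AudioLoadingError` names each backend together with its own error, which is the kind of message that is easier to debug than a single opaque exception.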

desh2608 · 2022-09-25T15:11:02Z

Nice! I think most users (me, for one) care more about what functionality a Recording or RecordingSet provides than about the nitty-gritty of how the audio is loaded in the backend. It's probably best to separate these two concepts.

…ing to in-memory buffers (lhotse-speech#814)

* Reading audio with torchaudio ffmpeg streamer
* Workaround for broken FLAC BytesIO saving
* Resolve CI errors by checking min torchaudio version >= 0.9 before using soundfile for saving to BytesIO
* Refactor audio loading logic for better extensibility and error display
* Fix CI
* Fix CI
* Prefer libsndfile for in-memory buffer data
* Add a minimum amount of documentation

@pzelasko

…822)

* initial commit for multi-channel supervisions
* added base.py
* add mono.py
* add mixed.py
* add padding.py
* add set.py
* add init file
* fix isort
* initial commit for MultiCut
* add type hints for is_equal_or_contains
* fix flake8 issues
* fix flake8 in mono.py
* more changes for MultiCut
* added base.py
* add mono.py
* add mixed.py
* add padding.py
* add set.py
* add init file
* fix isort
* fix flake8 issues
* fix flake8 in mono.py
* Audio backend refactoring and a workaround for FLAC reading from/writing to in-memory buffers (#814)
* Reading audio with torchaudio ffmpeg streamer
* Workaround for broken FLAC BytesIO saving
* Resolve CI errors by checking min torchaudio version >= 0.9 before using soundfile for saving to BytesIO
* Refactor audio loading logic for better extensibility and error display
* Fix CI
* Fix CI
* Prefer libsndfile for in-memory buffer data
* Add a minimum amount of documentation
* initial commit for multi-channel supervisions
* initial commit for MultiCut
* add type hints for is_equal_or_contains
* more changes for MultiCut
* more changes for MultiCut
* more changes for MultiCut
* add type attribute to MixTrack to make from_dict work
* fix rir test case
* all old tests passing
* fix isort
* revert asdict_nonull
* more changes for MultiCut
* remove voxceleb changes
* fixed save_audio
* add tests for multi cut augmentation
* add tests for drop attributes
* more tests for multi cut
* more tests for multi cut
* fix isort
* fix test cases; all passing
* merge_supervisions implemented for each cut type
* add tests for mixing with multi cuts
* update feature mixing
* fix failing test
* incorporate suggestions from @pzelasko
* fix mixing test
* add serialization test for MultiCut
* add multi cut fixture
* added tests for audio mixer
* remove redundant cases in audio mixer
* test for mixing mixed cut with multi cut

Co-authored-by: Piotr Żelasko <[email protected]>

pzelasko added 8 commits September 19, 2022 23:15

Reading audio with torchaudio ffmpeg streamer

f709c78

Workaround for broken FLAC BytesIO saving

bfd728b

Resolve CI errors by checking min torchaudio version >= 0.9 before using soundfile for saving to BytesIO

3c8ec55

Refactor audio loading logic for better extensibility and error display

8a7c994

Fix CI

7204a24

Fix CI

5b6b1a9

Prefer libsndfile for in-memory buffer data

4f4c22e

Merge branch 'master' into feature/torchaudio-ffmpeg-streamer

b4cbac5

pzelasko marked this pull request as ready for review September 25, 2022 03:49

pzelasko added 2 commits September 25, 2022 10:37

Merge remote-tracking branch 'origin/master' into feature/torchaudio-ffmpeg-streamer

f61ed4d

Add a minimum amount of documentation

aa1cd39
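One of the commits above gates the soundfile-based BytesIO saving on torchaudio >= 0.9. A version gate of that kind could look roughly like the sketch below. This is an assumption about the general approach, not the code from the commit: `parse_version` and `is_at_least` are hypothetical helpers, and the parsing deliberately ignores local and pre-release suffixes such as `+cu113`.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Keep only the leading numeric release part, e.g. '0.12.1+cu113' gives (0, 12, 1)."""
    release = re.match(r"\d+(?:\.\d+)*", version)
    if release is None:
        raise ValueError(f"Cannot parse version string: {version!r}")
    return tuple(int(part) for part in release.group(0).split("."))


def is_at_least(installed: str, minimum: str) -> bool:
    """Compare versions numerically, padding the shorter one with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b
```

In real code the first argument would typically be `torchaudio.__version__`. Comparing integer tuples rather than strings matters here because a plain string comparison would rank "0.10" below "0.9".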

pzelasko changed the title ~~Reading audio with torchaudio ffmpeg streamer~~ Audio backend refactoring and a workaround for FLAC reading from/writing to in-memory buffers Sep 25, 2022

pzelasko merged commit e08965e into master Sep 25, 2022

pzelasko added this to the v1.8 milestone Sep 25, 2022
