Audio backend refactoring and a workaround for FLAC reading from/writing to in-memory buffers #814
Merged
…ing soundfile for saving to BytesIO
pzelasko
changed the title
Reading audio with torchaudio ffmpeg streamer
Audio backend refactoring and a workaround for FLAC reading from/writing to in-memory buffers
Sep 25, 2022
Nice! I think most users (me, for one) would care more about what functionalities a |
desh2608
pushed a commit
to desh2608/lhotse
that referenced
this pull request
Sep 25, 2022
…ing to in-memory buffers (lhotse-speech#814)
* Reading audio with torchaudio ffmpeg streamer
* Workaround for broken FLAC BytesIO saving
* Resolve CI errors by checking min torchaudio version >= 0.9 before using soundfile for saving to BytesIO
* Refactor audio loading logic for better extensibility and error display
* Fix CI
* Fix CI
* Prefer libsndfile for in-memory buffer data
* Add a minimum amount of documentation
pzelasko
added a commit
that referenced
this pull request
Oct 5, 2022
…822)
* initial commit for multi-channel supervisions
* added base.py
* add mono.py
* add mixed.py
* add padding.py
* add set.py
* add init file
* fix isort
* initial commit for MultiCut
* add type hints for is_equal_or_contains
* fix flake8 issues
* fix flake8 in mono.py
* more changes for MultiCut
* added base.py
* add mono.py
* add mixed.py
* add padding.py
* add set.py
* add init file
* fix isort
* fix flake8 issues
* fix flake8 in mono.py
* Audio backend refactoring and a workaround for FLAC reading from/writing to in-memory buffers (#814)
* Reading audio with torchaudio ffmpeg streamer
* Workaround for broken FLAC BytesIO saving
* Resolve CI errors by checking min torchaudio version >= 0.9 before using soundfile for saving to BytesIO
* Refactor audio loading logic for better extensibility and error display
* Fix CI
* Fix CI
* Prefer libsndfile for in-memory buffer data
* Add a minimum amount of documentation
* initial commit for multi-channel supervisions
* initial commit for MultiCut
* add type hints for is_equal_or_contains
* more changes for MultiCut
* more changes for MultiCut
* more changes for MultiCut
* add type attribute to MixTrack to make from_dict work
* fix rir test case
* all old tests passing
* fix isort
* revert asdict_nonull
* more changes for MultiCut
* remove voxceleb changes
* fixed save_audio
* add tests for multi cut augmentation
* add tests for drop attributes
* more tests for multi cut
* more tests for multi cut
* fix isort
* fix test cases; all passing
* merge_supervisions implemented for each cut type
* add tests for mixing with multi cuts
* update feature mixing
* fix failing test
* incorporate suggestions from @pzelasko
* fix mixing test
* add serialization test for MultiCut
* add multi cut fixture
* added tests for audio mixer
* remove redundant cases in audio mixer
* test for mixing mixed cut with multi cut
Co-authored-by: Piotr Żelasko <[email protected]>
This PR went a little out of scope. I didn't originally intend the refactoring, but the audio errors had become too confusing even for me. I think I will follow up with a refactoring similar to the one @desh2608 did in #820 to improve code readability. I might also further refactor the "audio backend" logic in Lhotse: it has grown quite complicated, since it tries to support multiple torchaudio versions, multiple torchaudio backends, libsndfile, audioread, custom hacks for formats like OPUS and SPHERE, and now also custom hacks for in-memory buffers. I'll need to think about whether the solution can be made a bit more elegant.
BTW, I tried the workaround suggested in pytorch/audio#2662, which uses the new torchaudio ffmpeg streamer, but some (not all) in-memory buffers failed with an error saying that the seek operation is not permitted. I haven't managed to create a reproducible example yet. Maybe I'll revisit this in the future.
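For readers who want a picture of what "better extensibility and error display" could mean for a multi-backend loader, here is a minimal sketch. It is not Lhotse's actual code: the names `read_audio`, `wave_backend`, `broken_backend`, `make_wav_bytes`, and `AudioLoadingError` are hypothetical, and the standard-library `wave` module stands in for real decoders such as torchaudio or libsndfile. The idea is to try each backend in order on the same in-memory buffer and, if all of them fail, report every failure together.

```python
import io
import wave
from typing import BinaryIO, Callable, List, Tuple

# Hypothetical backend signature: takes a binary file-like object and
# returns (raw PCM bytes, sampling rate).
Backend = Callable[[BinaryIO], Tuple[bytes, int]]


class AudioLoadingError(Exception):
    """Raised when every backend failed; the message lists each failure."""


def wave_backend(buf: BinaryIO) -> Tuple[bytes, int]:
    # Standard-library WAV reader, used here as a stand-in for a real decoder.
    with wave.open(buf, "rb") as f:
        return f.readframes(f.getnframes()), f.getframerate()


def broken_backend(buf: BinaryIO) -> Tuple[bytes, int]:
    # Simulates a backend that cannot handle in-memory buffers.
    raise RuntimeError("in-memory buffers are not supported")


def read_audio(buf: BinaryIO, backends: List[Backend]) -> Tuple[bytes, int]:
    errors = []
    for backend in backends:
        try:
            buf.seek(0)  # every backend starts from the beginning of the buffer
            return backend(buf)
        except Exception as e:
            # Remember which backend failed and why, then try the next one.
            errors.append(f"{backend.__name__}: {type(e).__name__}: {e}")
    raise AudioLoadingError("All backends failed:\n" + "\n".join(errors))


def make_wav_bytes(num_frames: int = 160, sampling_rate: int = 16000) -> bytes:
    # Writes a short, silent, mono, 16-bit WAV file into an in-memory buffer.
    out = io.BytesIO()
    with wave.open(out, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(sampling_rate)
        f.writeframes(b"\x00\x00" * num_frames)
    return out.getvalue()


if __name__ == "__main__":
    pcm, sr = read_audio(io.BytesIO(make_wav_bytes()), [broken_backend, wave_backend])
    print(len(pcm), sr)
```

Adding a new backend in this sketch means appending one function to the list, and a total failure produces a single message naming each backend and its error rather than only the last exception.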