Example app VAD default + memory reduction #217
Merged
Co-authored-by: keleftheriou <[email protected]>
- Reduces peak memory by doing the array conversion while loading in chunks, so each array copy is smaller
- Previously copied the entire buffer, which spiked memory to 2x
- Optional CLI commands are deprecated
- @_disfavoredOverload required @available to prevent an infinite loop
a2they approved these changes on Oct 8, 2024
Co-authored-by: Andrey Leonov <[email protected]>
Based on this, would it be better not to use VAD if we want quality? Since the model handles speech start/end, isn't the only benefit of VAD essentially that it avoids processing silent audio and avoids splitting the text arbitrarily at periodic intervals?
This PR sets the example app and CLI to use VAD as the default setting. VAD uses a lot of memory for async predictions, so this also includes some improvements to memory and thread handling in general. There is future work to improve this (see #209), but I'm including one of @keleftheriou's fixes here in 2770d84.
Memory issue detail
For very large files, there was a large memory spike when the audio was copied into a float array for the model to consume (peaks at 2 GB).
Now it converts the audio directly into a float array in chunks to mitigate this (never goes above 1 GB).
This makes processing the full file about 20% slower, which can surely be improved.
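To illustrate the chunked conversion described above, here is a minimal sketch. It is not the code from this PR: `ChunkedSampleSource` and `loadFloatSamples` are hypothetical names, and the source is an in-memory stand-in for an audio file reader that would normally hand out one chunk at a time. Only the current chunk is converted per iteration, so the temporary copy stays at chunk size rather than the size of the whole buffer.

```swift
/// Hypothetical stand-in for an audio file reader that yields Int16 PCM
/// samples in chunks. A real reader would pull each chunk from disk.
struct ChunkedSampleSource {
    let samples: [Int16]
    private(set) var position = 0

    init(samples: [Int16]) { self.samples = samples }

    var count: Int { samples.count }

    /// Returns up to `maxCount` samples, or nil once the input is exhausted.
    mutating func readChunk(maxCount: Int) -> ArraySlice<Int16>? {
        guard position < samples.count else { return nil }
        let end = min(position + maxCount, samples.count)
        let chunk = samples[position..<end]
        position = end
        return chunk
    }
}

/// Converts the source to normalized Float samples one chunk at a time.
func loadFloatSamples(from source: inout ChunkedSampleSource, chunkSize: Int) -> [Float] {
    precondition(chunkSize > 0, "chunkSize must be positive")
    var output = [Float]()
    // Allocate the destination once so appends do not trigger regrowth copies.
    output.reserveCapacity(source.count)
    while let chunk = source.readChunk(maxCount: chunkSize) {
        // Convert lazily while appending; no intermediate full-size array is built.
        output.append(contentsOf: chunk.lazy.map { Float($0) / 32768.0 })
    }
    return output
}

var source = ChunkedSampleSource(samples: [0, 16384, -32768, 32767, -16384])
let floats = loadFloatSamples(from: &source, chunkSize: 2)
print(floats.count)
```

The chunk size is the tuning knob here: smaller chunks lower the peak further but add per-chunk overhead, which is one plausible source of the slowdown mentioned above.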