Quality Benchmarks Between audiotok / webrtcvad / silero-vad #32
Instruments
We have compared 3 easy-to-use off-the-shelf instruments for voice activity / audio activity detection: audiotok, webrtcvad, and silero-vad.
Caveats
Methodology
Please refer here - https://github.com/snakers4/silero-vad#vad-quality-metrics-methodology
Quality Benchmarks
Finished tests:
Portability and Speed
This is by no means extensive or complete research on the topic; please point out if anything is lacking.
Nice, thanks for sharing! Its main strengths are a flexible and intuitive API for working with time (duration of speech and silence) and the ability to run online. The default detection algorithm can easily be replaced by a user-provided algorithm (see the
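To make the idea in the comment above concrete, here is a minimal sketch of duration-based segmentation over a stream of frames with a pluggable detection function. It is not the library's actual API: the names `segment`, `is_active`, `frame_dur`, `min_dur` and `max_silence` are hypothetical, and the per-frame energies are made up for illustration.

```python
from typing import Callable, Iterable, List, Tuple


def segment(
    frames: Iterable[float],
    is_active: Callable[[float], bool],
    frame_dur: float = 0.01,
    min_dur: float = 0.2,
    max_silence: float = 0.3,
) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds of detected events.

    frames      -- one value per frame (here: a per-frame energy), consumed lazily
    is_active   -- user-provided detector deciding whether a frame is active
    min_dur     -- events shorter than this are dropped
    max_silence -- longest silent gap tolerated inside one event
    """
    min_frames = round(min_dur / frame_dur)
    max_gap = round(max_silence / frame_dur)
    events: List[Tuple[float, float]] = []
    start = None      # index of the first frame of the current event
    last_active = -1  # index of the most recent active frame

    def close() -> None:
        # Keep the event only if it is long enough; trailing silence is trimmed.
        if last_active - start + 1 >= min_frames:
            events.append((start * frame_dur, (last_active + 1) * frame_dur))

    for i, frame in enumerate(frames):
        if is_active(frame):
            if start is None:
                start = i
            last_active = i
        elif start is not None and i - last_active > max_gap:
            close()
            start = None
    if start is not None:
        close()  # flush an event still open at the end of the stream
    return events


# Illustrative input: 10 ms frames with made-up energies.
energies = [0.0] * 10 + [1.0] * 30 + [0.0] * 5 + [1.0] * 30 + [0.0] * 50 + [1.0] * 5 + [0.0] * 50
events = segment(energies, is_active=lambda e: e > 0.5)
print(events)
```

Because the loop consumes frames one at a time, the same function works on a generator fed from a live source, and swapping the detector only means passing a different `is_active` callable.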
Maybe it is just non-optimal default params, or maybe it is our validation set, which is just calls annotated by STT and then hand-checked. The only real way to find out is to share the results and see how other people measure their VADs.
As for using silero-vad as an engine: we deliberately kept it simple and even omitted module packaging because, if you look past the data-loading bits, it is literally loaded with 1 command. I am not sure yet how to package it better.
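On the question of how people measure their VADs, one common and simple option is to compare per-frame speech/non-speech labels against a hand-checked reference. The sketch below assumes both label sequences are already aligned to the same frame grid. It is not the methodology linked in the issue, and the function name `frame_metrics` and the sample labels are hypothetical.

```python
from typing import Dict, Sequence


def frame_metrics(reference: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Frame-level precision, recall and F1 for binary speech labels (1 = speech)."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same number of frames")

    tp = sum(1 for r, p in zip(reference, predicted) if r and p)      # speech found
    fp = sum(1 for r, p in zip(reference, predicted) if not r and p)  # false alarm
    fn = sum(1 for r, p in zip(reference, predicted) if r and not p)  # missed speech

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up labels for eight frames, for illustration only.
reference = [0, 1, 1, 1, 0, 0, 1, 1]
predicted = [0, 1, 1, 0, 0, 1, 1, 1]
print(frame_metrics(reference, predicted))
```

Running every detector through the same function on the same reference labels keeps the comparison fair, although the result still depends on each detector's threshold and frame size.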
Here I will post our benchmarks comparing these three instruments