fix: pin silero-vad version to v4 #115

Merged: 1 commit, Sep 1, 2024

Conversation

slaesh
Contributor

@slaesh slaesh commented Sep 1, 2024

It seems there is a new version (v5) which can't handle sample window sizes other than 512 at 16 kHz.
Let's fix the version for now so as not to confuse others who are trying this repo :)

@Gldkslfmsd
Collaborator

OK, thanks. Citation for your claim needed -- I found the reference here: snakers4/silero-vad#2 (comment)

So this is a hotfix, but we should prefer v5 and make it work with the fixed sample window size, right? Do you know how?

@slaesh
Contributor Author

slaesh commented Sep 1, 2024

> VAD now works with 8 kHz and 16 kHz sample rates, only with fixed 256 and 512 sample windows respectively;

no unfortunately not, that's why I came up with this first ;)

but yeah, v5 sounds promising though. might try to chunk the audio and test individual chunks and see if at least one has voice
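
A minimal sketch of that chunking idea, assuming the fixed window sizes quoted above (256 samples at 8 kHz, 512 at 16 kHz). The names `fixed_windows`, `any_window_has_voice` and the `is_speech` callback are hypothetical and not part of this repo or of Silero VAD; `is_speech` stands in for whatever per-window check the v5 model would provide. Zero-padding the last short window is also an assumption, not something the release notes prescribe.

```python
from typing import Callable, Iterator, List, Sequence

# Fixed window sizes per sample rate, as quoted from the v5 release notes.
WINDOW_BY_RATE = {8000: 256, 16000: 512}


def fixed_windows(samples: Sequence[float], window: int) -> Iterator[List[float]]:
    """Yield consecutive windows of exactly `window` samples.

    The final window is zero-padded if the audio does not divide evenly
    (an assumption made for this sketch).
    """
    for start in range(0, len(samples), window):
        chunk = list(samples[start:start + window])
        if len(chunk) < window:
            chunk.extend([0.0] * (window - len(chunk)))
        yield chunk


def any_window_has_voice(
    samples: Sequence[float],
    sample_rate: int,
    is_speech: Callable[[List[float]], bool],
) -> bool:
    """Return True if at least one fixed-size window is flagged as speech.

    `is_speech` is a hypothetical per-window detector supplied by the caller.
    """
    window = WINDOW_BY_RATE[sample_rate]  # KeyError for unsupported rates
    return any(is_speech(chunk) for chunk in fixed_windows(samples, window))
```

For a quick check without a model, a trivial amplitude threshold can stand in for the detector, e.g. `any_window_has_voice(audio, 16000, lambda w: max(abs(x) for x in w) > 0.1)`. A real detector may keep state between windows, which this sketch does not model.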

@Gldkslfmsd
Collaborator

> no unfortunately not, that's why I came up with this first ;)

Then please report it in the Silero repo. Thanks!

> but yeah, v5 sounds promising though. might try to chunk the audio and test individual chunks and see if at least one has voice

OK, thanks. Please create an issue or a new PR for that.

@slaesh
Contributor Author

slaesh commented Sep 1, 2024

> > no unfortunately not, that's why I came up with this first ;)
>
> Then please report it in the Silero repo. Thanks!

what exactly? the issue lies here. the code works ONLY with v4, but we do not specify the version right now. so we need this hotfix until we are ready for v5. :)
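
For illustration, a hedged sketch of what pinning could look like when the model is loaded through `torch.hub`, which accepts a `repo_owner/repo_name:ref` string. The tag name `v4.0` and the helper names `hub_spec` and `load_pinned_vad` are assumptions for this example and are not taken from the merged commit.

```python
SILERO_REPO = "snakers4/silero-vad"


def hub_spec(ref: str = "v4.0") -> str:
    """Build a torch.hub repo string pinned to a git ref (tag or branch).

    The default tag "v4.0" is an assumption for this sketch.
    """
    return f"{SILERO_REPO}:{ref}"


def load_pinned_vad(ref: str = "v4.0"):
    """Load Silero VAD from a pinned ref instead of the repo's default branch.

    torch is imported lazily so that this module can be imported without it.
    Calling this function needs torch installed and network access.
    """
    import torch

    return torch.hub.load(repo_or_dir=hub_spec(ref), model="silero_vad")
```

Without the `:ref` suffix, `torch.hub.load` resolves the repository's default branch, which is how an unpinned setup can silently pick up a newer release.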

@Gldkslfmsd
Collaborator

> VAD now works with 8 kHz and 16 kHz sample rates, only with fixed 256 and 512 sample windows respectively;

> no unfortunately not, that's why I came up with this first ;)

OK, I thought Silero VAD doesn't work.

Gldkslfmsd merged commit 225f038 into ufal:main on Sep 1, 2024