ONNX model detects soft hum as speech #164

Answered by snakers4
wciurzynski asked this question in Q&A
If you renormalize this audio (the model does this internally), you get this:

I can hear some microphone / wind (?) artefacts, which is probably why the network gets triggered. During the white noise, however, it gets un-triggered.

The probability chart looks like this:

wav = read_audio('possible_speech_but_noise.wav', sampling_rate=SAMPLING_RATE)
wav *= 1 / wav.max()
# get speech timestamps from full audio file
speech_timestamps = get_speech_timestamps(wav, model,
                                          sampling_rate=SAMPLING_RATE,
                                          visualize_probs=True,
                                          window_size_samples=1024,
                   …
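
The `wav *= 1 / wav.max()` line in the snippet is a peak normalization. As a rough illustration of why a soft hum can end up loud enough to trigger the network, here is a minimal pure-Python sketch of the same idea. The names `peak_normalize` and `eps` are hypothetical and not part of silero-vad, and the sketch divides by the largest absolute value rather than the largest positive one, which is an assumption made so that a signal whose strongest peak is negative is also handled.

```python
def peak_normalize(samples, eps=1e-12):
    """Scale samples so that the largest absolute value becomes 1.0.

    Hypothetical helper for illustration only; not a silero-vad function.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak < eps:
        # Silent (or empty) input: nothing to scale, return an unchanged copy.
        return list(samples)
    return [s / peak for s in samples]


# A very quiet signal (made-up values) is stretched to full scale,
# so low-level artefacts become as prominent as normal speech would be.
quiet_hum = [0.01, -0.02, 0.015, -0.005]
normalized = peak_normalize(quiet_hum)
print(normalized)
```

With a tensor input, the equivalent would presumably be dividing by `wav.abs().max()`, again as an assumption rather than what the library does internally.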

Answer selected by snakers4
This discussion was converted from issue #163 on January 26, 2022 14:45.