Neural Network VAD

This package has only been tested with, and is only expected to work on, 64-bit Python 3.6 because of its TensorFlow requirement.

A package to perform Speech/Non-speech Identification (SNI) using Neural Networks on .wav files.

The function that performs SNI is Neural_Network_VAD, defined in VAD.py. Given the speech signal (as a 1-D array) and its sampling frequency, it returns a tuple of SNI predictions: index 0 holds the result from a Convolution-LSTM-Dense neural network, and index 1 holds the result from an LSTM-Dense neural network.
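As a rough illustration of the call described above, the sketch below reads a 16-bit mono .wav file with the standard library and passes the samples and sampling frequency to a VAD function. The helper names read_mono_wav and run_vad are hypothetical, and the argument order (signal first, then sampling frequency) is an assumption based on the description; check VAD.py for the actual signature. The VAD function is passed in as a parameter so the sketch does not depend on the package being importable.

```python
import sys
import wave
from array import array


def read_mono_wav(path):
    """Read a 16-bit PCM mono .wav file and return (samples, sampling_frequency)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit mono PCM files")
        fs = wav.getframerate()
        samples = array("h")  # signed 16-bit integers
        samples.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # .wav data is little-endian
    return samples, fs


def run_vad(path, vad_fn):
    """Call vad_fn(signal, fs) and unpack the two results it returns.

    ASSUMPTION: vad_fn takes the signal first and the sampling frequency
    second, and returns (conv_lstm_result, lstm_result).
    """
    samples, fs = read_mono_wav(path)
    conv_lstm_result, lstm_result = vad_fn(samples, fs)
    return conv_lstm_result, lstm_result


# Hypothetical usage with the package installed:
#   from VAD import Neural_Network_VAD
#   conv_lstm_result, lstm_result = run_vad("sample.wav", Neural_Network_VAD)
```

If Neural_Network_VAD expects a NumPy array rather than a plain sequence, the samples would need converting (for example with numpy.asarray) before the call.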

VAD_script.py is a wrapper that saves the SNI results as a plot in a .png file and as a CSV file.

use: python VAD_script.py sample.wav

for help: python VAD_script.py -h
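The CSV layout that VAD_script.py writes is not documented here. Purely as an illustration of saving SNI results to CSV, the sketch below assumes a hypothetical layout: one row per frame, holding the frame index and a 0/1 label from each of the two networks. The function name save_sni_csv and the column names are assumptions, not the script's actual output format.

```python
import csv


def save_sni_csv(path, conv_lstm_labels, lstm_labels):
    """Write one row per frame: frame index plus each model's 0/1 label.

    ASSUMPTION: both models yield one label per frame and the two
    sequences have the same length.
    """
    if len(conv_lstm_labels) != len(lstm_labels):
        raise ValueError("both label sequences must have the same length")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["frame", "conv_lstm", "lstm"])  # hypothetical header
        for i, (a, b) in enumerate(zip(conv_lstm_labels, lstm_labels)):
            writer.writerow([i, int(a), int(b)])
```

Keeping the two models' labels in adjacent columns makes it easy to compare where the networks disagree.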

A sample .wav file is included; its background noise was obtained from [1] and its speech from [2].

References

[1] Koenig, M. (2018). Street Sounds | Effects | Sound Bites | Sound Clips from SoundBible.com. [online] Soundbible.com. Available at: http://soundbible.com/2175-Street.html [Accessed 14 Jun. 2018].

[2] Fromtexttospeech.com. (2018). From Text To Speech - Free online TTS service. [online] Available at: http://www.fromtexttospeech.com/ [Accessed 14 Jun. 2018].

