forked from DTaoo/VGGish

An implementation of VGGish in Keras with the TensorFlow backend

rola93/VGGish

VGGish: A VGG-like audio classification model

This repository provides a VGGish model implemented in Keras with the TensorFlow backend. It is based on the model developed for AudioSet. For more details, see the slim version.

Pretrained weights (Keras HDF5 format):

Download them into the weights directory:

  • Model with the top fully connected layers

  • Model without the top fully connected layers

Test it!

You can try the model on a subset of the Speech Commands dataset.

Download the dataset and move a subset of it (i.e. some of its class folders) into this project.

The directory should look like this:

VGGish/dataset/data/class_01/*.wav
VGGish/dataset/data/class_02/*.wav
VGGish/dataset/data/class_03/*.wav

where class_01, class_02, and class_03 are classes from the Speech Commands dataset, and VGGish is the root of this project. To check that the dataset is correctly placed and to see a summary, run python dataset/dataset_utils.py. For the yes/no categories, the output should look something like this:

        Dataset summary:

        training   ->  6358 examples
                   no ->  3130 examples
                  yes ->  3228 examples

        testing    ->   824 examples
                   no ->   405 examples
                  yes ->   419 examples

        validation ->   803 examples
                   no ->   406 examples
                  yes ->   397 examples
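As a rough cross-check of the layout, the sketch below counts the .wav files in each class folder. It is not the code in dataset/dataset_utils.py: the function name count_wavs_per_class is hypothetical, and it only reports per-class totals without reproducing the training/testing/validation split shown above.

```python
from pathlib import Path


def count_wavs_per_class(data_dir):
    """Return {class_name: number of .wav files} for each sub-directory of data_dir.

    Hypothetical helper for illustration; it does not split the data
    into training, testing, and validation sets.
    """
    counts = {}
    for class_dir in sorted(Path(data_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(1 for _ in class_dir.glob("*.wav"))
    return counts


if __name__ == "__main__":
    # Assumes the script is run from the VGGish project root.
    data_dir = Path("dataset/data")
    if data_dir.is_dir():
        for name, n in count_wavs_per_class(data_dir).items():
            print(f"{name:>10} -> {n:5d} examples")
    else:
        print(f"{data_dir} not found; place the class folders there first")
```

A class folder that reports zero examples usually means the folder was copied to the wrong level of the directory tree.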

Then run the evaluation with python evaluation.py.

Here is an example of the output for the yes/no categories:

loading training data...
loading test data...
loading validation data...
Training...
Report for training
              precision    recall  f1-score   support

          no       0.94      0.99      0.96      2850
         yes       0.99      0.94      0.96      2990

    accuracy                           0.96      5840
   macro avg       0.96      0.96      0.96      5840
weighted avg       0.96      0.96      0.96      5840

Report for validation
              precision    recall  f1-score   support

          no       0.92      0.98      0.95       363
         yes       0.98      0.92      0.95       356

    accuracy                           0.95       719
   macro avg       0.95      0.95      0.95       719
weighted avg       0.95      0.95      0.95       719

Report for testing
              precision    recall  f1-score   support

          no       0.92      0.98      0.95       386
         yes       0.98      0.92      0.95       397

    accuracy                           0.95       783
   macro avg       0.95      0.95      0.95       783
weighted avg       0.95      0.95      0.95       783
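To make the columns of these reports concrete, here is a minimal pure-Python sketch of how per-class precision, recall, F1, and support are computed from true and predicted labels. The layout above resembles the output of scikit-learn's classification_report, but the source does not say how evaluation.py produces it, so the function per_class_metrics below is a hypothetical illustration rather than the project's code.

```python
def per_class_metrics(y_true, y_pred):
    """Compute precision, recall, F1, and support for every label.

    Hypothetical helper for illustration only.
    """
    pairs = list(zip(y_true, y_pred))
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == label and p == label)  # true positives
        fp = sum(1 for t, p in pairs if t != label and p == label)  # false positives
        fn = sum(1 for t, p in pairs if t == label and p != label)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": tp + fn,  # number of true examples of this label
        }
    return report


if __name__ == "__main__":
    # Toy labels, not taken from the runs reported above.
    y_true = ["no", "no", "yes", "yes"]
    y_pred = ["no", "yes", "yes", "yes"]
    for label, m in per_class_metrics(y_true, y_pred).items():
        print(f"{label:>5}  {m['precision']:.2f}  {m['recall']:.2f}  "
              f"{m['f1']:.2f}  {m['support']}")
```

Read this way, the high recall and lower precision for "no" in the reports mean that nearly all true "no" clips are found, while some "yes" clips are mislabelled as "no".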

Note that some TensorFlow logging output, which is not relevant here, has been removed from the example above.
