Multiclass Speech Classification

This repository contains code and associated files for classifying speech recordings.

Overview

The task is to train a classifier to recognize which of several English words is pronounced in an audio recording.

Data

The dataset used for this task can be found at https://surfdrive.surf.nl/files/index.php/s/A91xgk7B5kXNvfJ which contains the following files:

wav.tgz: a compressed directory with all the recordings (training and test data) in the form of wav files.
feat.npy: an array with Mel-frequency Cepstral Coefficients (MFCC) extracted from each wav file. The features at index i in this array were extracted from the wav file at index i of the array in the file path.npy.
path.npy: an array with the order of wav files in the feat.npy array.
train.csv: this file contains two columns: path with the filename of the recording and word with word which was pronounced in the recording. This is the training portion of the data.
test.csv: This is the testing portion of the data, and it has the same format as the file train.csv except that the column word is absent.

In this task, the classifier examines the MFCC features extracted and performs a multiclass classification; labeling each feature as one of 35 English words

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
Code.ipynb		Code.ipynb
Preprocessing.py		Preprocessing.py
README.md		README.md
featureEngineering.py		featureEngineering.py
modelAsset.py		modelAsset.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Multiclass Speech Classification

Overview

Data

About

Releases

Packages

Languages

unangity/multiclass-speech-classification

Folders and files

Latest commit

History

Repository files navigation

Multiclass Speech Classification

Overview

Data

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages