Code for the paper Language Identification Using Deep Convolutional Recurrent Neural Networks

MathurUtkarsh/Language-Identification-in-Audio-Using-Deep-CRNN

Language Identification Using Deep Convolutional Recurrent Neural Networks

Introduction

This project contains the code for the paper "Language Identification Using Deep Convolutional Recurrent Neural Networks", which will be presented at the 24th International Conference on Neural Information Processing (ICONIP 2017).

Problem Statement

The goal is to develop a language identification system that accurately predicts the language spoken in an audio recording, choosing among six international languages: English, German, French, Spanish, Chinese, and Russian. The system takes an audio recording from the user as input and outputs the predicted language. The challenge lies in designing a robust and efficient system that identifies the language correctly despite variations in dialect, accent, and background noise.

Description Overview

The aim of this project is to create an accurate language identification system for six international languages using deep learning techniques. The system analyzes speech recordings and identifies the spoken language based on its phonetic and acoustic properties. The proposed solution uses a deep convolutional recurrent neural network (CRNN) architecture tailored to speech so that it performs robustly under varying acoustic conditions. The project aims to provide a practical solution for automated language identification, with potential applications in areas such as speech recognition, language translation, and language learning.

Technology Use

This project uses Anaconda Python 3.6 and Keras 2.2.4 with the TensorFlow GPU 1.14.0 backend, running on CUDA 10 with a cuDNN build that matches CUDA 10.

Installation

Installation is straightforward. Follow the steps below to create a virtual environment and install the necessary packages into it.

In PyCharm:

  1. Create a new project.
  2. Navigate to the directory of the project.
  3. Select the option to create a new virtual environment using conda with Python 3.6.
  4. Create the project with these settings.
  5. After the project has been created, install the necessary packages from the requirements.txt file using the command pip install -r requirements.txt

In Conda:

  1. Create a new virtual environment using the command conda create -n your_env_name python=3.6 and activate it with conda activate your_env_name
  2. Navigate to the project directory.
  3. Install the necessary packages from the requirements.txt file using the command
    pip install -r requirements.txt
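
After installation, it can help to confirm that the environment matches the versions listed earlier in this README. The following is a minimal sketch, not part of the repository: the helper name `find_mismatches` and the `EXPECTED` table are hypothetical, and the example input is made up for illustration.

```python
# Hypothetical helper: compares installed versions against the pins
# named in this README (Python 3.6, Keras 2.2.4, TensorFlow 1.14.0).
EXPECTED = {"python": "3.6", "keras": "2.2.4", "tensorflow": "1.14.0"}


def find_mismatches(installed, expected=EXPECTED):
    """Return {name: (found, wanted)} for every version that does not match its pin."""
    mismatches = {}
    for name, wanted in expected.items():
        found = installed.get(name)
        # Accept an exact match or a more specific version such as "3.6.13" for "3.6".
        matches = found is not None and (found == wanted or found.startswith(wanted + "."))
        if not matches:
            mismatches[name] = (found, wanted)
    return mismatches


# Made-up example input: TensorFlow is deliberately one minor version off.
example = {"python": "3.6.13", "keras": "2.2.4", "tensorflow": "1.15.0"}
problems = find_mismatches(example)
print(problems)
```

In a real environment, the `installed` dictionary could be filled from `sys.version_info`, `keras.__version__`, and `tensorflow.__version__`; an empty result means every pin matched.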

Workflow Diagram

Picture1

The picture above shows the project directory as it appears when the project folder is opened in PyCharm.

2. SpectrogramGenerator.py

Picture3

This file is located in the dataloaders folder. SpectrogramGenerator.py converts the recorded .wav speech files into spectrogram images, which are used for training.
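
To illustrate what a spectrogram is, the sketch below computes one from raw audio samples using only the Python standard library. It is not the code in SpectrogramGenerator.py, which may use different tools and parameters: the function names, the frame size of 64, the hop size of 32, and the Hann window are all assumptions chosen for readability, and the naive DFT is far slower than an FFT-based library. The samples are assumed to be already decoded from the .wav file into a list of floats.

```python
import cmath
import math


def hann_window(size):
    """Symmetric Hann window of the given length."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (size - 1)) for n in range(size)]


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for frequency bins 0..N/2 of one frame."""
    n = len(frame)
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(frame):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        magnitudes.append(abs(acc))
    return magnitudes


def spectrogram(samples, frame_size=64, hop_size=32):
    """Return a list of rows (one per time frame) of log-magnitudes in dB."""
    window = hann_window(frame_size)
    rows = []
    # Slide an overlapping window across the signal.
    for start in range(0, len(samples) - frame_size + 1, hop_size):
        frame = samples[start:start + frame_size]
        windowed = [x * w for x, w in zip(frame, window)]
        # The small constant avoids log10(0) on silent bins.
        rows.append([20 * math.log10(m + 1e-10) for m in magnitude_spectrum(windowed)])
    return rows


# Demo input: a synthetic 1 kHz tone sampled at 8 kHz (not real speech).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin)
```

Each row of `spec` is one time slice and each column one frequency bin, so rendering the rows side by side as pixel intensities gives the kind of image used for training. With a frame size of 64 at 8 kHz, each bin spans 125 Hz, so the 1 kHz tone peaks in bin 8.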

3. train.py

Picture4

This file trains the network on the dataset and produces the trained model that is used for prediction.

4. predict.py

Picture5

This file predicts the language of the given user input; it takes two additional arguments.
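
The last step of prediction, turning the model's six output scores into a language name, can be sketched as follows. This is an illustration rather than the code in predict.py: the label order in `LANGUAGES`, the helper `decode_prediction`, and the assumption that the model returns raw scores rather than probabilities are all hypothetical, and the example scores are made up.

```python
import math

# Assumed label order; the trained model's actual class order may differ.
LANGUAGES = ["English", "German", "French", "Spanish", "Chinese", "Russian"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_prediction(scores, labels=LANGUAGES):
    """Return (label, probability) for the highest-scoring class."""
    if len(scores) != len(labels):
        raise ValueError("expected one score per language")
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Made-up scores for illustration only.
language, confidence = decode_prediction([0.1, 2.3, 0.4, -1.0, 0.0, 0.7])
print(language, round(confidence, 3))
```

If the model already ends in a softmax layer, the `softmax` call can be dropped and the index of the largest probability used directly.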

5. clientApp.py

Picture6

This file is the Flask server and the entry point of the application.

6. Generated spectrogram

specto

Conclusion

In this project we have successfully built a language identification system that can classify six international languages.

Comparison

There is considerable room for improvement. Pretrained models such as BERT or GPT-2 could be used to increase accuracy, and the size of the training data could be increased.
