
Targeted Voice Separation

Team members:

  1. Aakanksha Desai
  2. Varsha Kini
  3. Vrunda Mange

Abstract:

The research focuses on "Targeted Voice Separation": isolating a specific speaker's voice from mixed audio recordings, a form of the Cocktail Party Problem, using deep neural networks. Using the LibriSpeech dataset, the study trains a U-Net architecture for speaker separation and achieves a Signal-to-Distortion Ratio (SDR) of 7.09 dB. The system identifies the target speaker by comparing voice embeddings with the Resemblyzer library. The approach separates mixed audio sources with minimal distortion, and further improvements are expected from expanding the dataset and exploring how training-data size affects output audio quality.
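
The speaker-identification step described above can be sketched as follows. This is a minimal illustration rather than the repository's exact code: the file names are placeholders, and it simply uses Resemblyzer's VoiceEncoder to pick the separated track whose embedding is most similar to a reference recording of the target speaker.

```python
# Minimal sketch of target-speaker identification with Resemblyzer.
# File names below are placeholders, not files from this repository.
import numpy as np
from resemblyzer import VoiceEncoder, preprocess_wav

encoder = VoiceEncoder()

# Reference utterance of the target speaker (placeholder path)
reference_wav = preprocess_wav("target_speaker_reference.wav")
reference_embed = encoder.embed_utterance(reference_wav)

# Candidate tracks produced by the U-Net separator (placeholder paths)
candidates = ["separated_source_1.wav", "separated_source_2.wav"]
similarities = []
for path in candidates:
    wav = preprocess_wav(path)
    embed = encoder.embed_utterance(wav)
    # Resemblyzer embeddings are L2-normalised, so the dot product
    # is the cosine similarity between the two voices
    similarities.append(float(np.dot(reference_embed, embed)))

best = candidates[int(np.argmax(similarities))]
print(f"Track matching the target speaker: {best}")
```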

Publication Link: https://ijisrt.com/targeted-voice-separation

Paper Link: https://github.com/Varsha-Kini/BE_Project/blob/094af412e021358f63664018a05daf92d4f6668a/Targeted%20Voice%20Separation%20Published%20Paper.pdf