Audio Dataset Labelling Automation

This project aims to streamline dataset creation and labelling for Automatic Speech Recognition (ASR) systems. The project consists of three parts:

  • Audio File Ingestion
  • Automatic Labelling with ASR (Whisper Model)
  • Manual Labelling for improved dataset quality
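
The three parts above can be pictured as one small pipeline. The following is only a minimal sketch, not this project's actual code: the `Clip` record, the function names, and the `.wav` default are all hypothetical, and the ASR step is passed in as a plain callable so the sketch runs without any model installed.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Iterable, List, Optional


@dataclass
class Clip:
    """One audio file and its labels (hypothetical record layout)."""
    path: Path
    auto_label: Optional[str] = None    # filled in by the ASR model
    manual_label: Optional[str] = None  # filled in by a human reviewer

    @property
    def label(self) -> Optional[str]:
        # A manual correction, when present, overrides the automatic label.
        if self.manual_label is not None:
            return self.manual_label
        return self.auto_label


def ingest(folder: Path, suffixes: Iterable[str] = (".wav",)) -> List[Clip]:
    """Part 1: collect audio files from a folder, sorted by name."""
    wanted = {s.lower() for s in suffixes}
    return [
        Clip(p)
        for p in sorted(folder.iterdir())
        if p.is_file() and p.suffix.lower() in wanted
    ]


def auto_label(clips: Iterable[Clip], transcribe: Callable[[Path], str]) -> None:
    """Part 2: label every clip with whatever ASR function is supplied."""
    for clip in clips:
        clip.auto_label = transcribe(clip.path).strip()


def review(clip: Clip, corrected: str) -> None:
    """Part 3: record a human correction for one clip."""
    clip.manual_label = corrected
```

With the `openai-whisper` package installed, one possible `transcribe` callable would be `lambda p: model.transcribe(str(p))["text"]`, where `model = whisper.load_model("base")`; the model size here is an arbitrary choice for illustration.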

The WebRTC-VAD implementation is taken from the py-webrtcvad example script: https://github.com/wiseman/py-webrtcvad/blob/master/example.py
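
For readers unfamiliar with how WebRTC VAD is typically driven, here is a rough sketch of the idea. It is not the code from the linked example (which smooths decisions with a padded sliding window); this simplified version just groups consecutive speech frames. WebRTC VAD expects 16-bit mono PCM split into 10, 20, or 30 ms frames. The function names are hypothetical, and the speech detector is passed in as a callable so the sketch runs without `webrtcvad` installed.

```python
from typing import Callable, Iterator, List, Tuple

BYTES_PER_SAMPLE = 2  # 16-bit mono PCM


def frames(pcm: bytes, sample_rate: int, frame_ms: int = 30) -> Iterator[bytes]:
    """Yield fixed-size frames; a trailing partial frame is dropped."""
    if frame_ms not in (10, 20, 30):
        raise ValueError("frame_ms must be 10, 20, or 30")
    size = sample_rate * frame_ms // 1000 * BYTES_PER_SAMPLE
    for start in range(0, len(pcm) - size + 1, size):
        yield pcm[start:start + size]


def speech_segments(
    pcm: bytes,
    sample_rate: int,
    is_speech: Callable[[bytes, int], bool],
    frame_ms: int = 30,
) -> List[Tuple[float, float]]:
    """Return (start_s, end_s) spans of consecutive frames flagged as speech."""
    step = frame_ms / 1000
    segments: List[Tuple[float, float]] = []
    start = None  # index of the first frame of the current speech run
    count = 0     # number of frames seen so far
    for i, frame in enumerate(frames(pcm, sample_rate, frame_ms)):
        count = i + 1
        if is_speech(frame, sample_rate):
            if start is None:
                start = i
        elif start is not None:
            segments.append((start * step, i * step))
            start = None
    if start is not None:
        # Audio ended while still inside a speech run.
        segments.append((start * step, count * step))
    return segments
```

With `webrtcvad` installed, the detector could be supplied as `webrtcvad.Vad(2).is_speech`, where the aggressiveness mode `2` is an arbitrary choice for illustration.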