Multilingual Video Transcription using Whisper

A Python command-line tool for downloading and transcribing videos with OpenAI's open-source Whisper model. It accepts a single YouTube video URL, a YouTube playlist, a JSON file of YouTube URLs, or videos that are already downloaded locally. Videos can be downloaded at different resolutions, and the resulting transcriptions are saved to a JSON file.

Setup

Clone the repository:

git clone https://github.com/abmami/Multilingual-Video-Transcription-using-Whisper.git
cd Multilingual-Video-Transcription-using-Whisper

Create and activate a virtual environment:

python3 -m venv venv
source venv/bin/activate  # On Linux
venv\Scripts\activate.bat  # On Windows

Install the required Python packages:

pip install -r requirements.txt

Install FFmpeg:
- On Ubuntu:
```
sudo apt-get install ffmpeg
```
- On Windows:
  - Download the latest static build of FFmpeg from the official website: https://ffmpeg.org/download.html#build-windows
  - Extract the downloaded ZIP file to a folder on your system.
  - Add the path to the bin folder of the extracted FFmpeg build to your system's PATH environment variable.
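
To confirm that FFmpeg is reachable after installation, a small check like the one below can help. It is only a sketch and not part of the repository: the function name `ffmpeg_available` is made up for illustration, and it assumes the executable is named `ffmpeg` on your PATH.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable can be found on PATH."""
    # shutil.which searches the directories listed in PATH,
    # the same lookup the shell performs when running `ffmpeg`.
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found at:", shutil.which("ffmpeg"))
    else:
        print("ffmpeg not found on PATH; install it before transcribing.")
```

If the check fails on Windows right after editing PATH, opening a new terminal usually picks up the change.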

Usage

Transcribe the videos listed in the URLs JSON file in the data folder (data/urls.json by default) using the following command:

python transcribe.py

Transcribe videos that are already stored locally in the data/videos folder using the following command:

python transcribe.py --locally

Transcribe a YouTube playlist using the following command:

python transcribe.py --playlist YT_PLAYLIST_URL

Transcribe a single YouTube video using the following command:

python transcribe.py --url YT_VIDEO_URL

Additional Options

--res: The resolution of the video(s) to download (default: 360).
--no-save: Delete the downloaded video(s) after transcription instead of keeping them.
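
For readers who want to see how these flags fit together, here is a minimal argparse sketch. It is not the actual code in transcribe.py. The flag names and the default resolution come from this README; treating the three sources as mutually exclusive and parsing `--res` as an integer are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags documented in this README."""
    parser = argparse.ArgumentParser(
        prog="transcribe.py",
        description="Download and transcribe videos with Whisper.",
    )
    # Assumption: only one input source is used per run. With none of
    # these flags, the URLs JSON file is the input.
    source = parser.add_mutually_exclusive_group()
    source.add_argument("--locally", action="store_true",
                        help="transcribe videos already in the videos folder")
    source.add_argument("--playlist", metavar="YT_PLAYLIST_URL",
                        help="transcribe every video in a YouTube playlist")
    source.add_argument("--url", metavar="YT_VIDEO_URL",
                        help="transcribe a single YouTube video")
    # Assumption: the resolution is given as a plain number such as 360.
    parser.add_argument("--res", type=int, default=360,
                        help="resolution of the video(s) to download")
    parser.add_argument("--no-save", action="store_true",
                        help="delete the video(s) after transcription")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

For example, `python transcribe.py --url YT_VIDEO_URL --res 720 --no-save` would, under these assumptions, download one video at 720, transcribe it, and then delete it.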

Configuration

The tool uses the following paths:

input_path: The path to the input file (default: data/urls.json).
videos_path: The path to the folder where the videos are saved (default: data/videos).
output_path: The path to the output file (default: data/output.json).
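
One way to keep these three paths in a single place is sketched below. The `Paths` class and its methods are hypothetical and are not taken from transcribe.py; only the default values come from this README.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Paths:
    """Default locations from the README (the class itself is illustrative)."""
    input_path: Path = Path("data/urls.json")
    videos_path: Path = Path("data/videos")
    output_path: Path = Path("data/output.json")

    def under(self, root: Path) -> "Paths":
        """Return a copy with every path placed below `root`."""
        return Paths(
            input_path=root / self.input_path,
            videos_path=root / self.videos_path,
            output_path=root / self.output_path,
        )

    def ensure_dirs(self) -> None:
        """Create the videos folder and the output file's parent folder."""
        self.videos_path.mkdir(parents=True, exist_ok=True)
        self.output_path.parent.mkdir(parents=True, exist_ok=True)


if __name__ == "__main__":
    paths = Paths()
    print(paths.input_path, paths.videos_path, paths.output_path)
```

Calling `ensure_dirs()` before a run avoids failures caused by a missing data/videos folder.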

By default, the tool uses Whisper's small model, which is a download of about 461 MB. To use the base model or another size, change the model name in the code.

model_name: The name of the Whisper model to use (default: small).
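
The sketch below shows what switching models could look like with the `openai-whisper` Python package. It assumes the tool loads models through `whisper.load_model` and is not copied from transcribe.py. The helper names `check_model_name` and `transcribe_file` and the shape of the returned dictionary are made up for illustration.

```python
# Multilingual model sizes from the open-source Whisper release.
# English-only ".en" variants also exist but are left out of this sketch.
KNOWN_MODELS = ("tiny", "base", "small", "medium", "large")


def check_model_name(model_name: str) -> str:
    """Return `model_name` unchanged, or raise if it is not a known size."""
    if model_name not in KNOWN_MODELS:
        raise ValueError(f"unknown Whisper model: {model_name!r}")
    return model_name


def transcribe_file(media_path: str, model_name: str = "small") -> dict:
    """Transcribe one media file and return a small summary dictionary."""
    # Imported lazily so this module can be loaded without Whisper installed.
    import whisper  # provided by the `openai-whisper` package

    # The first call for a given size downloads the model weights.
    model = whisper.load_model(check_model_name(model_name))
    # Whisper detects the spoken language unless one is passed explicitly.
    result = model.transcribe(media_path)
    return {
        "file": media_path,
        "language": result["language"],
        "text": result["text"].strip(),
    }
```

Smaller models such as base run faster and need less memory, while larger ones are generally more accurate, so the right choice depends on your hardware and the audio quality.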
