A Gradio frontend for generating transcribed or translated subtitles for videos using OpenAI Whisper locally.
This has been tested with Python 3.12. First, create a virtual environment:
python -m venv .venv
Activate the venv on Windows with
.\.venv\Scripts\activate
On Linux and macOS, run
source .venv/bin/activate
Now, install the pip dependencies:
# if this doesn't work, pip install the following manually: openai-whisper ffmpeg torch gradio
pip install -r requirements.txt
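A quick, optional way to confirm the install worked (this snippet is just a sanity check, not part of the project; ffmpeg is a system binary, so it is looked up on the PATH rather than imported):

```python
# Sanity check for the installed dependencies (a sketch, not part of this repo).
import shutil

import gradio
import torch
import whisper  # provided by the openai-whisper package

print("gradio:", gradio.__version__)
print("torch:", torch.__version__)
print("whisper loaded from:", whisper.__file__)
# ffmpeg is a system binary, not a Python module; make sure it is on PATH.
print("ffmpeg found:", shutil.which("ffmpeg") is not None)
```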
Then launch the server with
python server.py
To share the app publicly, add --remote=True
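For reference, here is a minimal sketch of how a --remote flag can be wired to Gradio's share option. It is illustrative only (the stub function and argument parsing are assumptions), not the actual contents of server.py:

```python
# Minimal sketch of mapping a --remote flag onto Gradio's share option.
# Illustrative only; the real server.py in this repo may be wired differently.
import argparse

import gradio as gr

def transcribe_stub(path):
    # Placeholder: the real app would run Whisper on the uploaded file here.
    return f"received: {path}"

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    # Accept --remote=True / --remote=False as strings, matching the usage above.
    parser.add_argument("--remote", default="False")
    args = parser.parse_args()

    demo = gr.Interface(fn=transcribe_stub, inputs=gr.File(), outputs="text")
    # share=True asks Gradio to create a temporary public URL.
    demo.launch(share=args.remote.lower() == "true")
```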
For embedding subtitles into a video, you need to have ffmpeg installed on your system.
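As a rough sketch of what that step can look like (the function name, file names, and exact ffmpeg arguments here are illustrative, not necessarily what this project runs), an .srt can be muxed into an mp4 as a soft subtitle track like this:

```python
# One way to embed an .srt into an mp4 as a soft subtitle track using ffmpeg.
# A sketch of the general technique; the project may invoke ffmpeg differently.
import subprocess

def embed_subtitles(video_path, srt_path, output_path):
    cmd = [
        "ffmpeg",
        "-i", video_path,    # input video
        "-i", srt_path,      # input subtitles
        "-c", "copy",        # copy audio/video streams without re-encoding
        "-c:s", "mov_text",  # encode subtitles in the mp4-compatible mov_text format
        output_path,
    ]
    subprocess.run(cmd, check=True)

# embed_subtitles("input.mp4", "subs.srt", "output.mp4")
```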
Features:
- Input a video or any other media file
- Input a YouTube URL
- Transcribe
- Translate to English
- Select different Whisper model sizes to match your hardware
- CUDA support
- Output an .srt file or a video with embedded subtitles (see the transcription sketch below)
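As a rough idea of what the transcription step looks like with openai-whisper's Python API (the model size, file names, and helper function here are illustrative, not the exact code in server.py):

```python
# Sketch of transcribing (or translating) a file with openai-whisper and
# writing an .srt; model and file names are illustrative.
import whisper

def srt_timestamp(seconds: float) -> str:
    # Convert seconds to the SRT timestamp format HH:MM:SS,mmm.
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

model = whisper.load_model("small")  # pick a model size that fits your hardware
# task="transcribe" keeps the original language; task="translate" outputs English.
result = model.transcribe("input.mp4", task="transcribe")

with open("output.srt", "w", encoding="utf-8") as f:
    for i, seg in enumerate(result["segments"], start=1):
        f.write(f"{i}\n")
        f.write(f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n")
        f.write(f"{seg['text'].strip()}\n\n")
```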
If the output says gpu available: False, you may need to pip install a different build of PyTorch for your specific hardware.
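A quick way to check what PyTorch sees (assumes torch is installed):

```python
# Quick GPU availability check for PyTorch.
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```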
I just wanted a nice frontend where you can drop in a video or a URL and have it spit out subtitles. Whisper is amazing, but I haven't found many implementations, especially ones that can be run locally.