Python tool that uses OpenAI's Whisper to transcribe audio/video files to text with timestamps. Features GPU acceleration, multi-language support, and automatic audio extraction.

CarlosUlisesOchoa/Whisper-AI-write-video-or-audio-transcription-to-txt

Whisper Audio/Video Transcription Tool 🎙️

A Python-based transcription tool that uses OpenAI's Whisper model to transcribe audio and video files, with optional GPU acceleration.

🌟 Features

  • Supports both audio and video file transcription
  • Automatic audio extraction from video files
  • GPU acceleration with CUDA support
  • Timestamp-based transcription output
  • Multi-language support
  • Easy-to-use command line interface
  • Saves transcriptions to text files
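
As a rough illustration of the audio extraction and timestamped output features listed above, the sketch below builds an FFmpeg command that pulls a mono 16 kHz WAV track out of a video, and formats Whisper-style segments (dictionaries with "start", "end", and "text" keys) as timestamped lines. The function names, the "[HH:MM:SS - HH:MM:SS]" layout, and the choice of WAV settings are assumptions made for this example; the tool's actual implementation and output format may differ.

```python
from typing import Dict, List


def build_extract_command(video_path: str, wav_path: str) -> List[str]:
    """Return an FFmpeg command that extracts the audio track of a video.

    Hypothetical helper: -vn drops the video stream, -ac 1 downmixes to mono,
    -ar 16000 resamples to 16 kHz, and -y overwrites an existing output file.
    """
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-vn",
        "-ac", "1",
        "-ar", "16000",
        wav_path,
    ]


def format_timestamp(seconds: float) -> str:
    """Convert a time offset in seconds to HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_segments(segments: List[Dict]) -> str:
    """Render segments as one '[start - end] text' line each.

    The line layout is an assumption for this sketch, not the tool's
    documented format.
    """
    lines = []
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {seg['text'].strip()}")
    return "\n".join(lines)
```

The command list could be passed to subprocess.run, and the segments would typically come from the "segments" entry of the dictionary that Whisper's transcribe call returns.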

🔧 Usage Examples

Run the script with various arguments:

# Basic usage with a video file
python main.py --input video.mp4 --output transcript.txt

# Specify a different language (default is English)
python main.py --input audio.mp3 --language spanish

# Complete example with all options
python main.py --input interview.mp4 --output transcript.txt --language english

Available arguments:

  • --input: Path to input audio/video file (required)
  • --output: Path to output text file (optional, defaults to input filename + .txt)
  • --language: Input language (optional, defaults to English)
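
The arguments above could be parsed along the following lines. This is a minimal sketch using argparse, not the script's actual code: build_parser and resolve_output are hypothetical names, and the default output path is assumed to replace the input file's extension with .txt.

```python
import argparse
from pathlib import Path
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three documented arguments."""
    parser = argparse.ArgumentParser(
        description="Transcribe an audio or video file to text."
    )
    parser.add_argument("--input", required=True,
                        help="Path to the input audio/video file")
    parser.add_argument("--output", default=None,
                        help="Path to the output text file (optional)")
    parser.add_argument("--language", default="english",
                        help="Language spoken in the input (optional)")
    return parser


def resolve_output(input_path: str, output_path: Optional[str] = None) -> Path:
    """Return the explicit output path, or derive one from the input path.

    Assumption for this sketch: the default swaps the input's extension
    for .txt, so video.mp4 becomes video.txt.
    """
    if output_path:
        return Path(output_path)
    return Path(input_path).with_suffix(".txt")
```

For example, parsing ["--input", "video.mp4"] with this parser leaves the language at "english" and resolves the output to video.txt next to the input file.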

🔧 Requirements

  • Python 3.7+
  • FFmpeg
  • CUDA-compatible GPU (optional, for faster processing)
  • Required Python packages (see requirements.txt)
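
Before running the tool, the requirements above can be checked with a short script like the one below. It is only a sketch: check_environment is a hypothetical helper, and it assumes that GPU support is provided through PyTorch, which it imports only if the package is installed.

```python
import importlib.util
import shutil
import sys


def check_environment() -> dict:
    """Report whether the listed requirements appear to be satisfied."""
    report = {
        # The README asks for Python 3.7 or newer.
        "python_ok": sys.version_info >= (3, 7),
        # Path to the ffmpeg executable, or None if it is not on PATH.
        "ffmpeg_path": shutil.which("ffmpeg"),
        # Stays False unless PyTorch is installed and reports a CUDA device.
        "cuda_available": False,
    }
    if importlib.util.find_spec("torch") is not None:
        try:
            import torch
            report["cuda_available"] = bool(torch.cuda.is_available())
        except Exception:
            # A broken PyTorch install is treated the same as no GPU support.
            pass
    return report


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```

A missing GPU is not an error here, since the GPU is optional; a missing ffmpeg_path is the result worth acting on.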

🚀 Installation

  1. Clone the repository:
git clone https://github.com/CarlosUlisesOchoa/Whisper-AI-write-video-or-audio-transcription-to-txt.git
  2. Install the required Python packages:
pip install -r requirements.txt
  3. Run the script:
py main.py --input "D:\files\myfile.mp4" --language en

By default, the transcription is saved in the same folder as the input file, with the same name but a .txt extension.

🔑 License
