A robust Python toolkit for converting video/audio content into accurate, multilingual subtitles using WhisperX for transcription and Google's Gemini API for proofreading and translation.
- 🎯 High-quality transcription using WhisperX with word-level alignment
- 🔍 AI-powered proofreading with Gemini to fix transcription errors
- 🌍 Multilingual translation support
- 📥 Support for HLS streams, direct file URLs, and local files
- 🎵 Audio fingerprinting using Shazam (macOS only)
- 📊 Progress tracking with rich terminal output
- Python 3.10 or higher
- FFmpeg installed on your system
pip install sub-tools
export GEMINI_API_KEY={your_api_key}
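
Before the first run, it can help to confirm that the requirements listed above are in place. The following is a minimal sketch, not part of sub-tools itself: `check_prerequisites` is a hypothetical helper, and it assumes that FFmpeg is discoverable on `PATH` under the name `ffmpeg` and that the API key is read from the `GEMINI_API_KEY` environment variable.

```python
import os
import shutil
import sys


def check_prerequisites(env=None, which=shutil.which, version=None):
    """Return a list of problems found; an empty list means nothing is missing.

    Hypothetical helper for illustration only. The parameters exist so the
    checks can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else version

    problems = []
    if version < (3, 10):
        problems.append("Python 3.10 or higher is required")
    # Assumption: "FFmpeg installed" means an `ffmpeg` executable on PATH.
    if which("ffmpeg") is None:
        problems.append("FFmpeg was not found on PATH")
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(f"missing: {problem}")
```

Running the script prints one line per missing requirement and nothing when every check passes.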
# Full pipeline: download video, extract audio, transcribe, proofread, and translate
sub-tools -i https://example.com/video.mp4 --languages en es fr
# Using HLS stream URL
sub-tools -i https://example.com/hls/video.m3u8 --languages en es fr
# Using local audio file (skip video/audio tasks)
sub-tools --tasks transcribe translate --audio-file audio.mp3 --languages en es fr
# Only transcribe without translation
sub-tools --tasks transcribe --audio-file audio.mp3 --languages en
# Specify custom tasks (available: video, audio, signature, transcribe, translate)
sub-tools -i https://example.com/video.mp4 --tasks video audio transcribe translate --languages en es
# Specify a custom Gemini model (default: gemini-3-pro-preview)
sub-tools -i https://example.com/video.mp4 --languages en --model gemini-2.5-pro
# Specify output directory (default: output)
sub-tools -i https://example.com/video.mp4 --languages en --output my-subtitles

The tool operates as a multi-stage pipeline controlled by the --tasks parameter:
- video: Downloads media from URL (HLS or direct) → video.mp4
- audio: Extracts audio track → audio.mp3
- signature: Generates Shazam signature for fingerprinting (macOS only)
- transcribe: Transcription using WhisperX → transcript.srt
- translate: Proofreads and translates to target languages using Gemini → {language}.srt
By default, all tasks run. You can customize which tasks to run with --tasks.
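
To see which files a given combination of tasks and languages should leave behind, the mapping above can be sketched as a small lookup. This is an illustrative sketch only: `TASK_OUTPUTS` and `expected_outputs` are hypothetical names, not part of the sub-tools API, and the output file of the signature task is left unset because this README does not name it.

```python
# Artifact names are taken from the task list above; the table itself is a
# hypothetical helper, not something sub-tools exposes.
TASK_OUTPUTS = {
    "video": "video.mp4",
    "audio": "audio.mp3",
    "signature": None,  # output file name is not stated in this README
    "transcribe": "transcript.srt",
    "translate": "{language}.srt",
}


def expected_outputs(tasks=None, languages=()):
    """List the file names the selected tasks are expected to produce.

    With tasks=None every task is assumed to run, mirroring the default.
    """
    tasks = list(TASK_OUTPUTS) if tasks is None else list(tasks)
    unknown = [task for task in tasks if task not in TASK_OUTPUTS]
    if unknown:
        raise ValueError(f"unknown tasks: {', '.join(unknown)}")

    files = []
    for task in tasks:
        template = TASK_OUTPUTS[task]
        if template is None:
            continue  # no documented artifact for this task
        if "{language}" in template:
            # One subtitle file per requested language.
            files.extend(template.format(language=lang) for lang in languages)
        else:
            files.append(template)
    return files


if __name__ == "__main__":
    print(expected_outputs(["transcribe", "translate"], ["en", "es", "fr"]))
```

The names returned are bare file names; where they end up on disk depends on the output directory passed with --output.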
docker build -t sub-tools .
docker run -v $(pwd)/output:/app/output sub-tools sub-tools --gemini-api-key GEMINI_API_KEY -i URL -l en

Contributions are welcome! Please see CONTRIBUTING.md for detailed guidelines.
# Install uv package manager
# https://github.com/astral-sh/uv
# Clone and setup
git clone https://github.com/dohyeondk/sub-tools.git
cd sub-tools
uv sync

uv run pytest -m "not slow"

This project is licensed under the MIT License - see the LICENSE file for details.