Faster Whisper XXL GUI

[Screenshot: AMOLED theme]

Faster Whisper XXL GUI is a desktop interface for the Faster Whisper XXL transcription engine. It supports local files, YouTube downloads, and a wide range of output formats with configurable VAD/audio settings.

Features

  • File and YouTube transcription (audio-only or full video).
  • Automatic dependency setup (Faster Whisper XXL + FFmpeg).
  • Model/task/language controls plus VAD and audio options.
  • Model Manager with custom HF/local models and Transformers -> CT2 conversion.
  • Multiple output formats (SRT, VTT, JSON, TXT, etc.).
  • Light/Dark/AMOLED themes.
  • Persistent settings.

Quick Start (Windows)

  1. Download the latest .exe from the Releases page.
  2. Run it (no installation required).
  3. On first launch, accept the prompt to download and set up Faster Whisper XXL + FFmpeg.

Manual yt-dlp Updates (No Python)

  1. Download the latest yt-dlp.exe from the official yt-dlp releases page.
  2. Place it in a stable folder (or anywhere on your PATH).
  3. In the app, go to Settings → yt-dlp and set Source to EXE (custom or PATH), then browse to the file.
  4. To update later, replace that yt-dlp.exe with a newer one.
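
To confirm which yt-dlp executable will be picked up after a manual update, a small check like the following can help. This is a minimal sketch, not part of the app: the helper names `find_yt_dlp` and `yt_dlp_version` are hypothetical, and it relies only on yt-dlp's `--version` flag.

```python
import shutil
import subprocess
from typing import Optional


def find_yt_dlp(custom_path: Optional[str] = None) -> Optional[str]:
    """Return the path to a yt-dlp executable, or None if none is found.

    With no argument this searches PATH for "yt-dlp"; with an explicit
    path it checks that the file exists and is executable.
    """
    return shutil.which(custom_path or "yt-dlp")


def yt_dlp_version(exe_path: str) -> str:
    """Run `<exe> --version` and return the version string it prints."""
    result = subprocess.run(
        [exe_path, "--version"],
        capture_output=True,
        text=True,
        check=True,
        timeout=30,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    exe = find_yt_dlp()
    if exe is None:
        print("yt-dlp was not found on PATH")
    else:
        print(f"{exe}: version {yt_dlp_version(exe)}")
```

Running it before and after replacing yt-dlp.exe shows whether the new file is the one actually being resolved.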

Run From Source

  1. Install Python 3.8+ and pip.
  2. Clone and install:
    git clone https://github.com/cbro33/Faster-Whisper-XXL-GUI.git
    cd Faster-Whisper-XXL-GUI
    pip install -r requirements.txt
    Optional, to convert Transformers models to CT2 when running from source:
    pip install ctranslate2 "transformers[torch]" safetensors sentencepiece
  3. Launch:
    python src/faster-whisper-xxl-gui.py
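
Before launching from source, it can be useful to verify the interpreter version and whether the optional conversion packages are importable. The sketch below is an illustration only: `python_ok` and `missing_modules` are hypothetical helpers, and the module list is inferred from the pip command above (`transformers[torch]` is assumed to provide the `torch` module).

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)

# Import names assumed to correspond to the optional conversion packages.
CONVERSION_MODULES = [
    "ctranslate2",
    "transformers",
    "torch",
    "safetensors",
    "sentencepiece",
]


def python_ok(version_info=sys.version_info) -> bool:
    """True if the given version tuple meets the 3.8+ requirement."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.8 or newer is required")
    missing = missing_modules(CONVERSION_MODULES)
    if missing:
        print("Model conversion unavailable; missing:", ", ".join(missing))
    else:
        print("All conversion dependencies are importable")
```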

Manual Setup (If Auto Download Fails)

Auto Setup is still a work in progress and may fail on some machines. If it does, install manually: download the standalone Faster Whisper XXL archive and extract its contents into the app's bin folder.

If extraction fails on Windows, install 7-Zip and use it to extract the archive.
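
After a manual extraction, a quick scan of the bin folder can show whether the expected executables ended up there. This is a rough sketch under stated assumptions: the file names in `EXPECTED_FILES` are guesses for illustration, not names confirmed by this README, so adjust them to match the archive you extracted.

```python
from pathlib import Path

# Assumed file names; check the extracted archive for the real ones.
EXPECTED_FILES = ("faster-whisper-xxl.exe", "ffmpeg.exe")


def missing_from_bin(bin_dir, expected=EXPECTED_FILES):
    """Return the expected file names not found anywhere under bin_dir.

    The search is recursive and case-insensitive, so files extracted
    into a nested subfolder are still counted as present.
    """
    root = Path(bin_dir)
    if not root.is_dir():
        return list(expected)
    found = {p.name.lower() for p in root.rglob("*") if p.is_file()}
    return [name for name in expected if name.lower() not in found]


if __name__ == "__main__":
    missing = missing_from_bin("bin")
    print("Missing:", missing if missing else "nothing")
```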

Usage

  1. Add files in the File tab, or enter a URL in the yt-dlp tab.
  2. Adjust settings in Global Settings, Advanced, VAD, or Audio tabs.
  3. Manage models in Manage Models (download, import, enable, verify).
  4. Click Run and check the console output for progress.
  5. Outputs are saved to your chosen output directory (by default, the output folder inside the app folder).

Custom Models (HF + Local)

Open Manage Models to add custom models from Hugging Face or import local CT2 folders.

  • HF repos with model.bin (CTranslate2) download directly.
  • HF repos with model.safetensors / pytorch_model.bin will prompt to convert to CT2.
    • EXE: downloads a converter bundle (~250 MB) once.
    • Source: uses your current Python environment (install the conversion dependencies listed under Run From Source first).
  • Advanced setting: Converter Python lets you point conversion at a specific Python (useful for conda).
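
The file-based rules above can be sketched as a small classifier for a local model folder. This is an illustration, not the app's actual logic: `classify_model_dir` and `converter_command` are hypothetical helpers, sharded weight files are not handled, and the command shown uses the `ct2-transformers-converter` tool that ships with the ctranslate2 package, which may differ from what the GUI runs internally.

```python
from pathlib import Path

CT2_WEIGHTS = ("model.bin",)
TRANSFORMERS_WEIGHTS = ("model.safetensors", "pytorch_model.bin")


def classify_model_dir(model_dir) -> str:
    """Classify a folder as 'ct2', 'needs-conversion', or 'unknown'."""
    root = Path(model_dir)
    if any((root / name).is_file() for name in CT2_WEIGHTS):
        return "ct2"
    if any((root / name).is_file() for name in TRANSFORMERS_WEIGHTS):
        return "needs-conversion"
    return "unknown"


def converter_command(model, output_dir, quantization="float16"):
    """Build an argument list for ctranslate2's Transformers converter CLI."""
    return [
        "ct2-transformers-converter",
        "--model", str(model),
        "--output_dir", str(output_dir),
        "--quantization", quantization,
    ]


if __name__ == "__main__":
    kind = classify_model_dir("my-model")
    print("my-model:", kind)
    if kind == "needs-conversion":
        print(" ".join(converter_command("my-model", "my-model-ct2")))
```

The quantization default here is an arbitrary choice for the example; pick a value that suits your hardware.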

Docs

Detailed options and hardware guidance live in the Wiki.

Contributing

Issues and pull requests are welcome.

License

This project is licensed under the GNU GPL v3. See LICENSE.
