A local, feature-rich Gradio interface for the Kokoro open-weight Text-to-Speech model. This application provides a user-friendly web UI to generate high-quality speech, featuring parallel processing for long texts, advanced text cleaning, and automatic hardware acceleration.
- High-Quality TTS: Access to all Kokoro voices (US & UK accents).
- Parallel Processing: Splits long text into chunks and processes them in parallel threads for significantly faster generation.
- Hardware Acceleration: Automatically detects and uses NVIDIA GPU (CUDA) if available, falling back to CPU seamlessly.
- Text Preprocessing:
  - Reference number removal (e.g., `[1]`).
  - Whitespace normalization.
  - Initial formatting (e.g., converting "J.R.R." to "J R R").
- Tokenization Preview: View the phonemes/tokens generated by the model before audio synthesis.
- Sample Library: Quick access to sample texts (Great Gatsby, Frankenstein) or random quotes.
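
The text preprocessing steps listed above can be illustrated with a few regular expressions. This is a minimal sketch, not the code from `app.py`: the function names (`remove_reference_numbers`, `expand_initials`, `normalize_whitespace`, `clean_text`) and the exact patterns are assumptions chosen for illustration.

```python
import re


def remove_reference_numbers(text: str) -> str:
    """Drop bracketed numeric citations such as [1] or [2, 3]."""
    return re.sub(r"\s*\[\d+(?:\s*[,-]\s*\d+)*\]", "", text)


def expand_initials(text: str) -> str:
    """Turn dotted initials such as 'J.R.R.' into 'J R R'."""
    def _spell(match: re.Match) -> str:
        # Keep only the capital letters and join them with spaces.
        return " ".join(re.findall(r"[A-Z]", match.group(0)))

    # Two or more "capital letter + dot" pairs in a row count as initials.
    return re.sub(r"\b(?:[A-Z]\.){2,}", _spell, text)


def normalize_whitespace(text: str) -> str:
    """Collapse runs of spaces, tabs, and newlines into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def clean_text(text: str, strip_refs: bool = True,
               fix_initials: bool = True, fix_spaces: bool = True) -> str:
    """Apply each cleaning step that is switched on (hypothetical toggles)."""
    if strip_refs:
        text = remove_reference_numbers(text)
    if fix_initials:
        text = expand_initials(text)
    if fix_spaces:
        text = normalize_whitespace(text)
    return text


if __name__ == "__main__":
    print(clean_text("J.R.R. Tolkien wrote  The Hobbit [1].\n"))
```

Each step is a separate function so that it can be toggled independently, mirroring the per-step switches in the UI.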
Before running the application, ensure you have the following installed:
- Python 3.8+
- eSpeak-ng: Required for phonemization.
  - Windows:
    - Download and install from eSpeak-ng releases.
    - 🧩 PowerShell Commands (Run as Administrator). After installing eSpeak NG, copy and paste these commands one by one in PowerShell:

          $env:PHONEMIZER_ESPEAK_LIBRARY = "c:\Program Files\eSpeak NG\libespeak-ng.dll"
          $env:PHONEMIZER_ESPEAK_PATH = "c:\Program Files\eSpeak NG"
          setx PHONEMIZER_ESPEAK_LIBRARY "c:\Program Files\eSpeak NG\libespeak-ng.dll"
          setx PHONEMIZER_ESPEAK_PATH "c:\Program Files\eSpeak NG"

  - Linux:

        sudo apt-get install espeak-ng

  - Mac:

        brew install espeak

Note: You do not need to clone the GitHub repository. You only need the app.py script.
- Create a Folder: Manually create a new folder (e.g., named `kokoro`) on your computer.
- Download Script: Download `app.py` and place it inside this folder.
- Set up a Virtual Environment: Open your terminal or command prompt inside this folder and run:
  - Windows:

        python -m venv venv
        venv\Scripts\activate

  - Linux/Mac:

        python3 -m venv venv
        source venv/bin/activate

- Install Dependencies: Install the required packages, including the Kokoro library:

      pip install gradio torch nltk phonemizer scipy soundfile "kokoro>=0.9.4"

  The quotes around `kokoro>=0.9.4` keep the shell from treating `>` as output redirection. (Note: If you have a specific CUDA version, install the appropriate version of PyTorch from pytorch.org.)
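
After installing the dependencies, it can be useful to confirm which device PyTorch will see. The snippet below is a small sketch of the "CUDA if available, otherwise CPU" behaviour described earlier; `pick_device` is a hypothetical helper name, and the actual detection code in `app.py` may differ.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable NVIDIA GPU, else "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment; assume CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"PyTorch would run on: {pick_device()}")
```

If this prints `cpu` on a machine with an NVIDIA GPU, the installed PyTorch build is probably a CPU-only one, which is the case the CUDA note above addresses.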
- Ensure your virtual environment is activated.
- Run the application:
python app.py
- The application will automatically launch in your default web browser.
- Voice: Select from a variety of US (Heart, Bella, Michael, etc.) and UK (Emma, George, Lewis) voices.
- Speed: Adjust the speaking rate (0.5x to 2.0x).
- Text Cleaning: Toggle specific preprocessing steps in the "Text Cleaning Options" accordion.
- Parallel Processing: Located in the accordion settings.
- Slider (1-10): Controls how many text chunks are processed simultaneously.
- Tip: Higher values use more RAM/VRAM. If you encounter "Out of Memory" errors, reduce this slider.
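
To make the parallel processing setting more concrete, here is a rough sketch of chunked, threaded generation. It is an illustration under stated assumptions, not the app's implementation: `split_into_chunks` and `synthesize_parallel` are hypothetical names, the sentence splitter is a simple regex stand-in, and `synthesize` is any callable you supply that turns one chunk of text into audio.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(text: str, max_chars: int = 200) -> list:
    """Group whole sentences into chunks of at most roughly max_chars."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            # Adding this sentence would overflow the chunk; start a new one.
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks


def synthesize_parallel(text: str, synthesize, workers: int = 4,
                        max_chars: int = 200) -> list:
    """Run synthesize(chunk) on several chunks at once, keeping text order."""
    chunks = split_into_chunks(text, max_chars)
    if not chunks:
        return []
    # Clamp to the 1-10 range offered by the slider.
    workers = max(1, min(workers, 10))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        return list(pool.map(synthesize, chunks))


if __name__ == "__main__":
    # str.upper stands in for a real text-to-audio function.
    print(synthesize_parallel("One. Two. Three.", str.upper,
                              workers=2, max_chars=9))
```

Each worker holds its own chunk's intermediate data while it runs, which is why a higher worker count raises memory use.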
- NLTK Errors: The app attempts to download necessary NLTK data (`punkt`) automatically. If this fails, run `import nltk; nltk.download('punkt')` in a Python shell.
- eSpeak Errors: If you see errors related to `EspeakWrapper`, ensure `espeak-ng` is installed and added to your system's PATH. The app includes a monkey-patch to help locate it in standard environments.
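
When chasing eSpeak errors, a quick diagnostic can show what Python actually sees. The sketch below uses only the standard library and checks the two environment variables set during the Windows setup plus whether an `espeak-ng` executable is on PATH. `espeak_report` is a hypothetical helper and is not part of `app.py`.

```python
import os
import shutil


def espeak_report(env=None) -> dict:
    """Summarise the eSpeak-related settings visible to this Python process."""
    env = os.environ if env is None else env
    library = env.get("PHONEMIZER_ESPEAK_LIBRARY")
    return {
        # True when an espeak-ng executable can be found on PATH.
        "espeak_ng_on_path": shutil.which("espeak-ng") is not None,
        # Value of the library variable, or None when it is unset.
        "library_var": library,
        # True only when the variable is set and points at an existing file.
        "library_file_exists": bool(library) and os.path.isfile(library),
        # Value of the install-directory variable, or None when it is unset.
        "path_var": env.get("PHONEMIZER_ESPEAK_PATH"),
    }


if __name__ == "__main__":
    for key, value in espeak_report().items():
        print(f"{key}: {value}")
```

Run it from the same activated virtual environment and terminal you use to start the app, since variables set with `setx` only appear in terminals opened afterwards.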
├── app.py # Main application file
├── kokoro-v0_19.pth # Model weights (Downloaded automatically on first run)
├── venv/ # Virtual environment folder (created during installation)
├── en.txt # (Optional) Source for random quotes
├── gatsby5k.md # (Optional) Sample text
└── frankenstein5k.md # (Optional) Sample text
This project relies on the Kokoro TTS model. Please refer to the original model's license for usage restrictions.
- 🛠️ @yl4579 for architecting StyleTTS 2.
- 🏆 @Pendrokar for adding Kokoro as a contender in the TTS Spaces Arena.
- 📊 Thank you to everyone who contributed synthetic training data.
- ❤️ Special thanks to all compute sponsors.
- 👾 Discord server: https://discord.gg/QuGxSWBfQy
- 🪽 Kokoro is a Japanese word that translates to "heart" or "spirit". Kokoro is also a character in the Terminator franchise along with Misaki.
