Push-to-talk voice transcription using Faster-Whisper. Supports Windows, macOS, and Linux.
Start the app:
uv run run.py
In the app:
- The server auto-starts on launch.
- Choose a Model and Input Mode (Live or Full Capture).
- Press Cmd+G (macOS) to start/stop recording.
- On Windows, Win+G may be intercepted by Xbox Game Bar; if so, use the in-app Start Recording button or disable the Game Bar shortcut in Windows settings.
- Text types into your active window automatically.
If you want to install it as a global tool:
uv pip install -e .
whisper-typer
%%{init: {"flowchart": {"htmlLabels": false}} }%%
flowchart TD
A["User Hotkey"] --> B["Audio Input Stream"]
B --> C{"Input Mode"}
C -->|Live typing| D["Silence-based Chunking"]
C -->|Full Capture| E["Full Recording Capture"]
D --> F["Transcription Queue (FIFO)"]
E --> F
F --> G["Server API (Transcribe)"]
G --> H["Transcription Service"]
H --> I["Text Output"]
I --> J["Keyboard Typing to Active Window"]
- User presses the hotkey (Cmd+G on macOS, Win+G on Windows) to toggle recording.
- Audio is captured from the input stream.
- App checks the selected mode:
  - Live typing → chunks are split at silence windows and enqueued.
  - Full Capture → all chunks are captured until stop, then enqueued.
- Queue processes each chunk in order (FIFO).
- For each chunk:
  - Send audio to the server via API.
  - Server returns transcribed text.
  - Text is typed into the active window via keyboard simulation.
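
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: `ChunkRouter`, `drain`, and the `"live"` / `"full"` mode strings are hypothetical names, and `transcribe` and `type_text` stand in for the server API call and the keyboard simulation.

```python
import queue
from typing import Callable, List


class ChunkRouter:
    """Hypothetical helper: decides when a finished audio chunk is enqueued."""

    def __init__(self, mode: str, pending: "queue.Queue[bytes]") -> None:
        if mode not in ("live", "full"):
            raise ValueError(f"unknown input mode: {mode!r}")
        self.mode = mode
        self.pending = pending
        self._held: List[bytes] = []  # used only in Full Capture mode

    def on_chunk(self, chunk: bytes) -> None:
        if self.mode == "live":
            # Live typing: enqueue each chunk as soon as it is closed.
            self.pending.put(chunk)
        else:
            # Full Capture: hold chunks until recording stops.
            self._held.append(chunk)

    def on_stop(self) -> None:
        # Flush held chunks in the order they were captured.
        for chunk in self._held:
            self.pending.put(chunk)
        self._held.clear()


def drain(
    pending: "queue.Queue[bytes]",
    transcribe: Callable[[bytes], str],
    type_text: Callable[[str], None],
) -> None:
    """Process queued chunks in FIFO order: transcribe, then type the result."""
    while True:
        try:
            chunk = pending.get_nowait()
        except queue.Empty:
            break
        text = transcribe(chunk)  # placeholder for the server API call
        if text:
            type_text(text)  # placeholder for keyboard simulation
```

Because `queue.Queue` is first-in, first-out, text is typed in the same order the audio was spoken, whichever mode enqueued it.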
The client runs a global hotkey listener:
- Cmd+G (macOS) — Toggle recording.
- Win+G (Windows) may be reserved by Xbox Game Bar; if it does not work, use the in-app Start Recording button.
- When recording is stopped, the client waits for the transcription and then simulates keyboard typing to insert the text into the currently focused window.
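
The toggle behaviour can be pictured with a small sketch. It assumes nothing about how the hotkey is actually registered: `RecordingToggle` and its three callbacks are hypothetical names, and `on_hotkey` is simply whatever the listener calls when the key combination fires.

```python
from typing import Callable


class RecordingToggle:
    """Hypothetical handler: one hotkey press starts recording, the next stops it."""

    def __init__(
        self,
        start_recording: Callable[[], None],
        stop_and_transcribe: Callable[[], str],
        type_text: Callable[[str], None],
    ) -> None:
        self._start_recording = start_recording
        self._stop_and_transcribe = stop_and_transcribe
        self._type_text = type_text
        self.recording = False

    def on_hotkey(self) -> None:
        if not self.recording:
            self._start_recording()
            self.recording = True
            return
        self.recording = False
        # Blocks until the transcription is available, then types it.
        text = self._stop_and_transcribe()
        if text:
            self._type_text(text)
```

The in-app Start Recording button could call the same `on_hotkey` method, which would keep the button and the hotkey in sync when Win+G is intercepted.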
macOS Users:
- You must grant Accessibility permissions to your terminal (e.g., iTerm or Terminal.app) for the auto-typing to work.
- Grant Microphone permissions when prompted.
| State | Color | Meaning |
|---|---|---|
| Idle (server online) | 🟢 Green | Server is running, ready to transcribe |
| Server offline | ⚫ Black | Server is not reachable |
| Recording | 🔴 Red | Audio is being captured |
| Processing | 🟣 Purple | Transcribing audio |
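
As a rough sketch, the table above maps onto a small enum. The `Status` and `current_status` names are hypothetical, and the priority order (offline first, then recording, then processing) is an assumption rather than documented behaviour.

```python
from enum import Enum


class Status(Enum):
    """Indicator colors taken from the table above."""

    IDLE = "green"
    OFFLINE = "black"
    RECORDING = "red"
    PROCESSING = "purple"


def current_status(server_online: bool, recording: bool, processing: bool) -> Status:
    # Assumed priority: an unreachable server overrides everything else.
    if not server_online:
        return Status.OFFLINE
    if recording:
        return Status.RECORDING
    if processing:
        return Status.PROCESSING
    return Status.IDLE
```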
- OS: Windows, macOS, or Linux
- Python: 3.10+
- Package manager: uv (recommended)
- Docker: Optional, for isolated container deployment
The application stores data in ~/.whisper-typer/ by default. You can customize settings using a .env file in the project root:
- WHISPER_MODEL: Default model (e.g., tiny, small, medium).
- WHISPER_MODELS_DIR: Custom path for model storage. Use an absolute path (for example D:/AI/whisper-models on Windows or /absolute/path/to/models on Linux/macOS) so the client and server always use the same directory.
- HF_TOKEN: Hugging Face token for private models.
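
A minimal sketch of how these variables might be read once the .env file has been loaded into the environment. The `Settings` and `load_settings` names, the `small` fallback model, and the `models` subfolder under ~/.whisper-typer/ are assumptions made for illustration; only the three variable names come from this README.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    model: str
    models_dir: Path
    hf_token: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    data_dir = Path.home() / ".whisper-typer"
    # Fallback values below are assumptions, not the app's documented defaults.
    model = env.get("WHISPER_MODEL") or "small"
    models_dir = Path(env.get("WHISPER_MODELS_DIR") or data_dir / "models")
    if not models_dir.is_absolute():
        # A relative path could resolve differently for client and server.
        raise ValueError("WHISPER_MODELS_DIR must be an absolute path")
    hf_token = env.get("HF_TOKEN") or None
    return Settings(model=model, models_dir=models_dir, hf_token=hf_token)
```

Rejecting relative paths up front is one way to enforce the rule that the client and server must resolve the same model directory.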
Contributions are welcome! Please see CONTRIBUTING.md for details.
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
Sharad Raj Singh Maurya
AI Engineer and Open Source enthusiast.
- GitHub: @sharadcodes
- Project: Whisper Typer
Feel free to reach out for collaborations or to report any issues!