A free, open source, and extensible speech-to-text application that works completely offline.
Handy is a cross-platform desktop application built with Tauri (Rust + React/TypeScript) that provides simple, privacy-focused speech transcription. Press a shortcut, speak, and have your words appear in any text field—all without sending your voice to the cloud.
Handy was created to fill the gap for a truly open source, extensible speech-to-text tool. As stated on handy.computer:
- Free: Accessibility tooling belongs in everyone's hands, not behind a paywall
- Open Source: Together we can build further. Extend Handy for yourself and contribute to something bigger
- Private: Your voice stays on your computer. Get transcriptions without sending audio to the cloud
- Simple: One tool, one job. Transcribe what you say and put it into a text box
Handy isn't trying to be the best speech-to-text app—it's trying to be the most forkable one.
- Press a configurable keyboard shortcut to start/stop recording (or use push-to-talk mode)
- Speak your words while the shortcut is active
- Release the shortcut, and Handy transcribes your speech with the selected model (Whisper or Parakeet V3)
- Get your transcribed text pasted directly into whatever app you're using
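The two recording modes described above can be pictured as a small state machine. The following is a minimal sketch, not Handy's actual code: the names `Mode`, `KeyEvent`, `Action`, and `Recorder` are all hypothetical and chosen only for illustration.

```rust
/// How the shortcut controls recording (illustrative names, not Handy's API).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    /// Press once to start, press again to stop.
    Toggle,
    /// Record only while the shortcut is held down.
    PushToTalk,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum KeyEvent {
    Pressed,
    Released,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    StartRecording,
    StopAndTranscribe,
    Nothing,
}

/// Tracks whether we are recording and decides what each key event means.
pub struct Recorder {
    mode: Mode,
    recording: bool,
}

impl Recorder {
    pub fn new(mode: Mode) -> Self {
        Recorder { mode, recording: false }
    }

    pub fn is_recording(&self) -> bool {
        self.recording
    }

    pub fn handle(&mut self, event: KeyEvent) -> Action {
        match (self.mode, event, self.recording) {
            // In either mode, a press while idle starts recording.
            (_, KeyEvent::Pressed, false) => {
                self.recording = true;
                Action::StartRecording
            }
            // Toggle mode: a second press stops recording.
            (Mode::Toggle, KeyEvent::Pressed, true) => {
                self.recording = false;
                Action::StopAndTranscribe
            }
            // Push-to-talk mode: releasing the shortcut stops recording.
            (Mode::PushToTalk, KeyEvent::Released, true) => {
                self.recording = false;
                Action::StopAndTranscribe
            }
            // Everything else (key auto-repeat, stray releases) is ignored.
            _ => Action::Nothing,
        }
    }
}
```

In this sketch, a push-to-talk recorder ignores repeated `Pressed` events while the key is held, and a toggle recorder ignores `Released` entirely, so the same event stream yields different behavior depending on the mode.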
The process is entirely local:
- Silence is filtered using VAD (Voice Activity Detection) with Silero
- Transcription uses your choice of models:
  - Whisper models (Small/Medium/Turbo/Large) with GPU acceleration when available
  - Parakeet V3 - CPU-optimized model with excellent performance and automatic language detection
- Works on Windows, macOS, and Linux
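To give a feel for what silence filtering does before transcription, here is a minimal sketch using a plain energy gate. This is a deliberately simpler stand-in: Silero is a neural VAD model, and nothing here reflects how Handy or Silero actually work. The function names, the frame length, and the threshold are all assumptions made for illustration.

```rust
/// Root-mean-square energy of one frame of samples in [-1.0, 1.0].
pub fn rms(frame: &[f32]) -> f32 {
    if frame.is_empty() {
        return 0.0;
    }
    let sum_sq: f32 = frame.iter().map(|s| s * s).sum();
    (sum_sq / frame.len() as f32).sqrt()
}

/// Split `samples` into frames of `frame_len` and keep only the frames
/// whose RMS energy is at or above `threshold`. Quiet frames are dropped.
/// This is an energy gate, not a neural VAD like Silero.
pub fn drop_silence(samples: &[f32], frame_len: usize, threshold: f32) -> Vec<f32> {
    assert!(frame_len > 0, "frame_len must be non-zero");
    samples
        .chunks(frame_len)
        .filter(|frame| rms(frame) >= threshold)
        .flatten()
        .copied()
        .collect()
}
```

With a frame length of 4 and a threshold of 0.1, a buffer of four zeros, four samples alternating between 0.5 and -0.5, and four more zeros is reduced to just the four middle samples. An energy gate can be fooled by loud background noise, which is one reason a trained model is a better choice in practice.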
- Download the latest release from the releases page or the website
- Install the application following platform-specific instructions
- Launch Handy and grant necessary system permissions (microphone, accessibility)
- Configure your preferred keyboard shortcuts in Settings
- Start transcribing!
For detailed build instructions including platform-specific requirements, see BUILD.md.
Handy is built as a Tauri application combining:
- Frontend: React + TypeScript with Tailwind CSS for the settings UI
- Backend: Rust for system integration, audio processing, and ML inference
- Core Libraries:
  - whisper-rs: Local speech recognition with Whisper models
  - transcription-rs: CPU-optimized speech recognition with Parakeet models
  - cpal: Cross-platform audio I/O
  - vad-rs: Voice Activity Detection
  - rdev: Global keyboard shortcuts and system events
  - rubato: Audio resampling
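Resampling is needed because microphones commonly capture at rates such as 44.1 or 48 kHz, while Whisper models expect 16 kHz audio. The sketch below shows the idea with linear interpolation. It is not rubato's API or algorithm (rubato provides higher-quality resamplers), and `resample_linear` is a hypothetical helper written only for illustration.

```rust
/// Resample `input` from `from_hz` to `to_hz` by linear interpolation.
/// Illustrative only: there is no low-pass filter, so downsampling
/// real audio this way can introduce aliasing.
pub fn resample_linear(input: &[f32], from_hz: u32, to_hz: u32) -> Vec<f32> {
    assert!(from_hz > 0 && to_hz > 0, "sample rates must be non-zero");
    if input.is_empty() {
        return Vec::new();
    }
    // Number of output samples covering the same duration as the input.
    let out_len = (input.len() as u64 * to_hz as u64 / from_hz as u64) as usize;
    // Distance, in input samples, between consecutive output samples.
    let step = from_hz as f64 / to_hz as f64;
    let last = input.len() - 1;

    (0..out_len)
        .map(|i| {
            let pos = i as f64 * step;
            let idx = pos.floor() as usize;
            let frac = (pos - idx as f64) as f32;
            // Clamp at the final sample so we never read past the end.
            let a = input[idx.min(last)];
            let b = input[(idx + 1).min(last)];
            a + (b - a) * frac
        })
        .collect()
}
```

Going from 48 kHz to 16 kHz keeps one output sample for every three input samples, so six input samples become two. Going the other way, the function fills in values between neighboring samples.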
This project is actively being developed and has some known issues. We believe in transparency about the current state. Handy currently supports the following platforms:
- macOS (both Intel and Apple Silicon)
- x64 Windows
- x64 Linux
The following are recommendations for running Handy on your own machine. If your machine doesn't meet them, performance may be degraded. We are working on improving performance across a wide range of computers and hardware.
For Whisper Models:
- macOS: M series Mac, Intel Mac
- Windows: Intel, AMD, or NVIDIA GPU
- Linux: Intel, AMD, or NVIDIA GPU
  - Ubuntu 22.04, 24.04
For Parakeet V3 Model:
- CPU-only operation - runs on a wide variety of hardware
- Minimum: Intel Skylake (6th gen) or equivalent AMD processors
- Performance: ~5x real-time speed on mid-range hardware (tested on i5)
- Automatic language detection - no manual language selection required
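A speed of "~5x real-time" means a clip is processed in roughly one fifth of its own duration. The helper below makes that arithmetic concrete. It is a hypothetical function for illustration, and actual throughput will vary with hardware and audio content.

```rust
use std::time::Duration;

/// Estimate how long transcription takes for a clip of length `audio`
/// when the model runs at `speed_factor` times real-time.
pub fn estimated_processing_time(audio: Duration, speed_factor: f64) -> Duration {
    assert!(speed_factor > 0.0, "speed_factor must be positive");
    audio.div_f64(speed_factor)
}
```

At a factor of 5.0, a 60-second recording would take about 12 seconds to transcribe, and a 10-second dictation about 2 seconds.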
- Check existing issues at github.com/cjpais/Handy/issues
- Fork the repository and create a feature branch
- Test thoroughly on your target platform
- Submit a pull request with a clear description of your changes
- Join the discussion - reach out at [email protected]
The goal is to create both a useful tool and a foundation for others to build upon—a well-patterned, simple codebase that serves the community.
- Handy CLI - The original Python command-line version
- handy.computer - Project website with demos and documentation
MIT License - see LICENSE file for details.
- Whisper by OpenAI for the speech recognition model
- whisper.cpp and ggml for amazing cross-platform whisper inference/acceleration
- Silero for great lightweight VAD
- Tauri team for the excellent Rust-based app framework
- Community contributors helping make Handy better
"Your search for the right speech-to-text tool can end here—not because Handy is perfect, but because you can make it perfect for you."