Handy

A free, open source, and extensible speech-to-text application that works completely offline.

Handy is a cross-platform desktop application built with Tauri (Rust + React/TypeScript) that provides simple, privacy-focused speech transcription. Press a shortcut, speak, and have your words appear in any text field—all without sending your voice to the cloud.

Why Handy?

Handy was created to fill the gap for a truly open source, extensible speech-to-text tool. As stated on handy.computer:

- Free: Accessibility tooling belongs in everyone's hands, not behind a paywall
- Open Source: Together we can build further. Extend Handy for yourself and contribute to something bigger
- Private: Your voice stays on your computer. Get transcriptions without sending audio to the cloud
- Simple: One tool, one job. Transcribe what you say and put it into a text box

Handy isn't trying to be the best speech-to-text app—it's trying to be the most forkable one.

How It Works

1. Press a configurable keyboard shortcut to start or stop recording (or hold it down in push-to-talk mode).
2. Speak while recording is active.
3. Stop recording (or release the shortcut in push-to-talk mode), and Handy transcribes your speech with the selected model (Whisper or Parakeet V3).
4. Handy pastes the transcribed text directly into whatever app you're using.
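
The two shortcut behaviours described above (toggle and push-to-talk) can be sketched as a small state machine. This is an illustration only: the names `Mode`, `KeyEvent`, and `Recorder` are hypothetical and are not taken from Handy's source code.

```rust
/// How the shortcut controls recording (hypothetical names, not Handy's API).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode {
    /// Press once to start, press again to stop.
    Toggle,
    /// Record only while the shortcut is held down.
    PushToTalk,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum KeyEvent {
    Pressed,
    Released,
}

#[derive(Debug)]
struct Recorder {
    mode: Mode,
    recording: bool,
}

impl Recorder {
    fn new(mode: Mode) -> Self {
        Recorder { mode, recording: false }
    }

    /// Applies a shortcut event. Returns true when a recording has just
    /// finished and is ready to be handed to the transcription step.
    fn handle(&mut self, event: KeyEvent) -> bool {
        let was_recording = self.recording;
        self.recording = match (self.mode, event) {
            (Mode::Toggle, KeyEvent::Pressed) => !was_recording,
            (Mode::Toggle, KeyEvent::Released) => was_recording,
            (Mode::PushToTalk, KeyEvent::Pressed) => true,
            (Mode::PushToTalk, KeyEvent::Released) => false,
        };
        was_recording && !self.recording
    }
}

fn main() {
    let mut ptt = Recorder::new(Mode::PushToTalk);
    let events = [KeyEvent::Pressed, KeyEvent::Released];
    for event in events {
        let finished = ptt.handle(event);
        println!("{:?}: recording={}, finished={}", event, ptt.recording, finished);
    }
}
```

In toggle mode, key releases are ignored and every press flips the recording state; in push-to-talk mode, the recording state simply follows the key.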

The process is entirely local:

- Silence is filtered using VAD (Voice Activity Detection) with Silero
- Transcription uses your choice of models:
  - Whisper models (Small/Medium/Turbo/Large) with GPU acceleration when available
  - Parakeet V3 - CPU-optimized model with excellent performance and automatic language detection
- Works on Windows, macOS, and Linux
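
To illustrate the silence-filtering step, here is a minimal sketch of a frame-based filter. Note that it uses a simple RMS energy threshold as a stand-in: Silero is a neural VAD model, so Handy's actual filtering will behave differently. The function names, frame length, and threshold are all assumptions made for this example.

```rust
/// Root-mean-square energy of one frame of samples in the range [-1.0, 1.0].
fn rms(frame: &[f32]) -> f32 {
    if frame.is_empty() {
        return 0.0;
    }
    let sum_sq: f32 = frame.iter().map(|s| s * s).sum();
    (sum_sq / frame.len() as f32).sqrt()
}

/// Splits `samples` into frames of `frame_len` and keeps only the frames
/// whose energy exceeds `threshold`. This is an energy gate, NOT Silero.
fn drop_silence(samples: &[f32], frame_len: usize, threshold: f32) -> Vec<f32> {
    assert!(frame_len > 0, "frame_len must be non-zero");
    samples
        .chunks(frame_len)
        .filter(|frame| rms(frame) > threshold)
        .flatten()
        .copied()
        .collect()
}

fn main() {
    // One loud frame surrounded by two silent frames (frame length 4).
    let audio: [f32; 12] = [
        0.0, 0.0, 0.0, 0.0, // silent
        0.5, -0.5, 0.5, -0.5, // voiced
        0.0, 0.0, 0.0, 0.0, // silent
    ];
    let voiced = drop_silence(&audio, 4, 0.1);
    println!("kept {} of {} samples", voiced.len(), audio.len());
}
```

Dropping silent frames before inference means the model only sees audio that is likely to contain speech, which shortens the input it has to process.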

Quick Start

Installation

1. Download the latest release from the releases page or the website
2. Install the application following the platform-specific instructions
3. Launch Handy and grant the necessary system permissions (microphone, accessibility)
4. Configure your preferred keyboard shortcuts in Settings
5. Start transcribing!

Development Setup

For detailed build instructions including platform-specific requirements, see BUILD.md.

Architecture

Handy is built as a Tauri application combining:

- Frontend: React + TypeScript with Tailwind CSS for the settings UI
- Backend: Rust for system integration, audio processing, and ML inference
- Core Libraries:
  - whisper-rs: Local speech recognition with Whisper models
  - transcription-rs: CPU-optimized speech recognition with Parakeet models
  - cpal: Cross-platform audio I/O
  - vad-rs: Voice Activity Detection
  - rdev: Global keyboard shortcuts and system events
  - rubato: Audio resampling
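
Whisper models expect 16 kHz audio, while microphones commonly capture at 44.1 or 48 kHz, which is presumably why a resampling step sits between capture and inference. The sketch below shows the idea with naive linear interpolation. It does not use rubato's API, and rubato provides higher-quality resamplers than this; the function name `resample_linear` is hypothetical.

```rust
/// Resamples mono audio from `from_hz` to `to_hz` using linear interpolation.
/// Illustrative only: a production resampler should low-pass filter first
/// to avoid aliasing when downsampling.
fn resample_linear(input: &[f32], from_hz: u32, to_hz: u32) -> Vec<f32> {
    if input.is_empty() || from_hz == 0 || to_hz == 0 {
        return Vec::new();
    }
    let last = input.len() - 1;
    let out_len = (input.len() as u64 * to_hz as u64 / from_hz as u64) as usize;
    // Distance between output samples, measured in input samples.
    let step = from_hz as f64 / to_hz as f64;
    (0..out_len)
        .map(|i| {
            let pos = i as f64 * step;
            let idx = (pos.floor() as usize).min(last);
            let frac = (pos - idx as f64) as f32;
            let a = input[idx];
            let b = input[(idx + 1).min(last)];
            a + (b - a) * frac
        })
        .collect()
}

fn main() {
    // 480 samples at 48 kHz (10 ms) become 160 samples at 16 kHz.
    let captured: Vec<f32> = (0..480).map(|i| (i as f32 * 0.01).sin()).collect();
    let for_model = resample_linear(&captured, 48_000, 16_000);
    println!("{} samples in, {} samples out", captured.len(), for_model.len());
}
```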

Known Issues & Current Limitations

This project is under active development and has some known issues. We believe in being transparent about its current state.

Platform Support

- macOS (both Intel and Apple Silicon)
- x64 Windows
- x64 Linux

System Requirements/Recommendations

The following are recommendations for running Handy on your own machine. If your system falls short of them, performance may be degraded. We are working on improving performance across a wide range of hardware.

For Whisper Models:

- macOS: M series Mac, Intel Mac
- Windows: Intel, AMD, or NVIDIA GPU
- Linux: Intel, AMD, or NVIDIA GPU
  - Ubuntu 22.04, 24.04

For Parakeet V3 Model:

- CPU-only operation - runs on a wide variety of hardware
- Minimum: Intel Skylake (6th gen) or equivalent AMD processors
- Performance: ~5x real-time speed on mid-range hardware (tested on i5)
- Automatic language detection - no manual language selection required
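
As a rough reading of the "~5x real-time" figure: a speed factor of 5 means audio is processed about five times faster than it plays back, so a 60 second recording would take roughly 12 seconds to transcribe. The helper below just encodes that arithmetic; its name is hypothetical, and actual speed will vary with hardware.

```rust
/// Estimated processing time for a clip, given a real-time speed factor.
/// A factor of 5.0 means audio is processed 5x faster than it plays back.
fn estimated_processing_secs(audio_secs: f64, speed_factor: f64) -> f64 {
    assert!(speed_factor > 0.0, "speed factor must be positive");
    audio_secs / speed_factor
}

fn main() {
    for clip_secs in [10.0, 60.0, 300.0] {
        let estimate = estimated_processing_secs(clip_secs, 5.0);
        println!("{:.0} s of audio -> about {:.0} s to transcribe", clip_secs, estimate);
    }
}
```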

How to Contribute

1. Check existing issues at github.com/cjpais/Handy/issues
2. Fork the repository and create a feature branch
3. Test thoroughly on your target platform
4. Submit a pull request with a clear description of your changes
5. Join the discussion - reach out at [email protected]

The goal is to create both a useful tool and a foundation for others to build upon—a well-patterned, simple codebase that serves the community.

Related Projects

- Handy CLI - The original Python command-line version
- handy.computer - Project website with demos and documentation

License

MIT License - see LICENSE file for details.

Acknowledgments

- Whisper by OpenAI for the speech recognition model
- whisper.cpp and ggml for amazing cross-platform Whisper inference and acceleration
- Silero for great lightweight VAD
- Tauri team for the excellent Rust-based app framework
- Community contributors helping make Handy better

"Your search for the right speech-to-text tool can end here—not because Handy is perfect, but because you can make it perfect for you."
