STT (Speech-to-Text) System with Faster-Whisper

A GPU-accelerated speech recognition system with server/client architecture.
Press a hotkey to record audio, transcribe it on the server, and send the text to the clipboard, auto-type it, or both.

🚀 Quick Start (Both Platforms)

1. Conda Environment (Python 3.11)

conda create -n stt python=3.11
conda activate stt

2. GPU Setup (Skip for CPU-only)

conda install -c "nvidia/label/cuda-12.8.0" cudnn  # Must run BEFORE pip installs!
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
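
If GPU libraries still fail to load after this step, one thing worth checking is whether the `export` line actually took effect in the shell that launches the server. The sketch below is a small standalone helper, not part of this repository; the function name `conda_lib_on_library_path` is made up for illustration. It only inspects environment variables and does not test CUDA itself.

```python
import os


def conda_lib_on_library_path(env=None):
    """Return True if $CONDA_PREFIX/lib is listed in LD_LIBRARY_PATH.

    `env` defaults to the current process environment; passing a plain
    dict makes the check easy to try out without touching the shell.
    """
    env = os.environ if env is None else env
    prefix = env.get("CONDA_PREFIX")
    if not prefix:
        # No active Conda environment, so there is nothing to compare against.
        return False
    wanted = os.path.normpath(os.path.join(prefix, "lib"))
    entries = env.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    return any(os.path.normpath(entry) == wanted for entry in entries if entry)


if __name__ == "__main__":
    print("Conda lib dir on LD_LIBRARY_PATH:", conda_lib_on_library_path())
```

Run it inside the activated `stt` environment. A `False` result suggests the `export` was run in a different shell session or before `conda activate stt`.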

3. Install Dependencies

pip install -r requirements.txt

Linux Setup

Server Launch

python stt_server.py --start

Or, to run it on the second GPU:

CUDA_VISIBLE_DEVICES="1" python stt_server.py --start

Client Setup

python stt_client.py --configure  # Set hotkey (default: Ctrl+Alt+Space)
python stt_client.py --start

Linux-Specific Notes:

Install xdotool for auto-typing:
```
sudo apt install xdotool
```
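
To show where `xdotool` fits in, here is a minimal sketch of how a client could route transcribed text according to `output_mode`. This is an illustration under assumptions, not the code in `stt_client.py`: the function names `deliver_text` and `xdotool_type` are hypothetical, and the clipboard path assumes `pyperclip` from `requirements.txt`.

```python
import subprocess


def xdotool_type(text):
    """Type text into the focused window using the xdotool CLI."""
    # --clearmodifiers releases held hotkey modifiers (e.g. Ctrl+Alt) first;
    # "--" stops option parsing so text starting with "-" is typed literally.
    subprocess.run(["xdotool", "type", "--clearmodifiers", "--", text], check=True)


def deliver_text(text, output_mode="both", copy=None, type_text=None):
    """Send text to the clipboard, the keyboard, or both.

    `copy` and `type_text` are injectable callables; when omitted they
    default to pyperclip.copy and xdotool_type respectively.
    """
    if output_mode not in ("clipboard", "type", "both"):
        raise ValueError(f"unknown output_mode: {output_mode!r}")
    if output_mode in ("clipboard", "both"):
        if copy is None:
            import pyperclip  # imported lazily so "type" mode does not need it
            copy = pyperclip.copy
        copy(text)
    if output_mode in ("type", "both"):
        (type_text or xdotool_type)(text)
```

Without `xdotool` installed, the `type` and `both` modes in this sketch would raise `FileNotFoundError`, which is one way the "Auto-type fails" symptom can surface on Linux.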

Audio troubleshooting:

sudo apt install libportaudio2  # If sounddevice fails

macOS Setup

Server Launch (Same as Linux)

Client Setup

python stt_client_mac.py --configure  # Set hotkey (default: Cmd+Space)
python stt_client_mac.py --start

Mac-Specific Requirements:

brew install portaudio  # For audio input

Essential Permissions:

Open System Settings > Privacy & Security > Accessibility, then add and enable your Python interpreter (find its path with `which python`).

⚙️ Configuration

Edit the YAML files to customize:

~/.config/stt_server/config.yaml   # Server settings
~/.config/stt_client/config.yaml   # Client settings (same path on Linux and Mac)

Key Options:

# Server
use_gpu: true
compute_type: "float16"  # int8|float16|float32
model: "large-v3-turbo"

# Client
hotkey: "<ctrl>+<alt>+<space>"  # Linux
hotkey: "<cmd>+<space>"         # Mac
output_mode: "both"  # clipboard|type|both
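
As a rough illustration of how the server options above might be consumed, the sketch below merges user overrides into a set of defaults and rejects unsupported `compute_type` values. It is not taken from `stt_server.py`: the names `resolve_server_config` and `load_server_config` are hypothetical, the defaults are assumed to be the values shown above, and the `device` key is an added convenience for illustration.

```python
import os

# Assumption: the values listed under "Key Options" are the defaults.
DEFAULTS = {"use_gpu": True, "compute_type": "float16", "model": "large-v3-turbo"}
VALID_COMPUTE_TYPES = {"int8", "float16", "float32"}


def resolve_server_config(overrides=None):
    """Merge overrides into DEFAULTS and validate compute_type."""
    cfg = {**DEFAULTS, **(overrides or {})}
    if cfg["compute_type"] not in VALID_COMPUTE_TYPES:
        raise ValueError(
            f"compute_type must be one of {sorted(VALID_COMPUTE_TYPES)}, "
            f"got {cfg['compute_type']!r}"
        )
    # Derived key (hypothetical): maps use_gpu onto a device string.
    cfg["device"] = "cuda" if cfg["use_gpu"] else "cpu"
    return cfg


def load_server_config(path="~/.config/stt_server/config.yaml"):
    """Read the YAML file and resolve it; an empty file yields the defaults."""
    import yaml  # PyYAML, listed in requirements.txt

    with open(os.path.expanduser(path), encoding="utf-8") as fh:
        return resolve_server_config(yaml.safe_load(fh))
```

Validating up front means a typo in the YAML file fails with a readable message at startup rather than later, during model loading.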

📋 Requirements

Hardware

NVIDIA GPU recommended (CUDA 12.8+)
Microphone

`requirements.txt`

fastapi>=0.95.2
uvicorn>=0.21.1
faster-whisper>=0.7.1
torch>=2.0.0
sounddevice>=0.4.6
numpy>=1.24.3
pyyaml>=6.0
pynput>=1.7.6
pyperclip>=1.8.2
requests>=2.28.2

🔧 Troubleshooting

| Issue | Solution |
| --- | --- |
| CUDA errors | Verify that `nvcc --version` matches Conda's CUDA version |
| "Invalid handle" | Reinstall `cudnn` via Conda before installing other packages |
| Auto-type fails | On Mac: check Accessibility permissions. On Linux: install `xdotool` |
| Low GPU usage | Try `compute_type: "int8"` or a smaller model |

🏗️ Architecture

stt_server.py        # FastAPI service (GPU processing)
stt_client.py        # Linux hotkey client
stt_client_mac.py    # Mac-optimized client
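
The client/server split can be pictured as: the client encodes recorded audio and posts it to the FastAPI service, which returns the transcription. The sketch below shows one possible shape of the client side. It is not the repository's actual protocol: the URL, port, route, and form field name are all assumptions, as are the function names, and the recording step is left out.

```python
import io
import sys
import wave
from array import array


def pcm_to_wav_bytes(samples, sample_rate=16000):
    """Encode mono float samples in [-1.0, 1.0] as 16-bit PCM WAV bytes."""
    pcm = array("h", (int(max(-1.0, min(1.0, s)) * 32767) for s in samples))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV stores samples little-endian
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm.tobytes())
    return buf.getvalue()


def send_for_transcription(wav_bytes, url="http://127.0.0.1:8000/transcribe"):
    """POST the WAV data as a file upload and return the decoded JSON reply.

    The default URL and the "file" field name are hypothetical placeholders.
    """
    import requests  # listed in requirements.txt

    resp = requests.post(
        url,
        files={"file": ("audio.wav", wav_bytes, "audio/wav")},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()
```

Keeping the encoding step separate from the network call makes it possible to check the audio payload without a running server.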
