A simple realtime speech-to-text transcription from your microphone using Whisper.
This uses the idea from the whisper_real_time project: record audio in a background thread and concatenate the raw bytes over multiple recordings.
Also, my fork of Whisper, whisper-lang-selection, is used so that you can restrict language detection to a selection of languages.
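The background-recording idea described above can be sketched roughly as follows. This is a minimal illustration, not the code in transcribe.py: the names `fake_microphone` and `drain` are hypothetical, and a list of hard-coded byte chunks stands in for a real microphone so the example runs without audio hardware.

```python
import queue
import threading


def fake_microphone(chunks, out_queue):
    """Hypothetical stand-in for a microphone callback: pushes raw byte chunks onto a queue."""
    for chunk in chunks:
        out_queue.put(chunk)


def drain(out_queue):
    """Collect every chunk currently waiting in the queue into one bytes object."""
    parts = []
    while True:
        try:
            parts.append(out_queue.get_nowait())
        except queue.Empty:
            break
    return b"".join(parts)


audio_queue = queue.Queue()

# The recorder runs in a background thread so the main thread stays free to transcribe.
recorder = threading.Thread(
    target=fake_microphone,
    args=([b"\x01\x00", b"\x02\x00", b"\x03\x00"], audio_queue),
    daemon=True,
)
recorder.start()
recorder.join()  # only for this demo; a real recorder would keep running

# Concatenate the raw bytes from several recordings into one growing buffer.
buffer = b""
buffer += drain(audio_queue)
```

In a real setup the main loop would call `drain` repeatedly and pass the accumulated buffer to the model each time, so that later recordings extend the audio already transcribed.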
Create a virtual environment using the following command:
python -m venv venv
Then activate the virtual environment:
source venv/bin/activate # for Linux
./venv/Scripts/Activate # for Windows
Finally, install the required packages using:
pip install -r requirements.txt
Note that ffmpeg must also be installed on your system.
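If you are unsure whether ffmpeg is available, a quick check along these lines may help. It is only a sketch: the helper name `is_on_path` is hypothetical and is not part of this project.

```python
import shutil


def is_on_path(tool):
    """Return True if an executable named `tool` can be found on PATH."""
    return shutil.which(tool) is not None


if not is_on_path("ffmpeg"):
    print("ffmpeg was not found on PATH; install it before running the demo.")
```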
You can run the realtime demo using the following command:
python transcribe.py