TranscribeASR (TASR, pronounced "Tay-ser") is a simple command-line tool built on automatic speech recognition (ASR). It converts audio files into text quickly and accurately to help users work more efficiently.
Install via pip:
pip install tasr
Install dependencies:
pip install -r requirements.txt
Install Demucs (Optional, only required for noise reduction):
pip install demucs
The model files will be automatically downloaded to the local machine upon first use.
tasr --input <audio_file_path> [--output <output_text_file_path>] [--language <language>] [--denoise]
Parameter | Required | Description |
---|---|---|
--input | Yes | Path to the input audio file. |
--output | No | Path to the output text file. If not specified, it defaults to the same name as the input audio file with a .txt extension. |
--language | No | Specify the language for recognition (e.g., 'en', 'zh', 'yue', 'ja', 'ko'). Default is 'auto' (automatic detection). |
--denoise | No | Whether to perform noise reduction on the audio. Disabled by default. |
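The defaults in the table can be mirrored with Python's argparse. This is a minimal sketch, not TASR's actual source: the helper names `default_output_path`, `build_parser`, and `resolve_args` are hypothetical and only illustrate how the documented defaults (output named after the input with a .txt extension, language 'auto', noise reduction off) could be resolved.

```python
import argparse
from pathlib import Path


def default_output_path(input_path):
    """Return the input path with its extension replaced by .txt."""
    return Path(input_path).with_suffix(".txt")


def build_parser():
    """Build a parser mirroring the documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="tasr")
    parser.add_argument("--input", required=True, help="Path to the input audio file.")
    parser.add_argument("--output", default=None, help="Path to the output text file.")
    parser.add_argument("--language", default="auto", help="Recognition language, e.g. 'en' or 'zh'.")
    parser.add_argument("--denoise", action="store_true", help="Apply noise reduction first.")
    return parser


def resolve_args(argv):
    """Parse argv and fill in the default output path when --output is omitted."""
    args = build_parser().parse_args(argv)
    if args.output is None:
        args.output = str(default_output_path(args.input))
    return args


if __name__ == "__main__":
    args = resolve_args(["--input", "example.wav"])
    print(args.output, args.language, args.denoise)  # prints: example.txt auto False
```

Keeping the default-path logic in its own function makes the fallback easy to test separately from argument parsing.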
Perform speech recognition on the audio file example.wav and save the result as example.txt:
tasr --input example.wav
Perform speech recognition on the audio file example.wav and save the result as output.txt:
python cli.py --input example.wav --output output.txt
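To call the CLI from a Python script, the documented flags can be assembled into an argument list and passed to subprocess. This is a sketch under the assumption that `tasr` is installed and on PATH; `build_tasr_command` and `transcribe` are hypothetical helper names, and only the flags listed in the parameter table are used.

```python
import shutil
import subprocess


def build_tasr_command(input_path, output_path=None, language=None, denoise=False):
    """Assemble a tasr invocation from the documented flags only."""
    cmd = ["tasr", "--input", str(input_path)]
    if output_path is not None:
        cmd += ["--output", str(output_path)]
    if language is not None:
        cmd += ["--language", language]
    if denoise:
        cmd.append("--denoise")
    return cmd


def transcribe(input_path, **options):
    """Run tasr on one file; raises if the executable is missing or exits non-zero."""
    if shutil.which("tasr") is None:
        raise FileNotFoundError("tasr executable not found on PATH")
    return subprocess.run(build_tasr_command(input_path, **options), check=True)
```

For example, `build_tasr_command("example.wav", output_path="output.txt")` yields the same arguments as the second command above, using the installed `tasr` entry point.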
Supported audio formats depend on the capabilities of soundfile and demucs, typically including .wav, .flac, .mp3, etc.
For best results with noise reduction, it is recommended to use .wav format.
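Because format support depends on the installed libraries, it can be useful to check a file's extension before running a long transcription. The sketch below is an illustration, not part of TASR: `has_known_extension` and `soundfile_extensions` are hypothetical helpers, and turning soundfile's format names into file extensions by lowercasing them is an approximation.

```python
from pathlib import Path

# Extensions named in this README; the real set depends on the installed libraries.
README_EXTENSIONS = {".wav", ".flac", ".mp3"}


def has_known_extension(path, extensions=README_EXTENSIONS):
    """Return True if the file extension (case-insensitive) is in the given set."""
    return Path(path).suffix.lower() in extensions


def soundfile_extensions():
    """Approximate the extensions soundfile reports, or an empty set if it is unavailable."""
    try:
        import soundfile as sf
    except (ImportError, OSError):
        # soundfile is not installed, or its libsndfile backend could not be loaded.
        return set()
    # available_formats() maps format names such as 'WAV' to descriptions.
    return {"." + name.lower() for name in sf.available_formats()}


if __name__ == "__main__":
    print(has_known_extension("example.WAV"))
    print(sorted(soundfile_extensions()))
```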