
TranscribeASR (TASR): a local audio-to-text CLI. It runs on local models with no internet connection required, which makes it suitable for sensitive content.


ai-cafe/transcribe-asr


TranscribeASR

TranscribeASR (TASR, pronounced "Tay-ser") is a simple command-line tool built on automatic speech recognition (ASR). It converts audio files to text quickly and accurately.

Installation

Install via pip:

pip install tasr

Dependencies

Install dependencies:

pip install -r requirements.txt

Install Demucs (optional; only needed for noise reduction with --denoise):

pip install demucs

Model files are downloaded automatically to the local machine on first use, so the first run needs a network connection. After that, transcription runs locally.
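Because Demucs is optional, it can be useful to check whether it is importable before relying on noise reduction. The following is a minimal sketch using only the standard library; the helper name `has_module` is hypothetical and not part of TASR.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be found on the import path."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if has_module("demucs"):
        print("demucs found: noise reduction should be available")
    else:
        print("demucs not found: run `pip install demucs` to enable noise reduction")
```

The check only confirms that the package can be located; it does not verify that Demucs works with your audio files.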

Usage

Basic Usage

tasr --input <audio_file_path> [--output <output_text_file_path>] [--language <language>] [--denoise]
| Parameter | Required | Description |
| --- | --- | --- |
| --input | Yes | Path to the input audio file. |
| --output | No | Path to the output text file. If not specified, it defaults to the same name as the input audio file with a .txt extension. |
| --language | No | Language for recognition (e.g., 'en', 'zh', 'yue', 'ja', 'ko'). Default is 'auto' (automatic detection). |
| --denoise | No | Whether to perform noise reduction on the audio. Disabled by default. |
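To make the documented defaults concrete, here is a hedged sketch of how a parser with the same four options could be written with `argparse`. This is not TASR's actual source; the function names `build_parser` and `resolve_output` are illustrative assumptions, and only the flags and defaults come from the parameter list above.

```python
import argparse
from pathlib import Path
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative, not TASR's code)."""
    parser = argparse.ArgumentParser(prog="tasr")
    parser.add_argument("--input", required=True, help="Path to the input audio file.")
    parser.add_argument("--output", default=None, help="Path to the output text file.")
    parser.add_argument("--language", default="auto", help="Language code, e.g. en, zh, yue, ja, ko.")
    parser.add_argument("--denoise", action="store_true", help="Perform noise reduction.")
    return parser


def resolve_output(input_path: str, output_path: Optional[str]) -> Path:
    """Use the given output path, or fall back to the input name with a .txt extension."""
    if output_path:
        return Path(output_path)
    return Path(input_path).with_suffix(".txt")


if __name__ == "__main__":
    args = build_parser().parse_args(["--input", "example.wav"])
    print(args.language, args.denoise, resolve_output(args.input, args.output))
```

With only `--input example.wav` given, the sketch resolves the language to `auto`, leaves noise reduction off, and derives `example.txt` as the output path, matching the documented behavior.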

Examples

Basic Speech Recognition

Perform speech recognition on the audio file example.wav and save the result as example.txt:

tasr --input example.wav

Specify Output File

Perform speech recognition on the audio file example.wav and save the result as output.txt:

tasr --input example.wav --output output.txt
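The same commands can be driven from a script, for example to transcribe several files in a row. The sketch below assumes the `tasr` executable is on your PATH and that it exits with a non-zero status on failure (an assumption, not documented here). The helpers `build_command` and `transcribe` are hypothetical names; only the flags come from the usage section.

```python
import subprocess
from typing import List, Optional


def build_command(
    input_path: str,
    output_path: Optional[str] = None,
    language: Optional[str] = None,
    denoise: bool = False,
) -> List[str]:
    """Assemble the tasr argument list from the documented flags."""
    cmd = ["tasr", "--input", str(input_path)]
    if output_path is not None:
        cmd += ["--output", str(output_path)]
    if language is not None:
        cmd += ["--language", language]
    if denoise:
        cmd.append("--denoise")
    return cmd


def transcribe(input_path: str, **options) -> None:
    """Run tasr; raises CalledProcessError if the process exits with a non-zero status."""
    subprocess.run(build_command(input_path, **options), check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch works without tasr installed.
    for audio in ["example.wav", "meeting.wav"]:
        print(" ".join(build_command(audio, language="en")))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.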

Notes

Supported audio formats depend on the capabilities of soundfile and demucs, and typically include .wav, .flac, and .mp3. For best results with noise reduction, use .wav.
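As a rough pre-check before a batch run, files can be filtered by extension. The sketch below uses only the extensions named in the note above; the set `LIKELY_EXTENSIONS` and the helper `likely_supported` are hypothetical, and a matching extension does not guarantee that the installed soundfile build can decode the file.

```python
from pathlib import Path

# Extensions named in the note above; actual support depends on soundfile and demucs.
LIKELY_EXTENSIONS = {".wav", ".flac", ".mp3"}


def likely_supported(path: str) -> bool:
    """Return True if the file extension is one of the typically supported formats."""
    return Path(path).suffix.lower() in LIKELY_EXTENSIONS


if __name__ == "__main__":
    for name in ["talk.WAV", "interview.flac", "notes.txt"]:
        print(name, likely_supported(name))
```

For an authoritative list on a given machine, `soundfile.available_formats()` reports the formats the installed library can handle.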
