GitHub - old4ever/transcription-api

Go HTTP Audio Server

This is a rudimentary Go HTTP server for recording, transcribing, and translating audio.

Features

Start audio recording using pw-record.
Stop audio recording.
Transcribe audio using the OpenAI Whisper API.
Translate text (for example, a transcription) using the OpenAI ChatCompletion API.

Requirements

Linux
Go
pw-record (PipeWire audio recorder)
OpenAI API Key (for Whisper and ChatCompletion models)

Installation

Clone the repository.
Install dependencies:
```
go mod tidy
```

Set up your environment variables:

Create a .env file in the project root.

Add your OpenAI API keys to the .env file:

OPENAI_WHISPER_API_KEY=<YOUR_OPENAI_WHISPER_API_KEY>
OPENAI_TRANSCRIBE_API_KEY=<YOUR_OPENAI_CHATCOMPLETION_API_KEY>
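
The README does not show how main.go reads these values, so the following is only a sketch of a startup check, assuming the two variables have already been loaded into the process environment (by the server itself or by exporting them in the shell). The helper name `missingKeys` is hypothetical and not taken from the repository.

```go
package main

import (
	"fmt"
	"os"
)

// requiredKeys lists the variable names from the .env example above.
var requiredKeys = []string{"OPENAI_WHISPER_API_KEY", "OPENAI_TRANSCRIBE_API_KEY"}

// missingKeys returns the names that lookup reports as unset or empty.
// lookup has the same shape as os.LookupEnv so it can be faked in tests.
func missingKeys(lookup func(string) (string, bool), names []string) []string {
	var missing []string
	for _, name := range names {
		if v, ok := lookup(name); !ok || v == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	if missing := missingKeys(os.LookupEnv, requiredKeys); len(missing) > 0 {
		fmt.Println("missing environment variables:", missing)
		return
	}
	fmt.Println("both OpenAI keys are set")
}
```

Failing early like this makes a missing key visible at startup rather than on the first transcription request.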

Usage

Run the server:
```
go run main.go
```
The server will start on port 5757.
API Endpoints:
- POST /audio/start: Starts audio recording.
  - Response:
    - {"id": "<process_id>", "message": "Recording started", "file": "<output_file>"}
- POST /audio/stop?id=<process_id>: Stops audio recording.
  - Query Parameter:
    - id: The process ID returned by /audio/start.
  - Response:
    - {"message": "Recording stopped", "file": "<file_name>"}
- POST /audio/transcribe?filename=<filename>&lang=<language_code>: Transcribes audio using OpenAI Whisper.
  - Query Parameters:
    - filename: The path to the audio file to transcribe.
    - lang (Optional): The language of the audio (e.g., en for English, ru for Russian). If the language is not valid, it will be ignored.
  - Response:
    - {"message": "<transcription_text>"}
- POST /audio/translate?input=<input_text>&prompt=<prompt_text>: Translates text using OpenAI ChatCompletion.
  - Query Parameters:
    - input: The input text to translate.
    - prompt: The prompt to guide the translation model.
  - Response:
    - {"message": "<translation_text>"}
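
As a rough client-side illustration of the endpoints above, the sketch below builds the request URLs with properly escaped query parameters and decodes a /audio/start response. It makes no network calls. The base URL assumes a local server on port 5757, the sample values (`1234`, `rec.wav`) are made up, and treating `id` as a JSON string is an assumption based on the quoted example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// baseURL assumes the server runs locally on the port the README gives.
const baseURL = "http://localhost:5757"

// startResponse mirrors the JSON shown for POST /audio/start.
type startResponse struct {
	ID      string `json:"id"`
	Message string `json:"message"`
	File    string `json:"file"`
}

// endpoint joins baseURL, a path, and URL-encoded query parameters.
func endpoint(path string, params map[string]string) string {
	q := url.Values{}
	for k, v := range params {
		q.Set(k, v)
	}
	if len(q) == 0 {
		return baseURL + path
	}
	return baseURL + path + "?" + q.Encode()
}

// decodeStart parses a /audio/start response body.
func decodeStart(body string) (startResponse, error) {
	var r startResponse
	err := json.Unmarshal([]byte(body), &r)
	return r, err
}

func main() {
	// Sample body in the documented shape; the values are invented.
	started, err := decodeStart(`{"id": "1234", "message": "Recording started", "file": "rec.wav"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(endpoint("/audio/start", nil))
	fmt.Println(endpoint("/audio/stop", map[string]string{"id": started.ID}))
	fmt.Println(endpoint("/audio/transcribe", map[string]string{"filename": started.File, "lang": "en"}))
	fmt.Println(endpoint("/audio/translate", map[string]string{"input": "Привет", "prompt": "Translate to English"}))
}
```

Since every endpoint is a POST that takes its arguments in the query string, each URL could then be sent with `http.Post(u, "", nil)` from net/http. Escaping matters most for /audio/translate, where input and prompt are free text.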

CORS Configuration

The server is configured to allow requests from http://localhost:3000. You can adjust the AllowOrigins configuration in main.go if needed.
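
The README does not name the library that provides AllowOrigins, so the sketch below does not reproduce the project's configuration. It uses only the standard library to show what allowing a single origin amounts to; `withCORS` and `allowOriginFor` are hypothetical names, and the allowed methods are an assumption.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// allowedOrigin is the origin the README says the server accepts.
const allowedOrigin = "http://localhost:3000"

// withCORS adds CORS headers only when the request's Origin matches.
func withCORS(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Add("Vary", "Origin")
		if r.Header.Get("Origin") == allowedOrigin {
			w.Header().Set("Access-Control-Allow-Origin", allowedOrigin)
			w.Header().Set("Access-Control-Allow-Methods", "POST, OPTIONS")
		}
		// Answer preflight requests without reaching the real handler.
		if r.Method == http.MethodOptions {
			w.WriteHeader(http.StatusNoContent)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// allowOriginFor reports the Access-Control-Allow-Origin header
// that a request from the given origin would receive.
func allowOriginFor(origin string) string {
	h := withCORS(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	req := httptest.NewRequest(http.MethodPost, "/audio/start", nil)
	req.Header.Set("Origin", origin)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Result().Header.Get("Access-Control-Allow-Origin")
}

func main() {
	fmt.Printf("%q\n", allowOriginFor("http://localhost:3000"))
	fmt.Printf("%q\n", allowOriginFor("http://example.com"))
}
```

A browser front end served from any other origin would receive no Access-Control-Allow-Origin header and have its requests blocked, which is why the allowed origin needs to match wherever the client is hosted.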

Notes

The server uses pw-record for audio recording; ensure it is correctly configured on your system.
The removeTempFiles function is currently not working as expected and may require adjustments.
Error handling is rudimentary and logging could be improved.
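
One quick way to confirm the first note is to check that pw-record is on PATH before starting the server. This is a minimal sketch using the standard library; `findRecorder` is a hypothetical helper, not a function from main.go, and finding the binary does not prove that PipeWire itself is running.

```go
package main

import (
	"fmt"
	"os/exec"
)

// findRecorder returns the full path of the named binary,
// or an error if it cannot be found on PATH.
func findRecorder(name string) (string, error) {
	path, err := exec.LookPath(name)
	if err != nil {
		return "", fmt.Errorf("%s not found on PATH: %w", name, err)
	}
	return path, nil
}

func main() {
	path, err := findRecorder("pw-record")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("pw-record found at", path)
}
```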

License

MIT

