This is a rudimentary Go HTTP server for recording, transcribing, and translating audio.
- Start audio recording using `pw-record`.
- Stop audio recording.
- Transcribe audio using the OpenAI Whisper API.
- Translate text (for example, a transcription) using the OpenAI ChatCompletion API.

Requirements:
- Linux
- Go
- `pw-record` (PipeWire audio recorder)
- OpenAI API key (for Whisper and ChatCompletion models)

Setup:
- Clone the repository.
- Install dependencies: `go mod tidy`
- Set up your environment variables:
  - Create a `.env` file in the project root.
  - Add your OpenAI API keys to the `.env` file:
    - `OPENAI_WHISPER_API_KEY=<YOUR_OPENAI_WHISPER_API_KEY>`
    - `OPENAI_TRANSCRIBE_API_KEY=<YOUR_OPENAI_CHATCOMPLETION_API_KEY>`
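
How the server loads these keys is not shown here. As a minimal sketch, assuming the `.env` file holds plain `KEY=VALUE` lines with optional `#` comments, the values could be read with the standard library alone. `parseEnv` is a hypothetical helper for illustration, not a function from this repository.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseEnv is a hypothetical helper: it collects KEY=VALUE pairs and skips
// blank lines, comment lines starting with '#', and lines without '='.
func parseEnv(content string) map[string]string {
	vars := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, found := strings.Cut(line, "=")
		if !found {
			continue // not a KEY=VALUE line
		}
		vars[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return vars
}

func main() {
	// Sample content with placeholder values; a real program would read the file.
	sample := "# OpenAI keys\n" +
		"OPENAI_WHISPER_API_KEY=example-key-1\n" +
		"OPENAI_TRANSCRIBE_API_KEY=example-key-2\n"
	vars := parseEnv(sample)
	for _, name := range []string{"OPENAI_WHISPER_API_KEY", "OPENAI_TRANSCRIBE_API_KEY"} {
		if vars[name] == "" {
			fmt.Println("missing", name)
			continue
		}
		fmt.Println(name, "is set")
	}
}
```

Checking both variables at startup gives a clear message when a key is missing, rather than a failed API call later.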

Usage:
- Run the server with `go run main.go`. The server will start on port 5757.
- API Endpoints:
  - `POST /audio/start`: Starts audio recording.
    - Response: `{"id": "<process_id>", "message": "Recording started", "file": "<output_file>"}`
  - `POST /audio/stop?id=<process_id>`: Stops audio recording.
    - Query Parameter:
      - `id`: The process ID returned by `/audio/start`.
    - Response: `{"message": "Recording stopped", "file": "<file_name>"}`
  - `POST /audio/transcribe?filename=<filename>&lang=<language_code>`: Transcribes audio using OpenAI Whisper.
    - Query Parameters:
      - `filename`: The path to the audio file to transcribe.
      - `lang` (optional): The language of the audio (e.g., `en` for English, `ru` for Russian). If the language is not valid, it is ignored.
    - Response: `{"message": "<transcription_text>"}`
  - `POST /audio/translate?input=<input_text>&prompt=<prompt_text>`: Translates text using OpenAI ChatCompletion.
    - Query Parameters:
      - `input`: The input text to translate.
      - `prompt`: The prompt to guide the translation model.
    - Response: `{"message": "<translation_text>"}`
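
The four endpoints above can be chained from a small Go client. This is a sketch under stated assumptions: the server runs locally on port 5757, requests carry no body, success is any 2xx status, and `id` arrives as a JSON string as the documented response suggests. The names `apiResponse`, `buildURL`, `parseResponse`, and `call` are made up for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
)

// apiResponse covers the documented fields; endpoints that omit a field
// leave it empty. Decoding "id" as a string is an assumption.
type apiResponse struct {
	ID      string `json:"id"`
	Message string `json:"message"`
	File    string `json:"file"`
}

// buildURL joins the base URL, an endpoint path, and escaped query parameters.
func buildURL(base, path string, params map[string]string) string {
	query := url.Values{}
	for k, v := range params {
		query.Set(k, v)
	}
	if len(query) == 0 {
		return base + path
	}
	return base + path + "?" + query.Encode()
}

// parseResponse decodes a JSON response body into apiResponse.
func parseResponse(data []byte) (apiResponse, error) {
	var out apiResponse
	err := json.Unmarshal(data, &out)
	return out, err
}

// call sends a POST without a body and decodes the JSON reply.
func call(base, path string, params map[string]string) (apiResponse, error) {
	resp, err := http.Post(buildURL(base, path, params), "application/json", nil)
	if err != nil {
		return apiResponse{}, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return apiResponse{}, err
	}
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return apiResponse{}, fmt.Errorf("%s returned %s: %s", path, resp.Status, body)
	}
	return parseResponse(body)
}

func main() {
	base := "http://localhost:5757" // assumed local server address
	started, err := call(base, "/audio/start", nil)
	if err != nil {
		fmt.Println("start failed:", err)
		return
	}
	fmt.Println("recording to", started.File)

	// A real client would wait here while the user speaks.
	stopped, err := call(base, "/audio/stop", map[string]string{"id": started.ID})
	if err != nil {
		fmt.Println("stop failed:", err)
		return
	}
	text, err := call(base, "/audio/transcribe", map[string]string{"filename": stopped.File, "lang": "en"})
	if err != nil {
		fmt.Println("transcribe failed:", err)
		return
	}
	translated, err := call(base, "/audio/translate", map[string]string{"input": text.Message, "prompt": "Translate to Russian"})
	if err != nil {
		fmt.Println("translate failed:", err)
		return
	}
	fmt.Println(translated.Message)
}
```

Building the query string with `url.Values` keeps file names and prompts that contain spaces or `&` from breaking the request URL.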
The server is configured to allow requests from http://localhost:3000. You can adjust the AllowOrigins configuration in main.go if needed.
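
To illustrate what an allowed-origin list does, here is a minimal sketch using only `net/http`. It is not the configuration in `main.go`, which relies on its own `AllowOrigins` setting; `allowedOrigins`, `originAllowed`, and `withCORS` are hypothetical names, and the allowed methods are an assumption based on the POST-only endpoints.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// allowedOrigins mirrors the default origin mentioned above.
var allowedOrigins = []string{"http://localhost:3000"}

// originAllowed reports whether origin exactly matches an entry in allowed.
func originAllowed(origin string, allowed []string) bool {
	for _, a := range allowed {
		if a == origin {
			return true
		}
	}
	return false
}

// withCORS echoes the Origin header back only when it is on the allow list.
func withCORS(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		origin := r.Header.Get("Origin")
		if originAllowed(origin, allowedOrigins) {
			w.Header().Set("Access-Control-Allow-Origin", origin)
			w.Header().Set("Vary", "Origin")
		}
		if r.Method == http.MethodOptions {
			// Preflight request: answer without calling the real handler.
			w.Header().Set("Access-Control-Allow-Methods", "POST, OPTIONS")
			w.WriteHeader(http.StatusNoContent)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	handler := withCORS(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `{"message": "ok"}`)
	}))

	// Simulate a browser request from the allowed origin without opening a port.
	req := httptest.NewRequest(http.MethodPost, "/audio/start", nil)
	req.Header.Set("Origin", "http://localhost:3000")
	rec := httptest.NewRecorder()
	handler.ServeHTTP(rec, req)
	fmt.Println("Access-Control-Allow-Origin:", rec.Header().Get("Access-Control-Allow-Origin"))
}
```

A request from any origin not in the list gets no `Access-Control-Allow-Origin` header, so the browser blocks the page from reading the response.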
- The server uses `pw-record` for audio recording; ensure it is correctly configured on your system.
- The `removeTempFiles` function is currently not working as expected and may require adjustments.
- Error handling is rudimentary and logging could be improved.
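
One way to check the `pw-record` setup mentioned in the first note is a short standalone program that records for two seconds and then stops. This is a sketch, not the server's own code: `recordCommand` and the file name `sample.wav` are hypothetical, and stopping the recorder with an interrupt signal is an assumption about how the process should be ended.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"time"
)

// recordCommand builds (but does not start) a pw-record invocation that
// writes to the given file.
func recordCommand(file string) *exec.Cmd {
	return exec.Command("pw-record", file)
}

func main() {
	// Fail early with a clear message if pw-record is not installed.
	if _, err := exec.LookPath("pw-record"); err != nil {
		fmt.Println("pw-record not found in PATH:", err)
		return
	}

	cmd := recordCommand("sample.wav")
	if err := cmd.Start(); err != nil {
		fmt.Println("could not start recording:", err)
		return
	}
	fmt.Println("recording, pid", cmd.Process.Pid)

	time.Sleep(2 * time.Second)

	// Ask the recorder to finish; the choice of signal is an assumption.
	if err := cmd.Process.Signal(os.Interrupt); err != nil {
		fmt.Println("could not stop recording:", err)
	}
	// Wait reaps the process; it may report an error if the process
	// exits because of the signal.
	if err := cmd.Wait(); err != nil {
		fmt.Println("recorder exited:", err)
	}
}
```

If the program leaves a playable `sample.wav` behind, the recorder is working and the server's start and stop endpoints should have what they need.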