This is a rudimentary Go HTTP server for recording, transcribing, and translating audio.
- Start audio recording using
pw-record
. - Stop audio recording.
- Transcribe audio using the OpenAI Whisper API.
- Translate audio using the OpenAI ChatCompletion API.
- Linux
- Go
pw-record
(PipeWire audio recorder)- OpenAI API Key (for Whisper and ChatCompletion models)
-
Clone the repository.
-
Install dependencies:
go mod tidy
-
Set up your environment variables:
-
Create a
.env
file in the project root. -
Add your OpenAI API keys to the
.env
file:OPENAI_WHISPER_API_KEY=<YOUR_OPENAI_WHISPER_API_KEY> OPENAI_TRANSCRIBE_API_KEY=<YOUR_OPENAI_CHATCOMPLETION_API_KEY>
-
- Run the server:
The server will start on port
go run main.go
5757
. - API Endpoints:
POST /audio/start
: Starts audio recording.- Response:
-
{"id": "<process_id>", "message": "Recording started", "file": "<output_file>"}
-
POST /audio/stop?id=<process_id>
: Stops audio recording.- Query Parameter:
id
: The process ID returned by/audio/start
.
- Response:
-
{"message": "Recording stopped", "file": "<file_name>"}
-
- Query Parameter:
POST /audio/transcribe?filename=<filename>&lang=<language_code>
: Transcribes audio using OpenAI Whisper.- Query Parameters:
filename
: The path to the audio file to transcribe.lang
(Optional): The language of the audio (e.g.,en
for English,ru
for Russian). If the language is not valid, it will be ignored.
- Response:
-
{"message": "<transcription_text>"}
-
- Query Parameters:
POST /audio/translate?input=<input_text>&prompt=<prompt_text>
: Translates audio using OpenAI ChatCompletion.- Query Parameters:
input
: The input text to translate.prompt
: The prompt to guide the translation model.
- Response:
-
{"message": "<translation_text>"}
-
- Query Parameters:
The server is configured to allow requests from http://localhost:3000
. You can adjust the AllowOrigins
configuration in main.go
if needed.
- The server uses
pw-record
for audio recording, ensure it is correctly configured in your system. - The
removeTempFiles
function is currently not working as expected and may require adjustments. - Error handling is rudimentary and logging could be improved.