A chat app for end-to-end voice conversations with LLMs that stores each conversation as text and audio on the server.
- Clone the model repositories to the `models` directory:
  - Speech-to-text: https://huggingface.co/openai/whisper-medium
  - Chat: https://huggingface.co/Qwen/Qwen3-8B-FP8
  - Text-to-Speech: https://huggingface.co/kyutai/tts-1.6b-en_fr, https://huggingface.co/kyutai/tts-voices
- For the Python client, the requirements in `client/requirements.txt` must be installed. For PyAudio, the development libraries of PortAudio need to be installed (cf. `server/stt/Dockerfile`).
- Server: Assuming Docker Compose is installed, execute `./run.sh` in the `server` directory.
- Python client: `python -m client.client` (in the `voice_note` directory)
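
The model-cloning step above can be scripted. The following is a minimal sketch, not part of the project: the helper names `clone_commands` and `clone_all` are hypothetical, and it assumes that each repository should land in a subdirectory of `models` named after the last URL component (git's default), which may not match what the server containers expect. Hugging Face model repositories usually store weights via Git LFS, so Git LFS likely needs to be installed before cloning.

```python
import subprocess
from pathlib import Path

# Repository URLs exactly as listed in this README.
MODEL_REPOS = [
    "https://huggingface.co/openai/whisper-medium",
    "https://huggingface.co/Qwen/Qwen3-8B-FP8",
    "https://huggingface.co/kyutai/tts-1.6b-en_fr",
    "https://huggingface.co/kyutai/tts-voices",
]


def clone_commands(models_dir="models"):
    """Build one `git clone` argument list per model repository.

    Assumption: the target subdirectory is the last URL component,
    e.g. models/whisper-medium. Adjust if the server expects other names.
    """
    root = Path(models_dir)
    return [
        ["git", "clone", url, str(root / url.rstrip("/").rsplit("/", 1)[-1])]
        for url in MODEL_REPOS
    ]


def clone_all(models_dir="models", dry_run=True):
    """Print the clone commands, or run them when dry_run is False."""
    if not dry_run:
        Path(models_dir).mkdir(parents=True, exist_ok=True)
    for cmd in clone_commands(models_dir):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first failed clone.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: only shows what would be executed.
    clone_all(dry_run=True)
```

Run it from the directory that contains `models`; switch to `clone_all(dry_run=False)` once the printed commands look right.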