VDSA is a real-time speech processing system that streams transcribed speech over Kafka to a dynamic LangGraph agent and returns the agent's responses.
Features:

- Real-time Speech Recognition
  - Voice Activity Detection (VAD)
  - Multi-device microphone support
  - Configurable audio processing
- VDSA Agent System
  - LangGraph workflow orchestration
  - Code generation & execution
  - Docker-based sandboxing
- Distributed Architecture
  - Kafka message streaming
  - Modular microservices
  - Horizontal scalability
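To give a feel for the VAD step, here is a minimal energy-threshold sketch. The project's actual detector may work differently; `is_speech` and the `0.02` threshold are illustrative assumptions, not code from the repository:

```python
import math

def is_speech(frame, threshold=0.02):
    """Return True when the frame's RMS energy exceeds the threshold.

    frame: an iterable of float PCM samples in [-1.0, 1.0].
    """
    samples = list(frame)
    if not samples:
        return False
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return rms > threshold

loud = is_speech([0.5] * 160)       # clearly above the threshold
quiet = is_speech([0.001] * 160)    # near-silence, below the threshold
```

A real pipeline would feed consecutive microphone frames through such a gate and only forward speech segments to the recognizer.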
Prerequisites:

- Conda (Miniconda or Anaconda)
- Python 3.10
- A local Kafka instance
- NVIDIA GPU (recommended) or Apple Silicon
- Microphone
Installation:

1. Clone the repository:

   ```bash
   git clone https://github.com/Teachings/vdsa.git
   cd vdsa
   ```

2. Create a Conda environment:

   ```bash
   conda create -n vdsa python=3.10
   conda activate vdsa
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Start Kafka:

   ```bash
   docker-compose up -d
   ```
Configuration (`config.yml`):

```yaml
device: "cuda"  # or "mps" / "cpu"
voice-model: "large-v3-turbo"
kafka:
  broker: "localhost:9092"
  topic_vdsa: "vdsa.agent.requests"
  topic_response: "vdsa.agent.response"
```
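A fail-fast check at start-up can catch configuration mistakes before the services connect to Kafka. The `validate` helper below is an illustrative sketch, not code from the repository; the dict mirrors the parsed `config.yml` (in practice it would come from a YAML loader such as PyYAML's `yaml.safe_load`):

```python
# Mirrors config.yml as a plain dict (inline here to keep the example
# self-contained; the project would load it from the YAML file).
config = {
    "device": "cuda",
    "voice-model": "large-v3-turbo",
    "kafka": {
        "broker": "localhost:9092",
        "topic_vdsa": "vdsa.agent.requests",
        "topic_response": "vdsa.agent.response",
    },
}

def validate(cfg: dict) -> dict:
    """Raise ValueError on a bad device value or missing Kafka settings."""
    if cfg.get("device") not in ("cuda", "mps", "cpu"):
        raise ValueError(f"unsupported device: {cfg.get('device')!r}")
    kafka = cfg.get("kafka", {})
    for key in ("broker", "topic_vdsa", "topic_response"):
        if key not in kafka:
            raise ValueError(f"missing kafka.{key}")
    return cfg
```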
Running:

Launcher mode:

```bash
conda activate vdsa
python launcher.py
```

Manual mode:

Terminal 1 - Speech Recognition:

```bash
conda activate vdsa
python stt/main.py
```

Terminal 2 - Agent System:

```bash
conda activate vdsa
python langgraph_dynamic_agent/dsa.py
```
```
Microphone Input                 Agent Responses
       │                                ▲
       ▼                                │
┌────────────┐       Kafka      ┌─────────────┐
│ STT Engine │ ───────────────▶ │ VDSA Agent  │
└────────────┘                  └─────────────┘
```
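The Kafka hop between the two services can be sketched as a JSON envelope. The field names (`id`, `timestamp`, `transcript`, `reply`) are assumptions for illustration only; the actual message schema is defined by the project's code:

```python
import json
import time
import uuid

REQUEST_TOPIC = "vdsa.agent.requests"    # from config.yml
RESPONSE_TOPIC = "vdsa.agent.response"   # from config.yml

def make_request(transcript: str) -> bytes:
    """Serialize a transcript into a request-topic payload (hypothetical schema)."""
    return json.dumps({
        "id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "transcript": transcript,
    }).encode("utf-8")

def parse_response(payload: bytes) -> str:
    """Extract the agent's reply from a response-topic payload (hypothetical schema)."""
    return json.loads(payload.decode("utf-8"))["reply"]
```

The STT engine would publish `make_request(...)` to the request topic, and the agent would publish its reply to the response topic for downstream consumers.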
Logs:

- Launcher mode: `logs/stt.log` and `logs/vdsa.log`
- Manual mode:

  ```bash
  tail -f stt/debug.log
  tail -f langgraph_dynamic_agent/agent.log
  ```