Some onboarding questions about ADK #2371
I want to create an agent that I can communicate with using voice. How should I implement it?
Replies: 1 comment
Response from ADK Answering Agent

TLDR: To create an agent that you can communicate with using voice, you should leverage bidirectional streaming. This enables real-time, two-way communication between the user and the agent, mimicking a natural conversation.

Bidirectional streaming is a key feature of ADK that allows for more natural and interactive agent experiences. Unlike traditional request-response models, bidi-streaming allows both the user and the AI to speak, listen, and respond simultaneously, even allowing for interruptions. This creates a conversational experience much like a phone call. [1]

While the documentation doesn't provide a specific end-to-end example of a voice agent, understanding and implementing bidi-streaming is the first step. You would need to integrate a speech-to-text service to convert the user's voice into text for the agent, and a text-to-speech service to convert the agent's text response back into voice.

[1] https://github.com/google/adk-docs/blob/main/docs/streaming/dev-guide/part1.md
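To make the speech-to-text, agent, text-to-speech flow described above more concrete, here is a minimal sketch of its shape using only Python's standard `asyncio`. It does not use the ADK API: `fake_speech_to_text`, `fake_agent_reply`, `fake_text_to_speech`, `uplink`, `downlink`, and `run_voice_session` are all hypothetical names invented for illustration, and in a real project the stubs would be replaced by an actual speech service and an ADK agent. The point is only to show two tasks running concurrently, so that replies can be produced while the user is still sending audio.

```python
import asyncio

# Hypothetical stand-ins (assumptions, not ADK calls): a real system would
# call a speech-to-text service, an agent, and a text-to-speech service.
def fake_speech_to_text(audio_chunk: bytes) -> str:
    return audio_chunk.decode("utf-8")

def fake_agent_reply(text: str) -> str:
    return f"You said: {text}"

def fake_text_to_speech(text: str) -> bytes:
    return text.encode("utf-8")

async def uplink(mic_chunks, to_agent: asyncio.Queue) -> None:
    """Transcribe user audio and forward it to the agent as it arrives."""
    for chunk in mic_chunks:
        await to_agent.put(fake_speech_to_text(chunk))
        await asyncio.sleep(0)  # yield so the downlink task can run
    await to_agent.put(None)  # sentinel: the user has stopped talking

async def downlink(to_agent: asyncio.Queue, speaker: list) -> None:
    """Turn each agent reply into audio while the uplink keeps sending."""
    while (text := await to_agent.get()) is not None:
        speaker.append(fake_text_to_speech(fake_agent_reply(text)))

async def run_voice_session(mic_chunks) -> list:
    to_agent = asyncio.Queue()
    speaker = []  # collects the audio that would be played back
    # Both directions run at the same time, which is the bidi part.
    await asyncio.gather(
        uplink(mic_chunks, to_agent),
        downlink(to_agent, speaker),
    )
    return speaker

def demo() -> list:
    return asyncio.run(run_voice_session([b"hello", b"what is ADK?"]))
```

Calling `demo()` returns the synthesized replies in the order the user's chunks were sent. Using a queue between the two tasks keeps the directions decoupled, which is also where interruption handling could be added later.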