[Bug]: Overlapping Audio Causes Misinterpreted User Input in WebSocket Demo App #1817

Programmer-RD-AI · 2025-03-12T01:06:04Z

File Name

gemini/multimodal-live-api/websocket-demo-app

What happened?

When using the WebSocket Demo App in the Gemini Multimodal Live API repository, the app's audio response overlaps with the user's speech and is captured together with it. The app then treats parts of its own audio response as if they were user input, which disrupts the natural conversation flow.

Steps to Reproduce:

1. Launch the WebSocket Demo App from the generative-ai repository.
2. Start a conversation by speaking into the microphone.
3. During the conversation, observe that when the user is speaking, overlapping audio (from system responses) is captured concurrently.
4. Notice that audio segments received during the overlap are interpreted as user input, which causes the conversation to continue erroneously.
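
To make the failure mode concrete, here is a minimal sketch of one possible mitigation: a half-duplex gate that drops microphone chunks captured while response audio is still playing. This is not code from the demo app. `HalfDuplexGate`, `filter_mic_chunks`, and the `tail_seconds` grace period are hypothetical names and values chosen for illustration, and timestamps are assumed to be seconds on a shared clock.

```python
from dataclasses import dataclass, field


@dataclass
class HalfDuplexGate:
    """Hypothetical helper: blocks mic audio while response audio is playing."""

    # Assumed grace period for speaker echo to fade after playback ends.
    tail_seconds: float = 0.3
    # Time at which queued response audio finishes; -inf means nothing has played.
    _playback_ends_at: float = field(default=float("-inf"), init=False)

    def on_playback(self, now: float, duration: float) -> None:
        # Response chunks are assumed to play back to back, so a new chunk
        # starts when the previous one ends (or now, if the speaker is idle).
        start = max(now, self._playback_ends_at)
        self._playback_ends_at = start + duration

    def should_forward(self, now: float) -> bool:
        # Forward mic audio only after playback plus the grace period has passed.
        return now >= self._playback_ends_at + self.tail_seconds


def filter_mic_chunks(gate, mic_chunks):
    """Keep only the (timestamp, payload) chunks captured while the speaker is idle."""
    return [(t, data) for t, data in mic_chunks if gate.should_forward(t)]
```

The trade-off is that the user can no longer interrupt the response by speaking over it. In a browser client, requesting the microphone with the `echoCancellation` constraint set to `true` in `getUserMedia`, or simply using headphones, may be a less intrusive thing to try first.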

Please let me know if further details are needed to help diagnose this issue.

Code of Conduct

I agree to follow this project's Code of Conduct

holtskinner assigned ZackAkil Mar 13, 2025
