
Use this SDK to add realtime video, audio and data features to your ESP32 projects. By connecting to LiveKit Cloud or a self-hosted server, you can quickly build applications such as multi-modal AI, live streaming, or video calls with minimal setup.
> [!WARNING]
> This SDK is currently in Developer Preview mode and not ready for production use. There will be bugs, and APIs may change during this period.
- Supported chipsets: ESP32-S3 and ESP32-P4
- Bidirectional audio: Opus encoding, acoustic echo cancellation (AEC)
- Bidirectional video: support coming soon
- Real-time data: data packets, remote method calls (RPC)
One of the best ways to get started with LiveKit is by reviewing the examples and choosing one as a starting point for your project:
- voice_agent: Conversational AI voice agent that interacts with hardware based on user requests.
In your application's IDF component manifest, add LiveKit as a Git dependency:
```yaml
dependencies:
  livekit:
    git: https://github.com/livekit/client-sdk-esp32.git
    path: components/livekit
    version: <current version tag>
```
Please be sure to pin to a specific version tag, as subsequent 0.x.x releases may have breaking changes. In the future, this SDK will be added to the ESP component registry.
With LiveKit added as a dependency to your application, include the LiveKit header and invoke `livekit_system_init` early in your application's main function:
#include "livekit.h"
void app_main(void)
{
livekit_system_init();
// Your application code...
}
LiveKit for ESP32 puts your application in control of the media pipeline: you configure a capturer and/or renderer and provide their handles when creating a room.
Capturer:
- Required for rooms that will publish media tracks
- Created using the Espressif esp_capture component
- Captures audio over I2S and video from MIPI CSI or DVP cameras
- After configuration, you will provide the `esp_capture_handle_t` when creating a room
Renderer:
- Required for rooms that will subscribe to media tracks
- Created using the Espressif av_render component
- Plays back audio over I2S and video on LCD displays supported by esp_lcd
- After configuration, you will provide the `av_render_handle_t` when creating a room
Please refer to the examples in this repository, which support many popular development boards via the Espressif codec_board component.
Create a room object, specifying your capturer, renderer, and handlers for room events:
```c
static livekit_room_handle_t room_handle = NULL;

livekit_room_options_t room_options = {
    .publish = {
        .kind = LIVEKIT_MEDIA_TYPE_AUDIO,
        .audio_encode = {
            .codec = LIVEKIT_AUDIO_CODEC_OPUS,
            .sample_rate = 16000,
            .channel_count = 1
        },
        .capturer = my_capturer
    },
    .subscribe = {
        .kind = LIVEKIT_MEDIA_TYPE_AUDIO,
        .renderer = my_renderer
    },
    .on_state_changed = on_state_changed,
    .on_participant_info = on_participant_info
};

if (livekit_room_create(&room_handle, &room_options) != LIVEKIT_ERR_NONE) {
    ESP_LOGE(TAG, "Failed to create room object");
}
```
This example does not show all available fields in room options; please refer to the API reference for the full list.
Typically, you will want to create the room object early in your application's lifecycle and connect or disconnect as needed based on user interaction; a lifecycle sketch follows the connection steps below.
With a room handle, connect by providing a server URL and token:

```c
livekit_room_connect(room_handle, "<your server URL>", "<token>");
```
The connect method is asynchronous; use the `on_state_changed` handler provided in room options to get notified when the connection is established or fails (e.g., due to an expired token).
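As an illustration, a minimal state handler might look like the following. The callback signature and state constants shown here are assumptions for illustration only; check livekit.h for the exact definitions.

```c
// Sketch only: the parameter types and enum values below are assumed,
// not taken from livekit.h — verify against the actual header.
static void on_state_changed(livekit_connection_state_t state, void* ctx)
{
    switch (state) {
        case LIVEKIT_CONNECTION_STATE_CONNECTED:
            ESP_LOGI(TAG, "Connected to room");
            break;
        case LIVEKIT_CONNECTION_STATE_DISCONNECTED:
            ESP_LOGI(TAG, "Disconnected from room");
            break;
        default:
            break;
    }
}
```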
Once connected, media exchange will begin:
- If a capturer was provided, video and/or audio tracks will be published.
- If a renderer was provided, the first video and/or audio tracks in the room will be subscribed to.
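Tying this together, a sketch of the create-once, connect-on-demand lifecycle might look like this. The button callback is hypothetical, and only `livekit_room_connect` appears earlier in this README; the `livekit_room_disconnect` counterpart is an assumption to verify against livekit.h.

```c
// Sketch: toggle the connection on user interaction. Assumes a
// livekit_room_disconnect() counterpart exists (verify in livekit.h);
// drive the actual state from your on_state_changed handler.
static bool s_connected = false;

static void on_button_pressed(void)
{
    if (!s_connected) {
        livekit_room_connect(room_handle, "<your server URL>", "<token>");
    } else {
        livekit_room_disconnect(room_handle);
    }
}
```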
In addition to real-time audio and video, LiveKit offers several methods for exchanging real-time data between participants in a room.
Define an RPC handler:
```c
static void get_cpu_temp(const livekit_rpc_invocation_t* invocation, void* ctx)
{
    float temp = board_get_temp();
    char temp_string[16];
    snprintf(temp_string, sizeof(temp_string), "%.2f", temp);
    livekit_rpc_return_ok(temp_string);
}
```
Register the handler on the room to allow it to be invoked by remote participants:
```c
livekit_room_rpc_register(room_handle, "get_cpu_temp", get_cpu_temp);
```
> [!TIP]
> In the voice_agent example, RPC is used to allow an AI agent to interact with hardware by defining a series of methods for the agent to invoke.
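Additional methods follow the same pattern. For example, a hypothetical `get_free_heap` method could report the device's free heap; the method name and handler below are illustrative, but the SDK calls match those shown above.

```c
#include <inttypes.h>
#include <stdio.h>
#include "esp_system.h"

// Illustrative handler: reports the device's free heap in bytes.
static void get_free_heap(const livekit_rpc_invocation_t* invocation, void* ctx)
{
    char buf[16];
    snprintf(buf, sizeof(buf), "%" PRIu32, esp_get_free_heap_size());
    livekit_rpc_return_ok(buf);
}
```

Register it the same way once the room is created: `livekit_room_rpc_register(room_handle, "get_free_heap", get_free_heap);`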
Publish a user packet containing a raw data payload under a specific topic:
```c
const char* command = "G5 I0 J3 P0 Q-3 X2 Y3";

livekit_payload_t payload = {
    .bytes = (uint8_t*)command,
    .size = strlen(command)
};
livekit_data_publish_options_t options = {
    .payload = &payload,
    .topic = "gcode",
    .lossy = false,
    .destination_identities = (char*[]){ "printer-1" },
    .destination_identities_count = 1
};
livekit_room_publish_data(room_handle, &options);
```
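For high-rate telemetry, you may prefer lossy delivery. The sketch below reuses the options shown above; the helper and topic name are hypothetical, and it assumes that leaving `destination_identities` unset broadcasts to all participants, which you should verify against the API reference.

```c
// Hypothetical helper publishing a temperature reading as a lossy packet.
// Assumes omitting destination_identities broadcasts to every participant.
static void publish_temperature(float temp)
{
    char reading[16];
    snprintf(reading, sizeof(reading), "%.2f", temp);

    livekit_payload_t payload = {
        .bytes = (uint8_t*)reading,
        .size = strlen(reading)
    };
    livekit_data_publish_options_t options = {
        .payload = &payload,
        .topic = "telemetry",  // hypothetical topic
        .lossy = true          // prefer freshness over retransmission
    };
    livekit_room_publish_data(room_handle, &options);
}
```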
Please refer to the LiveKit Docs for an introduction to the platform and its features, or see the API Reference for specifics about this SDK.
Known limitations:

- In some cases, a remote participant leaving the room can lead to a disconnect.
We invite you to join the LiveKit Community Slack to get your questions answered, suggest improvements, or discuss how you can best contribute to this SDK.
| LiveKit Ecosystem | |
|---|---|
| LiveKit SDKs | Browser · iOS/macOS/visionOS · Android · Flutter · React Native · Rust · Node.js · Python · Unity · Unity (WebGL) · ESP32 |
| Server APIs | Node.js · Golang · Ruby · Java/Kotlin · Python · Rust · PHP (community) · .NET (community) |
| UI Components | React · Android Compose · SwiftUI · Flutter |
| Agents Frameworks | Python · Node.js · Playground |
| Services | LiveKit server · Egress · Ingress · SIP |
| Resources | Docs · Example apps · Cloud · Self-hosting · CLI |