
LiveKit Agents

AI-ASSISTANTS
Velocity 6.3

Framework for building real-time voice and multimodal AI agents over WebRTC.

LiveKit Agents makes async tools first-class as its voice-agent framework matures

voice-agents, real-time, async-tools, turn-detection, stt-tts, open-source
Current state
LiveKit Agents is an open-source framework for building real-time voice AI agents that ships frequent point releases on GitHub. The recent window pairs genuine capability work (first-class asynchronous tools and a v1.0 turn detector) with a steady flow of provider and model integrations (AssemblyAI, Gemini, Soniox, fishaudio) and routine fixes.
Where it's heading
The framework is strengthening the most difficult parts of voice agents: knowing when to respond (turn detection), staying responsive during long tool calls (async tools, filler phrases), and supporting an ever-wider catalog of STT/TTS/LLM providers. It's moving from breadth of integrations toward depth in conversational UX.
Prediction
Expect continued provider integrations plus more conversational-quality work — turn detection, barge-in, and async tool ergonomics — as the v1.6 line stabilizes.

Recent moves

  1. 10d ago

    1.6.4: Protoface avatar plugin + EoT fixes

    Point release 1.6.4 adds a Protoface avatar plugin and expands the hotel-receptionist example, alongside end-of-turn retry and STT pipeline fixes. Incremental, with one new integration.

  2. 12d ago

    1.6.3: STT and end-of-turn fixes

    Release 1.6.3 is a small fixes-only point release (option exposure, tool-choice handling, end-of-turn timeout restore). No new capability.

  3. 15d ago

    1.6.2: new AssemblyAI, Gemini, Soniox STT/TTS options

    Release 1.6.2 brings a wave of provider integrations — AssemblyAI universal-3-5-pro, Gemini 3.1 flash TTS, Soniox stt-rt-v5, fishaudio prosody params. Broadened STT/TTS coverage.

  4. 15d ago

    1.6.1: Turn Detector v1.0 (audio + text)

    Release 1.6.1 introduces Turn Detector v1.0, using both audio and text semantics to time agent responses, plus many voice and IPC fixes. A real conversational-quality milestone bundled into a point release.

  5. 23d ago

    1.6.0: first-class asynchronous tools

    ⚡ SPARK

    Release 1.6.0 makes asynchronous tools first-class: a long-running tool can hand control back to the LLM before it finishes, streaming progress updates and acoustic filler phrases so the agent isn't silent. A structural change to how developers write voice-agent tools.

  6. 27d ago

    1.5.19 release candidate (no notes)

    A release candidate (1.5.19.rc1) with no substantive notes beyond co-author attribution. Pre-release plumbing.

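
Illustration: async tools

The asynchronous-tool behavior described in the 1.6.0 entry (a long-running tool hands control back before it finishes, while progress updates and a filler phrase keep the agent from going silent) can be illustrated with a minimal, framework-agnostic sketch in plain asyncio. This is not the LiveKit Agents API: the names slow_lookup, run_tool_without_silence and say, the phrases, and the timings are all hypothetical and chosen only to show the shape of the pattern.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical callback type: receives text the agent should speak.
Say = Callable[[str], Awaitable[None]]


async def slow_lookup(report: Say) -> str:
    """Stand-in for a long-running tool (hypothetical, not a LiveKit API)."""
    await report("Checking availability")  # progress update
    await asyncio.sleep(0.05)              # simulated slow backend call
    await report("Almost done")            # progress update
    await asyncio.sleep(0.05)
    return "Room 12 is free"


async def run_tool_without_silence(say: Say, filler_after: float = 0.02) -> str:
    """Run the tool in the background, speak a filler if it is slow, then speak the result."""
    task = asyncio.create_task(slow_lookup(say))
    # Wait briefly; the tool keeps running in the background either way.
    done, _ = await asyncio.wait({task}, timeout=filler_after)
    if not done:
        await say("One moment while I look that up.")  # filler phrase
    result = await task
    await say(result)
    return result


async def demo() -> list[str]:
    spoken: list[str] = []

    async def say(text: str) -> None:
        # A real agent would synthesize speech; this sketch only records the text.
        spoken.append(text)

    await run_tool_without_silence(say)
    return spoken


transcript = asyncio.run(demo())
for line in transcript:
    print(line)
```

In this sketch the caller regains control as soon as the tool is scheduled, so it can decide what to say while the work continues. A real voice agent would route these phrases through its LLM and TTS pipeline instead of a list.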