ElevenLabs

INFRA · APIS
Velocity 7.5

AI voice generation platform for creating realistic text-to-speech and voice cloning.

ElevenLabs widens from TTS into a full voice-agent and music platform

voice-ai · agents · generative-music · telephony · api · sdk
Current state
ElevenLabs is shipping on two fronts. The first is new foundational capabilities: a Music v2 model with chunk-based composition, and Speech Engine, which adds real-time voice to a developer's own agent or LLM. The second is a relentless cadence of ElevenAgents API work (Exotel telephony, workflow-aware transfers, new LLM options, SIP logs, knowledge-base editing), alongside deprecations of v1 TTS/STT models and weekly SDK regenerations.
Where it's heading
The company is consolidating into a voice-AI platform: it owns the model layer (music, TTS, STT, turn detection) while making ElevenAgents and Speech Engine the programmable runtime on which others build conversational voice products. The aggressive deprecation schedule signals confidence in pushing customers to current models.
Prediction
Expect Speech Engine and Music v2 to mature with more controls, ElevenAgents to keep gaining telephony and workflow depth, and further old models to be sunset.

Recent moves

  1. 12d ago

    Introducing Music v2

    ⚡ SPARK

Music v2 lands in the API with chunk-based composition plans, which give finer control over structure and arrangement than the prompt-based v1. The release expands ElevenLabs' model layer beyond voice into structured music generation.

  2. 19d ago

    Text to Speech

    A dense API update: v1 TTS/STT models slated for July 9 removal, turn-model selection, soft-timeout filler messages, an ASR provider default switch to scribe_realtime, and zero-retention text-to-dialogue. Migration-forcing maintenance plus ElevenAgents depth.

  3. 26d ago

    ElevenAgents

    Exotel joins Twilio and SIP as a first-class telephony provider, with workflow-aware agent transfers, repeatable agent tests, and a 5 GB STT upload limit. Telephony and testing depth for ElevenAgents.

  4. 1mo ago

    Introducing Speech Engine

    ⚡ SPARK

    Speech Engine adds real-time voice to a developer's own chat agent or LLM, with ElevenLabs handling STT, turn-taking, TTS and playback while the customer's server owns the logic. A new runtime distinct from fully-hosted ElevenAgents.

  5. 1mo ago

    ElevenAgents

    Agent version metadata, text-only conversation filters, new Gemini and Qwen LLM options, and nullable prompt temperature. Routine ElevenAgents configuration expansion.

  6. 1mo ago

    ElevenAgents

    SIP signaling logs for debugging, in-place knowledge-base document editing with RAG chunk listing, SMS conversation metadata, API-key IP allowlisting, and new GPT-5.4 options. Broad ElevenAgents and workspace additions.