Speechmatics

COMMS
Velocity 5.0

Speechmatics rolls its Enhanced English model across the stack, citing an 89% relative WER improvement on spellouts.

speech-to-text, voice agents, multilingual, medical, accuracy, enterprise
Current state
Speechmatics is a speech-recognition platform that has spent the last quarter on a coordinated rollout of its Enhanced Operating Point English model, from containers through realtime and batch SaaS. The accuracy story is unusually concrete: relative WER improvements of 69% on numbers, 89% on spellouts, and 42% on mixed alphanumerics. Alongside the model work, the platform is adding voice-agent ergonomics (End of Utterance detection, prefer_current_speaker, speaker sensitivity) and broadening bilingual coverage.
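For readers who want to check figures like these against their own audio, a minimal sketch of the arithmetic follows. It assumes the standard word-level edit-distance definition of WER and treats a "relative improvement" as the WER reduction divided by the baseline WER. The function names and the sample error rates are illustrative and are not taken from Speechmatics.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline error rate that the new model removes."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical error rates chosen only to illustrate the calculation.
    baseline, enhanced = 0.10, 0.011
    gain = relative_wer_improvement(baseline, enhanced)
    print(f"relative WER improvement: {gain:.0%}")
```

Under this definition, a relative figure says nothing about the absolute error rate: the same 89% could describe a drop from 10% to 1.1% or from 1% to 0.11%, which is why measuring both numbers on representative audio matters.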
Where it's heading
Two threads are running in parallel. Vertical depth: domain-specific models starting with medical, plus a growing list of bilingual code-switching pairs (Tagalog/English, Malay/English, Tamil/English, Mandarin/English, Arabic/English). Horizontal coverage: each model lands in containers first, then realtime SaaS, then batch SaaS, then appliance, so containers function as the proving ground and SaaS as the broad rollout vehicle. The release notes also hint that voice agents are the primary use case Speechmatics is optimising for.
Prediction
Expect more vertical-domain Enhanced models beyond medical (legal and finance are the obvious next targets) and a tighter packaging of the voice-agent primitives — End of Utterance, current-speaker locking, low-latency operating points — into something explicitly marketed as a voice-agent SDK or recipe.
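As a rough illustration of what such a bundle might contain, the sketch below assembles one session-start payload from the primitives named above. The helper name `build_voice_agent_config` is hypothetical. The field names (`operating_point`, `max_delay`, `conversation_config`, `end_of_utterance_silence_trigger`, `speaker_diarization_config`, `speaker_sensitivity`) reflect one reading of the Speechmatics realtime configuration schema and should be treated as assumptions to check against the current API reference. Audio format settings are omitted.

```python
import json


def build_voice_agent_config(
    language: str = "en",
    silence_trigger_s: float = 0.5,
    speaker_sensitivity: float = 0.5,
    max_delay_s: float = 1.0,
) -> dict:
    """Assemble a hypothetical session-start message for a voice agent."""
    if not 0.0 <= speaker_sensitivity <= 1.0:
        # Assumed range; confirm against the vendor documentation.
        raise ValueError("speaker_sensitivity is assumed to lie in [0, 1]")
    if silence_trigger_s <= 0:
        raise ValueError("silence_trigger_s must be positive")
    return {
        "message": "StartRecognition",
        "transcription_config": {
            "language": language,
            # Assumed identifier for the Enhanced operating point.
            "operating_point": "enhanced",
            # Lower values trade accuracy for latency (assumed semantics).
            "max_delay": max_delay_s,
            "diarization": "speaker",
            "speaker_diarization_config": {
                # Current-speaker locking, as named in the release notes.
                "prefer_current_speaker": True,
                "speaker_sensitivity": speaker_sensitivity,
            },
            "conversation_config": {
                # Seconds of silence before an End of Utterance is signalled
                # (assumed field name and unit).
                "end_of_utterance_silence_trigger": silence_trigger_s,
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_voice_agent_config(), indent=2))
```

Keeping all three primitives in one builder mirrors the prediction: a voice-agent recipe would mostly be a set of sensible defaults over settings that already exist separately.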

Recent moves

  1. 1mo ago

    15.7.0 - Containers

    ⚡ SPARK

    The 15.7.0 container release combines the Enhanced English model rollout with HTTP Batch Transcription on persistent workers — the latter is the architectural piece that makes high-throughput voice-agent and contact-center workloads economical on the same infrastructure.

  2. 2mo ago

    2026.04.23 Realtime SaaS

    Brings the Enhanced English model to Realtime SaaS, including the low-latency ForceEndOfUtterance accuracy improvements that voice-agent builders care about most.

  3. 2mo ago

    2026.04.16 Batch SaaS

    Same Enhanced English model lands in Batch SaaS — completing the rollout sequence from containers through both SaaS surfaces.

  4. 3mo ago

    2026.03.12 - Realtime SaaS

    The first Realtime SaaS appearance of the Enhanced English model with the headline alphanumeric accuracy gains. The table of relative WER improvements (69% on numbers, 89% on spellouts) is the part of the story Speechmatics keeps repeating, because customers are invited to validate the figures themselves.

  5. 3mo ago

    2026.03.09 - Batch SaaS

    Custom-dictionary throughput improvement for short audio plus the Enhanced English model in Batch SaaS — the dictionary speedup quietly lowers the cost of shipping vocabulary-heavy verticals like medical or legal.

  6. 3mo ago

    1.2.0 Realtime Kubernetes — Redis repopulation and CVE patches

    Realtime Kubernetes 1.2.0 is platform plumbing — Redis repopulation, custom inference recipe cost generation, container version bump, CVE patches. Operationally relevant for SREs running self-hosted, but no user-visible capability change.
