Groq

INFRA · APIS
Velocity 4.2

Ultra-fast AI inference API for running LLMs using custom hardware.

Layering built-in tools and enterprise model SKUs onto the LPU inference platform after the MCP push.

ai-inference · lpu · model-hosting · built-in-tools · enterprise-models · tts
Current state
Groq has settled into a steady release cadence after the late-2025 push that brought MCP support, MCP Connectors for Google Workspace, GPT-OSS-Safeguard 20B, and prompt caching across the GPT-OSS lineup. Recent work focuses on built-in tooling (browser search for GPT-OSS models), an expanding enterprise model shelf (MiniMax M2.5, Qwen3-VL 32B), new TTS voices for the Orpheus Arabic Saudi model, and SDK stability fixes after the Q4 GA.
Where it's heading
The platform is widening, not pivoting. The strategic story (fast LPU inference with OpenAI-compatible APIs, MCP for tool use, and a curated model shelf) is set; current work is filling in the secondary surfaces: built-in tools, voice variants, and enterprise gating. Enterprise-only model availability is becoming a regular pattern, which suggests Groq is building out a tiered offering rather than staying purely self-serve.
Prediction
Expect Browser Search to extend beyond GPT-OSS to other tool-use models, more frontier model partnerships to land as enterprise-only first, and additional MCP Connectors beyond the Google Workspace set. A formal premium-tier announcement is plausible in the next quarter.

Recent moves

  1. 2mo ago

    Browser Search (GPT OSS Models)

    Browser Search lands as a built-in tool for GPT OSS models, complementing the earlier Remote MCP and code-execution primitives. Continues Groq's pattern of giving developers first-party tools without forcing them through external MCP servers, while still keeping the OpenAI-compatible Responses API as the wire format.

    View source ↗
  2. 2mo ago

    MiniMax M2.5 and Qwen3-VL 32B Instruct (Enterprise)

    Two enterprise-only models added to GroqCloud: MiniMax M2.5 as a general-purpose enterprise model and Qwen3-VL 32B Instruct as a vision-language option. Continues the pattern of frontier and specialty Asian-lab models landing on Groq behind enterprise gating before general availability.

    View source ↗
  3. 2mo ago

    Python SDK v1.2.0 and TypeScript SDK v1.1.2 stability release

    Quarterly SDK update following the Q4 v1.0.0 GA: Python SDK v1.2.0 and TypeScript SDK v1.1.2 land with stability fixes, including preserving hardcoded query parameters when they are merged with user-supplied params, and fixes to file-data parameter handling. Maintenance work, not user-visible direction.

    View source ↗
  4. 2mo ago

    New Voices for Orpheus Arabic Saudi

    Two new voices (Abdullah, now the default, and Aisha) added to the Orpheus Arabic Saudi TTS model, bringing the total to six. A small change, but it signals continued investment in non-English speech after the platform-wide migration from PlayAI to Orpheus earlier in the quarter.

    View source ↗
  5. 2mo ago

    Built-in tools documented for Compound models

    A documentation entry for built-in tools on the Compound model surface. The changelog feed carries minimal content, but the entry signals that Groq is formalizing first-party tools as a discoverable concept alongside Remote MCP. Worth watching for the next wave of tool additions.

    View source ↗
  6. 4mo ago

    Platform-wide Migration from PlayAI to Orpheus TTS

    The platform completes its migration from PlayAI to Canopy Labs' Orpheus TTS, retiring playai-tts and playai-tts-arabic per the December 2025 deprecation. Orpheus brings vocal-direction controls, faster inference, and improved audio quality. A clean execution of a planned model swap rather than a directional change.

    View source ↗
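
The SDK stability entry (item 3) mentions preserving hardcoded query parameters when they are merged with user params, but the changelog gives no detail on how. As a rough illustration of that kind of fix, here is a minimal sketch using only the Python standard library. The function name `merge_query_params`, the example URL, and the rule that user values win on a key collision are all assumptions for illustration, not the Groq SDK's actual code or behavior.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def merge_query_params(url: str, user_params: dict[str, str]) -> str:
    """Add user params to a URL while keeping params already in its query string.

    Hypothetical helper: not part of any Groq SDK.
    """
    parts = urlsplit(url)
    # Start from the params hardcoded in the URL so they are not dropped.
    merged = dict(parse_qsl(parts.query, keep_blank_values=True))
    # Assumption: a user-supplied value overrides a hardcoded one with the same key.
    merged.update(user_params)
    return urlunsplit(parts._replace(query=urlencode(merged)))


if __name__ == "__main__":
    # Placeholder URL; the hardcoded "purpose" param survives the merge.
    print(merge_query_params("https://example.invalid/v1/files?purpose=batch", {"limit": "10"}))
```

A naive implementation that rebuilds the query string from user params alone would silently drop `purpose=batch` here, which is the class of bug the release note appears to describe.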