ONNX Runtime vs Google AI
Side-by-side trajectory, velocity, and editorial themes.
ONNX Runtime is doing the unglamorous work: C++20, CUDA 12, free-threaded Python, EP plugin API.
ONNX Runtime is mid-platform-modernization. v1.25.0 raised the build floor to C++20 and CUDA 12.0, removed the ArmNN execution provider, and bumped ONNX to 1.21. v1.24.1 made the parallel move on the Python side — dropped 3.10, added 3.14 and free-threaded (PEP 703) variants, and introduced the EP Plugin API for dynamically loaded execution providers. Between those structural releases, the 1.24.x patch line has been heavily security-focused: multiple heap out-of-bounds fixes (GatherCopyData, RoiAlign, Lora Adapters, ArrayFeatureExtractor). New model and operator support continues — Qwen3.5 across LinearAttention/CausalConvState/RMSNorm/RotEMB, including WebGPU.
The runtime is repositioning for the next wave: free-threaded Python lets ML workloads finally escape the GIL on CPU paths, the EP Plugin API decouples hardware-vendor execution providers from the runtime release cycle, and the WebGPU EP keeps adding frontier-model coverage. The cost is sharp deprecation — C++20, CUDA 12, no more Python 3.10, no more x86_64 macOS — but this is the pattern of a project clearing technical debt to support the next two years of GPU-vendor diversity and edge inference.
Expect more vendor execution providers (Qualcomm QNN, Apple Neural Engine, Intel) to migrate onto the new Plugin EP API in the next two releases, and continued security-patch cadence on 1.24.x for users who can't move to 1.25 yet. WebGPU EP coverage will keep tracking new model architectures — Qwen 3.5 today, the next frontier MoE class tomorrow.
I/O 2026: Gemini 3.5 lands as Google bets the stack on agentic action.
Google's consumer AI surface is mid-I/O 2026 announcement burst. Gemini 3.5 ships as a frontier model framed around 'action' rather than chat, alongside a $100/mo AI Ultra tier, expanded AI Mode in Search, voice-native Workspace tools, and a Beam group-meeting experiment. The framing across launches is consistent: Gemini is no longer positioned as a model but as an agent embedded in Search, Workspace, and devices.
The product line is converging on two arcs: agentic action and tiered monetization. Capability releases (Gemini 3.5, agentic Workspace, AI Mode) are arriving in lockstep with a steeper subscription ladder, and Search is being openly reframed away from the keyword model. Side bets like Beam and community investments suggest Google is willing to fund longer-horizon hardware and brand work while the model layer carries the revenue narrative.
Expect Gemini 3.5 'action' capabilities to surface in Workspace agents and Android over the next two quarters, with AI Ultra positioned as the gating tier. Watch for a developer-facing agent runtime to follow the consumer rollout.
See more alternatives to ONNX Runtime →
See more alternatives to Google AI →