ONNX Runtime
Cross-platform inference and training engine for ONNX-format machine-learning models.
ONNX Runtime is doing the unglamorous work: C++20, CUDA 12, free-threaded Python, EP plugin API.
◆ Recent moves
- 22d ago
ONNX Runtime v1.25.1
v1.25.1 adds operators required for Qwen3.5 support (LinearAttention, CausalConvState, RotaryEmbedding, RMSNorm), enables Qwen3.5 on the WebGPU execution provider, and bumps Reshape and Transpose to newer opset versions. Routine new-model integration on top of the 1.25.0 platform reset.
- 29d ago
ONNX Runtime v1.25.0
⚡ SPARK: v1.25.0 raises the floor. Building from source now requires C++20, the CUDA minimum is bumped to 12.0 (11.x dropped), the ArmNN EP is removed, and ONNX is upgraded to 1.21.0. The biggest platform-compatibility break in recent releases.
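The new floors can be written down as a pre-build sanity check. This is a minimal sketch, assuming dotted version strings such as "12.4" or "11.8.0"; the helpers `parse_version` and `meets_floor` are hypothetical and are not part of ONNX Runtime's build scripts.

```python
# Hypothetical pre-build check; NOT part of ONNX Runtime's build system.
# The floors below are taken from the v1.25.0 summary above.
MIN_CUDA = (12, 0)
MIN_CXX_STANDARD = 20


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '12.4' or '11.8.0' into a tuple of ints, padded to at least (major, minor)."""
    parts = [int(part) for part in text.strip().split(".")]
    while len(parts) < 2:
        parts.append(0)  # treat '12' as '12.0'
    return tuple(parts)


def meets_floor(cuda_version: str, cxx_standard: int) -> bool:
    """Return True when both the CUDA toolkit and the C++ standard satisfy the floors."""
    return parse_version(cuda_version) >= MIN_CUDA and cxx_standard >= MIN_CXX_STANDARD
```

For example, `meets_floor("11.8", 20)` is false because CUDA 11.x is below the floor, and `meets_floor("12.0", 17)` is false because the compiler standard is too old.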
- 2mo ago
ONNX Runtime v1.24.4
v1.24.4 patch: PCI-bus GPU fallback for containerized Linux environments where nvidia-drm isn't loaded, plus Plugin EP null-deref and MetaDef-ID conflict fixes. Routine bug-fix maintenance on the 1.24 line.
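The summary does not say how the PCI-bus fallback is implemented. One plausible shape for such a fallback, sketched here purely as an assumption, is to scan sysfs for NVIDIA display-class devices when the DRM driver is unavailable. The function name `find_nvidia_gpus` is hypothetical; the vendor ID 0x10de and PCI base class 0x03 are standard PCI values.

```python
from pathlib import Path

NVIDIA_VENDOR_ID = 0x10DE  # PCI vendor ID assigned to NVIDIA
DISPLAY_BASE_CLASS = 0x03  # PCI base class for display controllers (VGA, 3D, ...)


def find_nvidia_gpus(pci_root: str = "/sys/bus/pci/devices") -> list[str]:
    """Return PCI addresses of NVIDIA display-class devices found under pci_root.

    Illustrative only: this is an assumed approach, not ONNX Runtime's code.
    """
    found: list[str] = []
    root = Path(pci_root)
    if not root.is_dir():
        return found  # no sysfs PCI tree here (non-Linux or restricted container)
    for dev in sorted(root.iterdir()):
        try:
            vendor = int((dev / "vendor").read_text().strip(), 16)
            pci_class = int((dev / "class").read_text().strip(), 16)
        except (OSError, ValueError):
            continue  # unreadable or malformed entry; skip it
        # The sysfs 'class' attribute is a 24-bit value with the base class in the top byte.
        if vendor == NVIDIA_VENDOR_ID and (pci_class >> 16) == DISPLAY_BASE_CLASS:
            found.append(dev.name)
    return found
```

Taking the root directory as a parameter keeps the scan testable against a fake device tree and lets callers point it at a bind-mounted sysfs inside a container.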
- 2mo ago
ONNX Runtime v1.24.3
v1.24.3 ships a batch of security fixes: heap out-of-bounds reads/writes in GatherCopyData, RoiAlign, LoRA adapter loading, and Resize, plus a GatherND division-by-zero fix and hardened validation of external-data paths. A maintenance-line release, but the bugs it fixes are memory-safety issues of the kind that typically earn CVEs.
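To illustrate what external-data-path validation guards against, here is a minimal sketch of a containment check. It assumes the policy is "the referenced file must resolve inside the model's directory"; the function `resolve_external_data` is hypothetical and does not reproduce ONNX Runtime's actual checks.

```python
from pathlib import Path


def resolve_external_data(model_dir: str, location: str) -> Path:
    """Resolve an external-data location, rejecting paths that leave model_dir.

    Illustrative policy only; the real checks in ONNX Runtime may differ.
    """
    base = Path(model_dir).resolve()
    candidate = Path(location)
    if candidate.is_absolute():
        raise ValueError(f"absolute external-data path rejected: {location!r}")
    resolved = (base / candidate).resolve()
    # After '..' segments and symlinks are resolved, the target must sit under base.
    if base not in resolved.parents:
        raise ValueError(
            f"external-data path must resolve inside the model directory: {location!r}"
        )
    return resolved
```

Resolving before comparing matters: a naive string-prefix check would accept `sub/../../weights.bin`, which escapes the directory once the `..` segments are collapsed.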
- 3mo ago
ONNX Runtime v1.24.2
v1.24.2 patches NuGet native-library loading on Linux/macOS and Java/Jar testing on macOS ARM64, fixes an out-of-bounds read in ArrayFeatureExtractor, and adds boundary checks to SparseTensorProto conversion. Cross-platform stability work.
- 3mo ago
ONNX Runtime v1.24.1
⚡ SPARK: v1.24.1 drops Python 3.10 wheels, adds Python 3.14 plus free-threaded (PEP 703) builds for 3.13t/3.14t on Linux, drops x86_64 macOS/iOS binaries (minimum macOS 14.0), and ships the Execution Provider Plugin API with kernel-based EPs, weight pre-packing, and EP Context model support. A major infrastructure shift on the Python side.
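Whether an environment falls under the new Python wheel matrix can be checked from the interpreter itself. The sketch below uses only the standard library. `wheel_expected` is a hypothetical helper that encodes one reading of the summary above (standard wheels for 3.11 through 3.14, free-threaded wheels for 3.13t/3.14t on Linux only); the 3.11 lower bound is an assumption inferred from the 3.10 drop, and the package index remains the authority.

```python
import sys
import sysconfig


def is_free_threaded_build() -> bool:
    """True when this interpreter was compiled with the GIL made optional (PEP 703)."""
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def gil_currently_enabled() -> bool:
    """True when the GIL is active at runtime; always True on standard builds."""
    # sys._is_gil_enabled() exists on CPython 3.13+; older versions always run with a GIL.
    probe = getattr(sys, "_is_gil_enabled", None)
    return True if probe is None else bool(probe())


def wheel_expected(version: tuple[int, int], free_threaded: bool, on_linux: bool) -> bool:
    """Hypothetical reading of the v1.24.1 wheel matrix; an assumption, not official."""
    if free_threaded:
        # Free-threaded builds are described for 3.13t and 3.14t on Linux only.
        return on_linux and version in {(3, 13), (3, 14)}
    # Assumed range for standard wheels after the 3.10 drop and the 3.14 addition.
    return (3, 11) <= version <= (3, 14)


if __name__ == "__main__":
    current = (sys.version_info.major, sys.version_info.minor)
    print("free-threaded build:", is_free_threaded_build())
    print("GIL enabled now:", gil_currently_enabled())
    print("wheel expected:", wheel_expected(
        current, is_free_threaded_build(), sys.platform.startswith("linux")))
```

Build type and runtime GIL state are checked separately because a free-threaded interpreter can re-enable the GIL at runtime, for instance when it loads an extension module that does not declare free-threading support.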