Agents crossed from assistant to actor in devtools this week, with GitHub, Vercel, and Cursor wiring them into CI and review.
The week in devtools
The dominant move this week is agents crossing from assistant to actor. GitHub put Agentic Workflows into public preview, letting coding agents run reasoning-heavy tasks (issue triage, CI failure analysis, doc updates) from inside Actions, and shipped security validation for third-party agents like Claude and Codex as a guardrail around them. Vercel extended its switch-without-rewrite thesis from models to agent harnesses with AI SDK 7's HarnessAgent, and Cursor moved code review left of the PR by putting its own Composer 2.5 model behind Bugbot. The throughline across the top of the sector: vendors are wiring agents into the existing development loop (CI, code review, the CLI) and then layering on the permissions, validation, and cost metering needed to grant those agents write access safely.
Underneath the agent story, two quieter currents ran in parallel. CI and build infrastructure kept hardening for programmatic control: Depot took its CI API and CLI to GA behind a single OpenAPI contract, and Windmill shipped a daemonless sandboxed container runtime that finally lets Docker scripts run on its cloud. And a security-and-governance undercurrent showed up plainly in Coder's coordinated cross-branch OIDC release and across identity vendors like Auth0 and WorkOS. Activity was concentrated at the top — GitHub and Vercel both posted maximum velocity — while a long tail of platforms (Semgrep, Jenkins, Render, SigNoz) ground out steady, unflashy maintenance.
Leaders
GitHub anchored the week. Agentic Workflows entering public preview makes agents first-class Actions citizens, running on the built-in GITHUB_TOKEN, and GitHub paired it with GA security validation for third-party coding agents and new org-level controls for Copilot code review. The pattern is a platform threading agents through every existing surface rather than shipping a standalone product.
Vercel kept building AI Gateway into a neutral routing layer, but the standout was AI SDK 7's HarnessAgent — one API to run Claude Code, Codex, Pi and other harnesses interchangeably. It is the clearest sign yet that Vercel's strategy is about owning the routing layer rather than betting on any single model or agent.
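The "one API, interchangeable harnesses" idea can be illustrated with a generic adapter pattern. This is a minimal Python sketch of the pattern only, not the AI SDK's actual (TypeScript) API: the `Harness` interface, `EchoHarness`, `run_task`, and the harness names are all hypothetical stand-ins invented for illustration.

```python
from typing import Protocol


class Harness(Protocol):
    """Hypothetical common interface; not the AI SDK's actual API."""

    name: str

    def run(self, task: str) -> str: ...


class EchoHarness:
    """Stand-in harness that only labels the task; a real one would call an agent."""

    def __init__(self, name: str) -> None:
        self.name = name

    def run(self, task: str) -> str:
        return f"[{self.name}] {task}"


def run_task(harness: Harness, task: str) -> str:
    # Caller code depends only on the shared interface, so swapping
    # harnesses does not require rewriting this function.
    return harness.run(task)


# Two placeholder harnesses driven through the same call site.
harnesses = [EchoHarness("harness-a"), EchoHarness("harness-b")]
results = [run_task(h, "summarize failing CI job") for h in harnesses]
```

The design point is that the call site stays fixed while the harness behind it changes, which is the switch-without-rewrite property described above.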
Cursor tied its in-house model investment directly to a flagship feature: Bugbot now runs on Composer 2.5, cutting average review time from roughly five minutes to about 90 seconds, finding around 10% more bugs, and costing about 22% less, with a new /review command to run it before pushing. Shipping its own model lets Cursor tune speed and cost in a way third-party models wouldn't allow.
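As a quick check on the headline figure, the review-time change can be expressed as a percentage. A minimal sketch, assuming the rounded values quoted above (roughly 300 seconds before, about 90 seconds after); the function name is illustrative.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the percentage decrease from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Rounded figures from the text: roughly five minutes down to about 90 seconds.
review_time_cut = percent_reduction(before=5 * 60, after=90)
print(f"Average review time fell by about {review_time_cut:.0f}%")
```

With those rounded inputs the cut works out to about 70%, so a review takes roughly 30% of the time it did before.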
Depot reached a milestone in its platform arc: the CI API and CLI are now generally available behind a single OpenAPI/Connect contract that scripts, the CLI, and agents all read from. Full dashboard parity from the terminal is what makes the rest of Depot's CI surface programmable and agent-drivable.
Cohere widened its enterprise model suite with North-Mini-Code-1.0, its first model aimed squarely at coding workloads, extending a lineup that already spans Command A chat, Rerank/Embed retrieval, and Transcribe audio. The launch fits a deliberate broadening from chat-and-retrieval toward a multi-modal portfolio, paired with steady retirement of pre-Command-A models.
Wildcards
Coder ran against the week's agentic grain with a coordinated security wave: a synchronized OIDC email-fallback hardening release pushed across every supported branch (2.29 through the 2.34 mainline) at once, with breaking auth changes and host-trust fixes. It is a reminder that for self-hosted infra vendors, a clean cross-branch security rollout is its own kind of headline release.
Windmill's standout was infrastructure isolation rather than an agent surface: a daemonless, nsjail-sandboxed container runtime that pulls arbitrary images with crane and runs them chrooted inside each job, with no Docker socket. Because it is isolated enough for untrusted multi-tenant code, Docker scripts are now permitted on Windmill Cloud for the first time — a load-bearing enabler more than a flashy feature.
Themes that compounded
- Agents as first-class actors: GitHub, Vercel, Cursor, Depot, Knock, Buildkite, and Rootly all shipped releases that let agents trigger, review, or run work rather than just suggest it.
- Guardrails for autonomous agents: GitHub's third-party-agent validation and Copilot review controls show governance plumbing arriving alongside agent capability, not after it.
- Model-and-harness routing as a product: Vercel's AI Gateway and HarnessAgent, plus Cohere's expanding model suite, treat interchangeable models and harnesses as the thing to own.
- CI hardening for programmatic control: Depot's GA API/CLI and Windmill's sandboxed runtime push build infra toward something scripts and agents can drive directly.
- A security-and-identity undercurrent: Coder's cross-branch OIDC wave, Auth0's M2M-for-agents work, and WorkOS's enterprise primitives all aimed at machine and agent access this week.
Watch this week
Watch whether the agent-governance layer keeps pace with agent capability. GitHub's third-party-agent validation reached GA and its Copilot review controls landed in the same window in which Agentic Workflows went to preview, suggesting the permission-and-approval model is now shipping in lockstep with the autonomous features it gates. On the infrastructure side, Depot's GA API/CLI and Windmill's untrusted-workload sandbox are both fresh enough that follow-ups next week, such as richer agent integrations on the API or more sandboxing options, would confirm these are platform foundations rather than one-off launches.