Codex, Devin Desktop, Cursor SDK: Agent Platforms Arrive

Three things happened between June 2 and June 8. OpenAI confirmed it is folding Codex into ChatGPT and building what insiders call a "superapp." Cognition rebranded Windsurf as Devin Desktop and shipped an open protocol for running multiple agents side by side. Cursor released an SDK update that lets developers define custom tools, nest subagents to any depth, and control which operations run without human approval.

These are three separate companies making three separate product decisions. But they are all answering the same question, and that question is no longer "which coding agent should I use?" It is "how do I run several agents at once without everything falling apart?"

That shift matters if you are a business operator evaluating AI tools. The agent you pick today is less important than the platform you build around it.

OpenAI: Codex stops being a developer tool

On June 7, the Financial Times reported that OpenAI is planning the biggest ChatGPT redesign since its 2022 launch. The core change: Codex, which started as a standalone coding agent, is becoming a top-level feature inside ChatGPT. Greg Brockman is leading the effort. The unified app will also include Atlas, OpenAI's browser. The goal is one interface for chat, code, and browsing.

Five days earlier, OpenAI had announced that Codex passed 5 million weekly active users, up sixfold since February. The buried number: 20% of those users, roughly 1 million people, are not developers. Analysts, marketers, and finance people make up the fastest-growing segment, expanding three times faster than the developer base. OpenAI launched six role-specific plugins for data analytics, creative production, sales, product design, public equity investing, and investment banking. Together those plugins connect 62 enterprise applications through 110 bundled skills.

A new feature called Sites lets Codex generate and deploy interactive web apps on OpenAI's infrastructure. Annotations let users highlight a specific region of a document, spreadsheet, or slide and prompt Codex to change only that region, rather than regenerating the entire file.

For anyone running a business, the signal is straightforward. OpenAI is betting that one platform serving both developers and non-technical staff wins over separate tools for each audience. Anthropic took the opposite approach: Claude Code for developers, Claude Cowork for everyone else. Both strategies have tradeoffs. The risk for OpenAI is that splitting attention across too many use cases dilutes developer performance. The risk for Anthropic is that two products mean twice the maintenance and slower integration between workflows.

Sources: OpenAI announcement, Financial Times via TechFastForward, ByteIota analysis, WebProNews

Devin Desktop: One workspace, multiple agents

On June 2, Cognition (the company behind the Devin agent) rebranded Windsurf as Devin Desktop. The product is no longer an AI code editor. It is a platform for running and managing multiple agents at once.

The centerpiece is the Agent Command Center, a Kanban-style view that shows every local and cloud agent session running in a single dashboard. "Spaces" group related sessions, pull requests, and files so agents share context rather than starting from scratch each time.

The more interesting technical move is the Agent Client Protocol, or ACP. It is an open standard that lets third-party agents run inside Devin Desktop with the same capabilities as Devin's native agents. At launch, Codex, Claude Agent, and OpenCode are supported. If you want Codex handling one file and Claude Agent another, running in parallel inside a shared workspace, ACP is what makes that work without custom orchestration code.
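To make the idea concrete, here is a minimal sketch of what a shared agent interface buys you. It is not ACP's actual message format or API; the names Agent, StubAgent, and run_in_parallel are hypothetical and exist only for illustration. The point is that once every agent answers to the same interface, the host can dispatch work to any of them in parallel without agent-specific glue.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


class Agent(Protocol):
    """Hypothetical common interface; NOT ACP's real message format."""

    name: str

    async def run(self, task: str, workspace: dict[str, str]) -> str: ...


@dataclass
class StubAgent:
    """Stand-in for a real agent such as Codex or Claude Agent."""

    name: str

    async def run(self, task: str, workspace: dict[str, str]) -> str:
        await asyncio.sleep(0)  # yield control, as a real network call would
        # Record the result in the shared workspace so other agents can see it.
        workspace[task] = f"edited by {self.name}"
        return workspace[task]


async def run_in_parallel(assignments: dict[str, Agent]) -> dict[str, str]:
    """Run one agent per file concurrently against a shared workspace."""
    workspace: dict[str, str] = {}
    await asyncio.gather(
        *(agent.run(path, workspace) for path, agent in assignments.items())
    )
    return workspace


result = asyncio.run(
    run_in_parallel(
        {
            "api.py": StubAgent("codex"),
            "ui.py": StubAgent("claude-agent"),
        }
    )
)
```

The host code never checks which vendor is behind an agent. That is the property an interoperability protocol is meant to provide, whatever its wire format looks like.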

Cognition also rewrote its Cascade agent in Rust. The replacement, called Devin Local, is reportedly 30% more token-efficient and supports subagents natively. Cascade support ends July 1, so teams running Windsurf workflows need to migrate.

The pricing did not change. Existing Windsurf and Devin subscribers get Devin Desktop at no extra cost.

For business teams, ACP is the feature worth watching. If Cognition can get other editors and runtimes to adopt the protocol, it becomes a real interoperability layer for AI agents. If it stays Devin-specific, it is still useful but not the industry standard it aims to be.

Sources: ToolNav coverage, Cognition announcement

Cursor SDK: Custom tools, auto-review, nested subagents

On June 4, Cursor shipped a batch of updates to its TypeScript and Python SDKs. The three that matter most:

Custom tools. You can now pass your own function definitions to a Cursor agent as custom tools, exposed through a built-in MCP server called custom-user-tools. Before this, adding a custom capability meant standing up your own MCP server. Now a function definition is enough. Custom tools are visible to every subagent in a run, so you define once and they propagate.
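As a rough illustration of why "a function definition is enough," the sketch below derives a tool description from an ordinary Python function using only the standard library. It does not use the Cursor SDK, and describe_tool and lookup_order are hypothetical names. The real SDK's schema and registration call may look quite different.

```python
import inspect
from typing import Any, Callable


def describe_tool(fn: Callable[..., Any]) -> dict[str, Any]:
    """Derive a tool description from a plain function definition."""
    sig = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        # Map each parameter to the name of its annotated type.
        "parameters": {
            name: getattr(p.annotation, "__name__", str(p.annotation))
            for name, p in sig.parameters.items()
        },
    }


def lookup_order(order_id: str, include_items: bool = False) -> dict:
    """Return the status of an order (hypothetical business function)."""
    return {"order_id": order_id, "status": "shipped"}


tool = describe_tool(lookup_order)
```

The name, docstring, and type hints already carry everything an agent needs to decide when and how to call the function, which is why no separate server definition has to be written by hand.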

Auto-review. Headless SDK agents normally run all tool calls without asking for approval. The new local.autoReview flag routes those calls through a classifier that decides which to allow and which to hold for review. You control the classifier with natural-language instructions in permissions.json, specifying what to allow and what to block. This matters for teams running agents in CI or production where no human is watching.
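The gating logic can be sketched in a few lines. This is a simplified stand-in, not Cursor's classifier: the real feature takes natural-language instructions, while this sketch assumes glob patterns, and the ReviewPolicy class and the three outcomes ("allow", "block", "hold") are illustrative assumptions.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class ReviewPolicy:
    """Hypothetical policy: glob patterns stand in for natural-language rules."""

    allow: list[str] = field(default_factory=list)
    block: list[str] = field(default_factory=list)

    def decide(self, tool_call: str) -> str:
        """Return 'block', 'allow', or 'hold' for a tool call string."""
        # Block rules are checked first so they always win over allow rules.
        if any(fnmatch(tool_call, pattern) for pattern in self.block):
            return "block"
        if any(fnmatch(tool_call, pattern) for pattern in self.allow):
            return "allow"
        return "hold"  # anything unrecognised waits for a human


policy = ReviewPolicy(
    allow=["read_file *", "run_tests*"],
    block=["shell rm *", "git push --force*"],
)

decisions = [
    policy.decide("read_file src/app.py"),
    policy.decide("shell rm -rf build"),
    policy.decide("deploy production"),
]
```

Defaulting to "hold" for anything the policy does not recognise is the conservative choice for unattended runs: a missing rule delays a job instead of letting an unexpected operation through.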

Nested subagents. Subagents can now spawn their own subagents to any depth. A reviewer agent delegates to a test-writer, which delegates to a linter, each keeping its own prompt and model. No configuration needed; it works automatically.
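The delegation chain described above is a tree walked recursively. The sketch below models that shape only; AgentNode and the model names are hypothetical, and no real agent or Cursor SDK call is involved.

```python
from dataclasses import dataclass, field


@dataclass
class AgentNode:
    """Hypothetical agent with its own prompt, model, and subagents."""

    name: str
    prompt: str
    model: str
    children: list["AgentNode"] = field(default_factory=list)

    def run(self, task: str, depth: int = 0) -> list[str]:
        """Handle the task, then delegate to each child recursively."""
        log = [f"{'  ' * depth}{self.name} ({self.model}): {task}"]
        for child in self.children:
            log.extend(child.run(task, depth + 1))
        return log


linter = AgentNode("linter", "Check style.", "small-model")
test_writer = AgentNode("test-writer", "Write tests.", "mid-model", [linter])
reviewer = AgentNode("reviewer", "Review the change.", "large-model", [test_writer])

trace = reviewer.run("review PR")
```

Because each node carries its own prompt and model, a cheap model can handle linting while a stronger one handles review, and the depth of the tree is limited only by how far you choose to nest.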

The SDK also added JSONL persistence as an alternative to SQLite, custom store interfaces for backing agent state with Postgres or in-memory stores, and a batch of reliability fixes for cloud streaming, run correlation, and error handling.
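A custom store interface generally comes down to a small contract that any backend can satisfy. The following sketch assumes a two-method contract (append and load). The names AgentStore, MemoryStore, JsonlStore, and record_run are hypothetical and are not taken from the Cursor SDK.

```python
import json
import os
from typing import Any, Protocol


class AgentStore(Protocol):
    """Hypothetical contract a storage backend must satisfy."""

    def append(self, run_id: str, event: dict[str, Any]) -> None: ...
    def load(self, run_id: str) -> list[dict[str, Any]]: ...


class MemoryStore:
    """Keeps events in a dict; state is lost when the process exits."""

    def __init__(self) -> None:
        self._events: dict[str, list[dict[str, Any]]] = {}

    def append(self, run_id: str, event: dict[str, Any]) -> None:
        self._events.setdefault(run_id, []).append(event)

    def load(self, run_id: str) -> list[dict[str, Any]]:
        return list(self._events.get(run_id, []))


class JsonlStore:
    """Writes one JSON object per line to <directory>/<run_id>.jsonl."""

    def __init__(self, directory: str) -> None:
        self._dir = directory

    def _path(self, run_id: str) -> str:
        return os.path.join(self._dir, f"{run_id}.jsonl")

    def append(self, run_id: str, event: dict[str, Any]) -> None:
        with open(self._path(run_id), "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")

    def load(self, run_id: str) -> list[dict[str, Any]]:
        path = self._path(run_id)
        if not os.path.exists(path):
            return []
        with open(path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]


def record_run(store: AgentStore, run_id: str) -> list[dict[str, Any]]:
    """Write two example events and read them back through the contract."""
    store.append(run_id, {"step": 1, "tool": "read_file"})
    store.append(run_id, {"step": 2, "tool": "run_tests"})
    return store.load(run_id)
```

Code written against the contract, like record_run here, does not change when the backend does. A Postgres-backed store would implement the same two methods.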

Cursor's direction is clear. It is building the SDK layer for teams that want to compose multi-agent workflows programmatically rather than through a GUI. That is a different entry point than Devin Desktop's Kanban or ChatGPT's unified chat window, and it reflects where these products are diverging even as they converge on the same problem.

Sources: Cursor changelog

What this means if you are choosing tools right now

The coding agent market has moved past the "which model writes better code" phase. Every major player is now building infrastructure for running multiple agents, managing shared context, and controlling which operations need human approval.

If you are evaluating tools for your team:

Pick based on workflow, not benchmark scores. Claude Code still leads SWE-bench at 88.6%. Codex with GPT-5.5 scores 82.7% on Terminal-Bench. Those numbers tell you about model capability, not about whether the tool fits how your team actually works.

Think about who will use it. If your team includes non-developers who need to query data or generate reports, OpenAI's all-in-one approach has practical appeal. If your team is purely engineering-focused, Claude Code or Cursor's SDK may be a better fit.

Do not overcommit. ACP, MCP, and Cursor's custom tools all point toward a future where agents interoperate. The tool you choose today should not lock you out of running other agents tomorrow. Prefer platforms that support open protocols.

The three-way convergence this week is not a coincidence. Multi-agent orchestration is the problem every serious vendor is trying to solve. The tools that solve it first, and solve it in a way that plays well with others, will have an advantage that raw model benchmarks cannot match.
