OpenAI Releases GPT-5.4: Native Computer Use Meets Agentic Workflows

OpenAI has released GPT-5.4, described as its "most capable and efficient frontier model for professional work." The new model combines frontier coding capabilities with autonomous agent execution.

Key Capabilities

Computer Use at Scale

GPT-5.4 is OpenAI's first general-purpose model with native, state-of-the-art computer-use capabilities. On OSWorld-Verified, a benchmark that measures a model's ability to navigate a desktop through screenshots and keyboard and mouse actions, GPT-5.4 achieves a 75% success rate, exceeding the reported human performance of 72.4%. GPT-5.2 scored 47.3% on the same benchmark, so the gain is 27.7 percentage points.
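The size of that jump depends on whether it is read in percentage points or in relative terms. The short sketch below works through the arithmetic using only the scores quoted above; the function names are illustrative and not part of any OpenAI tooling.

```python
# Scores quoted in the article, in percent.
GPT_5_4 = 75.0
GPT_5_2 = 47.3
HUMAN = 72.4


def point_gain(new, old):
    """Absolute difference in percentage points."""
    return round(new - old, 1)


def relative_gain(new, old):
    """Improvement expressed as a percentage of the old score."""
    return round((new - old) / old * 100, 1)


print(point_gain(GPT_5_4, GPT_5_2))     # 27.7
print(relative_gain(GPT_5_4, GPT_5_2))  # 58.6
print(point_gain(GPT_5_4, HUMAN))       # 2.6
```

In other words, the model is 27.7 points ahead of GPT-5.2, which is roughly a 59% relative improvement, and 2.6 points ahead of the human baseline.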

The model excels at writing code for computer automation via libraries like Playwright, and can issue mouse and keyboard commands in response to visual inputs. Developers can now build agents that complete complex workflows across websites and software systems with minimal manual intervention.
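To make the "visual input in, mouse or keyboard command out" pattern concrete, here is a minimal sketch of an observe-and-act loop. Everything in it is an assumption made for illustration: `Action`, `FakeDesktop`, `run_agent`, and the scripted policy are hypothetical names, the desktop only records actions instead of performing them, and the policy stands in for a model call. It is not OpenAI's computer-use API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    kind: str  # "click", "type", or "done" (hypothetical action vocabulary)
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class FakeDesktop:
    """Stand-in for a real screen: records actions instead of performing them."""
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent would capture pixels; a text summary is enough here.
        return f"screen after {len(self.log)} actions"

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.log.append(f"click({action.x}, {action.y})")
        elif action.kind == "type":
            self.log.append(f"type({action.text!r})")
        else:
            raise ValueError(f"unsupported action: {action.kind}")


def run_agent(desktop: FakeDesktop,
              choose_action: Callable[[str], Action],
              max_steps: int = 10) -> List[str]:
    """Observe the screen, ask the policy for one action, apply it, repeat."""
    for _ in range(max_steps):
        action = choose_action(desktop.screenshot())
        if action.kind == "done":
            break
        desktop.apply(action)
    return desktop.log


def scripted_policy() -> Callable[[str], Action]:
    """Placeholder for a model: replays a fixed list of actions."""
    steps = iter([
        Action("click", x=120, y=48),
        Action("type", text="quarterly report"),
        Action("done"),
    ])
    return lambda screenshot: next(steps)


print(run_agent(FakeDesktop(), scripted_policy()))
# ["click(120, 48)", "type('quarterly report')"]
```

The `max_steps` cap is a design choice worth keeping in any real loop of this kind, since it bounds how long an agent can act without a human checking in.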

Reasoning and Token Efficiency

GPT-5.4 incorporates enhanced reasoning capabilities while improving token efficiency. The model uses significantly fewer tokens than its predecessors when solving problems, which should translate to lower API costs and faster responses. On GDPval (a test of knowledge work across 44 occupations), GPT-5.4 matches or exceeds industry professionals in 83% of comparisons, versus 70.9% for GPT-5.2.

Practical Professional Work

The model excels at real-world tasks professionals actually perform. On internal spreadsheet modeling benchmarks, GPT-5.4 achieves 87.3% accuracy compared to 68.4% for GPT-5.2. Human raters preferred GPT-5.4's presentations over GPT-5.2's 68% of the time, citing stronger aesthetics and visual variety.

Tool Integration and Agentic Features

A major advancement is tool search, which lets the model work efficiently with large tool ecosystems. Previously, every tool definition had to be included in the prompt upfront. With tool search, GPT-5.4 looks up tools only when needed, reducing token usage by 47% on benchmark tasks while maintaining accuracy.
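The idea of looking tools up on demand, rather than sending every definition with each request, can be sketched as follows. This is a toy illustration under stated assumptions, not OpenAI's implementation: `ToolSpec`, `REGISTRY`, and `search_tools` are hypothetical names, and simple keyword overlap stands in for whatever retrieval method the real feature uses.

```python
from dataclasses import dataclass

# Common words ignored when matching (a small, hand-picked list).
STOPWORDS = {"a", "an", "the", "for", "to", "this"}


@dataclass(frozen=True)
class ToolSpec:
    name: str
    description: str


# Hypothetical registry; a real deployment might hold hundreds of tools.
REGISTRY = [
    ToolSpec("send_email", "send an email message to a recipient"),
    ToolSpec("create_invoice", "create a billing invoice for a customer"),
    ToolSpec("query_database", "run a read-only sql query against the database"),
    ToolSpec("resize_image", "resize an image file to given dimensions"),
]


def _words(text):
    """Lowercase, split on whitespace and underscores, drop stopwords."""
    return set(text.lower().replace("_", " ").split()) - STOPWORDS


def search_tools(registry, query, limit=3):
    """Return up to `limit` tools ranked by keyword overlap with the query."""
    query_words = _words(query)
    scored = []
    for tool in registry:
        overlap = len(query_words & _words(f"{tool.name} {tool.description}"))
        if overlap:
            scored.append((overlap, tool))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:limit]]


matches = search_tools(REGISTRY, "create an invoice for this customer")
print([tool.name for tool in matches])  # ['create_invoice']
```

Only the matching definitions would then be placed in the prompt, which is where the token savings come from: the other three definitions in this registry never have to be sent.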

The model supports a 1M token context window (in the API and Codex), enabling agents to plan, execute, and verify tasks across extended horizons. On BrowseComp (which tests persistent web research), GPT-5.4 achieves 82.7% success, an improvement of 17 percentage points over GPT-5.2.

Availability and Pricing

GPT-5.4 is available now across ChatGPT (as "GPT-5.4 Thinking" and "Pro"), the API, and Codex. In ChatGPT, it replaces GPT-5.2 Thinking for Plus, Team, and Pro users. API pricing reflects increased capabilities: $2.50/M input tokens and $15/M output tokens (vs. $1.75/$14 for GPT-5.2).
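To see what the price difference means for a single request, the sketch below applies the per-million-token rates quoted above. The dictionary keys are illustrative labels rather than confirmed API model identifiers, and the token counts are made up for the example.

```python
# Per-million-token prices in USD, as quoted in the article.
PRICES_PER_MILLION = {
    "gpt-5.4": {"input": 2.50, "output": 15.00},
    "gpt-5.2": {"input": 1.75, "output": 14.00},
}


def request_cost(model, input_tokens, output_tokens):
    """Cost in USD of one request at the quoted per-million-token rates."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
print(request_cost("gpt-5.4", 200_000, 50_000))  # 1.25
print(request_cost("gpt-5.2", 200_000, 50_000))  # 1.05
```

At identical token counts GPT-5.4 costs more per request, so any net saving depends on how many fewer tokens it actually uses for the same task.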

Source: OpenAI's official announcement