GPT-5.4
OpenAI's most capable and efficient frontier model for professional work
75.0% OSWorld (beats human) • 92.8% GPQA Diamond • 73.3% ARC-AGI-2 • 47% token reduction with tool search
GPT-5.4 Features & Capabilities
The first OpenAI model combining reasoning, coding, and native computer use in one release
Native Computer Use
GPT-5.4 achieves 75.0% on OSWorld, surpassing the human baseline of 72.4%. It operates desktops via Playwright code and screenshot-based mouse/keyboard commands.
Advanced Reasoning
GPT-5.4 scores 73.3% on ARC-AGI-2 (up from 52.9% for GPT-5.2) and 92.8% on GPQA Diamond (up from 88.1%): a genuine reasoning advance, not just a tool-use wrapper.
Frontier Coding
GPT-5.4 combines the coding strengths of GPT-5.3-Codex with broader capabilities. It scores 57.7% on SWE-Bench Pro and 75.1% on Terminal-Bench 2.0, with up to 1.5x faster token velocity in /fast mode.
Tool Search (47% Token Reduction)
The new tool search feature loads tool definitions on demand instead of upfront. It reduces total token usage by 47% on the MCP Atlas benchmark while maintaining the same accuracy.
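To make the on-demand idea concrete, here is a minimal sketch in Python. It is not OpenAI's implementation or API: the `ToolDef` type, the catalog entries, the token counts, and the keyword-overlap ranking are all hypothetical, chosen only to show why loading matching definitions can cost fewer tokens than loading every definition upfront.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolDef:
    name: str
    description: str
    schema_tokens: int  # assumed size of the full tool definition, in tokens


# Hypothetical catalog: names, descriptions, and sizes are invented for illustration.
CATALOG = [
    ToolDef("calendar_create_event", "create calendar event", 400),
    ToolDef("calendar_list_events", "list calendar events", 350),
    ToolDef("crm_find_contact", "find crm contact", 500),
    ToolDef("files_read", "read stored file", 300),
]


def search_tools(query: str, catalog: list[ToolDef], limit: int = 2) -> list[ToolDef]:
    """Return up to `limit` tools whose description shares words with the query."""
    words = set(query.lower().split())
    scored = [(len(words & set(tool.description.split())), tool) for tool in catalog]
    hits = [pair for pair in scored if pair[0] > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)  # stable: ties keep catalog order
    return [tool for _, tool in hits[:limit]]


def upfront_cost(catalog: list[ToolDef]) -> int:
    """Tokens spent when every definition is placed in the prompt."""
    return sum(tool.schema_tokens for tool in catalog)


def on_demand_cost(query: str, catalog: list[ToolDef]) -> int:
    """Tokens spent when only the definitions matching the query are loaded."""
    return sum(tool.schema_tokens for tool in search_tools(query, catalog))
```

With this toy catalog, `upfront_cost(CATALOG)` is 1550 tokens while `on_demand_cost("create calendar event", CATALOG)` is 750. These numbers are illustrative only and are unrelated to the 47% figure reported on MCP Atlas.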
1M Token Context
GPT-5.4 supports up to 1 million tokens of context in Codex, enabling agents to plan, execute, and verify tasks across long horizons. Standard window is 272K tokens.
Knowledge Work Leader
GPT-5.4 scores 83.0% on GDPval (up from 70.9% for GPT-5.2) and 87.3% on IB Modeling Tasks, and it produces 33% fewer false claims than GPT-5.2. This is the best factual accuracy of any model OpenAI has released.
GPT-5.4 Benchmark Results
State-of-the-art performance across reasoning, coding, computer use, and knowledge work
Benchmark categories: Reasoning & Science, Coding & Engineering, Computer Use & Vision, Knowledge Work
GPT-5.4 Full Benchmark Comparison
GPT-5.4 vs GPT-5.2 — complete performance data
| Benchmark | GPT-5.4 | GPT-5.2 |
|---|---|---|
| OSWorld (Computer Use) | 75.0% | 47.3% |
| ARC-AGI-2 | 73.3% | 52.9% |
| GPQA Diamond | 92.8% | 88.1% |
| GDPval | 83.0% | 70.9% |
| HLE (with tools) | 53.8% | 45.5% |
| SWE-Bench Pro | 57.7% | 43.2% |
| Terminal-Bench 2.0 | 75.1% | 61.4% |
| IB Modeling Tasks | 87.3% | 74.1% |
Source: OpenAI official release, March 5, 2026
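The size of each improvement can be read off the table as a percentage-point gain and as a relative gain over GPT-5.2. The short Python sketch below does that arithmetic; the scores are copied from the table, while the `gains` helper and its rounding to one decimal are illustrative choices.

```python
# (GPT-5.4, GPT-5.2) scores in percent, copied from the comparison table.
SCORES = {
    "OSWorld": (75.0, 47.3),
    "ARC-AGI-2": (73.3, 52.9),
    "GPQA Diamond": (92.8, 88.1),
    "GDPval": (83.0, 70.9),
    "HLE (with tools)": (53.8, 45.5),
    "SWE-Bench Pro": (57.7, 43.2),
    "Terminal-Bench 2.0": (75.1, 61.4),
    "IB Modeling Tasks": (87.3, 74.1),
}


def gains(scores: dict[str, tuple[float, float]]) -> list[tuple[str, float, float]]:
    """Return (benchmark, point gain, relative gain in %) sorted by point gain."""
    rows = []
    for name, (new, old) in scores.items():
        points = round(new - old, 1)
        relative = round(100 * (new - old) / old, 1)
        rows.append((name, points, relative))
    return sorted(rows, key=lambda row: row[1], reverse=True)


for name, points, relative in gains(SCORES):
    print(f"{name:20s} +{points:4.1f} pts  (+{relative:4.1f}% relative)")
```

The largest jump is OSWorld at 27.7 points and the smallest is GPQA Diamond at 4.7 points, which is expected for a benchmark where GPT-5.2 was already near 90%.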
GPT-5.4 Pricing
API pricing for GPT-5.4 and GPT-5.4 Pro
| Model | Input | Cached Input | Output |
|---|---|---|---|
| GPT-5.4 | $2.50/1M | $0.25/1M | $15/1M |
| GPT-5.4 Pro | $30/1M | — | $180/1M |
Batch and Flex processing available at half the standard rate. Priority processing at 2x. GPT-5.2 retires June 5, 2026.
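As a rough guide to what these rates mean per request, the following Python sketch estimates cost from token counts. The prices and tier multipliers come from the table and note above. The function itself makes several assumptions: the `estimate_cost` name is hypothetical, `input_tokens` excludes cached tokens, the tier multiplier applies equally to every token type, and cached input is unavailable for GPT-5.4 Pro because the table lists no price for it.

```python
# USD per 1M tokens, from the pricing table. None means no price is listed.
PRICES_PER_MILLION = {
    "gpt-5.4": {"input": 2.50, "cached_input": 0.25, "output": 15.00},
    "gpt-5.4-pro": {"input": 30.00, "cached_input": None, "output": 180.00},
}

# Batch and Flex at half rate, Priority at 2x (assumed to apply to all token types).
TIER_MULTIPLIER = {"standard": 1.0, "batch": 0.5, "flex": 0.5, "priority": 2.0}


def estimate_cost(
    model: str,
    input_tokens: int,
    output_tokens: int,
    cached_tokens: int = 0,
    tier: str = "standard",
) -> float:
    """Estimate the USD cost of one request; input_tokens excludes cached tokens."""
    prices = PRICES_PER_MILLION[model]
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    if cached_tokens:
        if prices["cached_input"] is None:
            raise ValueError(f"no cached-input price listed for {model}")
        total += cached_tokens * prices["cached_input"]
    return total / 1_000_000 * TIER_MULTIPLIER[tier]
```

For example, `estimate_cost("gpt-5.4", 1_000_000, 100_000)` returns 4.0 (USD 2.50 for input plus USD 1.50 for output), and the same request with `tier="batch"` returns 2.0.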
GPT-5.4 FAQ
Frequently asked questions about GPT-5.4
What is GPT-5.4?
GPT-5.4 is OpenAI's most capable frontier model, released March 5, 2026. It combines reasoning, coding, and native computer use in a single model — the first time OpenAI has unified these capabilities in one release.
How does GPT-5.4 compare to GPT-5.2?
GPT-5.4 significantly outperforms GPT-5.2: ARC-AGI-2 jumps from 52.9% to 73.3%, GDPval from 70.9% to 83.0%, OSWorld from 47.3% to 75.0% (surpassing human performance), and false claims are reduced by 33%.
What is GPT-5.4 computer use?
GPT-5.4 can operate computers natively via Playwright code and screenshot-based mouse/keyboard commands. It achieves 75.0% on OSWorld, surpassing the human baseline of 72.4% — making it the strongest model for desktop automation.
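The screenshot-based mode can be pictured as an observe, decide, act loop. The Python sketch below is a toy illustration of that loop only: `Action`, `FakeDesktop`, `scripted_policy`, and `run_episode` are all hypothetical names, the "screenshot" is a plain dictionary rather than pixels, and the policy is a hard-coded stand-in for the model call. It does not use Playwright or any OpenAI API.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str  # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class FakeDesktop:
    """Stand-in for a real desktop: a single text field at (100, 200)."""
    focused: bool = False
    field_text: str = ""
    log: list = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real agent would receive an image; a dict keeps the sketch runnable.
        return {"focused": self.focused, "field_text": self.field_text}

    def apply(self, action: Action) -> None:
        self.log.append(action.kind)
        if action.kind == "click" and (action.x, action.y) == (100, 200):
            self.focused = True
        elif action.kind == "type" and self.focused:
            self.field_text += action.text


def scripted_policy(observation: dict, goal: str) -> Action:
    """Placeholder for the model: pick the next action from the observation."""
    if observation["field_text"] == goal:
        return Action("done")
    if not observation["focused"]:
        return Action("click", x=100, y=200)
    return Action("type", text=goal)


def run_episode(desktop: FakeDesktop, goal: str, max_steps: int = 10) -> bool:
    """Loop: observe, decide, act, until the policy says done or steps run out."""
    for _ in range(max_steps):
        action = scripted_policy(desktop.screenshot(), goal)
        if action.kind == "done":
            return True
        desktop.apply(action)
    return False
```

Calling `run_episode(FakeDesktop(), "hello")` clicks the field, types the text, and then stops once the next observation shows the goal has been reached. The step cap is a design choice that keeps a confused policy from looping forever.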
What is GPT-5.4 pricing?
GPT-5.4 API pricing: $2.50/1M input tokens, $0.25/1M cached input, $15/1M output. GPT-5.4 Pro: $30/1M input, $180/1M output. Batch and Flex processing available at half rate. GPT-5.2 retires June 5, 2026.
What is tool search in GPT-5.4?
Tool search is a new feature that loads tool definitions on demand instead of including all definitions upfront. On 250 tasks with 36 MCP servers enabled, it reduced total token usage by 47% while maintaining the same accuracy — a major cost saving for enterprise agentic workflows.
What context window does GPT-5.4 support?
GPT-5.4 supports a standard 272K token context window, with 1M token context available in Codex (billed at 2x the normal rate beyond 272K). It is the first OpenAI model to support context lengths beyond 256K tokens.
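The long-context surcharge can be sketched as simple arithmetic. The window sizes, the 2x multiplier, and the USD 2.50/1M input price come from this page. How the surcharge is applied is not specified here, so the Python sketch below assumes that only the tokens beyond 272K are billed at 2x; if the whole prompt is billed at 2x once it crosses the threshold, the result would be higher. The `long_context_input_cost` name is hypothetical.

```python
STANDARD_WINDOW = 272_000        # standard context window, in tokens
CODEX_WINDOW = 1_000_000         # extended window available in Codex
INPUT_PRICE_PER_MILLION = 2.50   # GPT-5.4 input price in USD
LONG_CONTEXT_MULTIPLIER = 2.0    # rate multiplier beyond the standard window


def long_context_input_cost(prompt_tokens: int) -> float:
    """Estimate input cost in USD, ASSUMING only tokens past 272K are billed at 2x."""
    if prompt_tokens < 0 or prompt_tokens > CODEX_WINDOW:
        raise ValueError("prompt does not fit in the 1M-token window")
    base = min(prompt_tokens, STANDARD_WINDOW)
    extra = max(prompt_tokens - STANDARD_WINDOW, 0)
    billable = base + extra * LONG_CONTEXT_MULTIPLIER
    return billable * INPUT_PRICE_PER_MILLION / 1_000_000
```

Under that assumption, a prompt that exactly fills the standard window costs about USD 0.68 in input tokens, and a full 1M-token prompt costs about USD 4.32.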
How does GPT-5.4 compare to Gemini 3.1 Pro?
GPT-5.4 leads on computer use (75.0% OSWorld, with no equivalent figure for Gemini 3.1 Pro) and knowledge work (83.0% GDPval). Gemini 3.1 Pro leads on ARC-AGI-2 (77.1% vs 73.3%), GPQA Diamond (94.3% vs 92.8%), and coding (80.6% SWE-Bench vs 57.7%, although GPT-5.4's figure is for SWE-Bench Pro, so the two may not be directly comparable). Both are frontier models with different strengths.
Is GPT-5.4 available in ChatGPT?
Yes. GPT-5.4 Thinking is available to Plus, Team, and Pro subscribers in ChatGPT, replacing GPT-5.2 Thinking. GPT-5.4 Pro is available to Pro and Enterprise plans. Enterprise and Edu admins can enable early access via admin settings.
About GPT-5.4
GPT-5.4 is OpenAI's flagship reasoning model released on March 5, 2026. It is the first mainline model combining reasoning, coding (GPT-5.3-Codex), and native computer use in a single release. GPT-5.4 surpasses human performance on OSWorld desktop navigation, reduces false claims by 33% vs GPT-5.2, and introduces tool search that cuts token costs by 47% for complex agentic workflows. Available as gpt-5.4 and gpt-5.4-pro via API.
Important Notice: Gemini3.us is an independent enthusiast community and developer platform. We are not affiliated with, endorsed by, or officially connected to OpenAI. We provide paid access to OpenAI's official API services to support our infrastructure and operations.