gemini3.us
Released March 5, 2026

GPT-5.4

OpenAI's most capable and efficient frontier model for professional work

75.0% OSWorld (beats human) • 92.8% GPQA Diamond • 73.3% ARC-AGI-2 • 47% token reduction with tool search

75.0% OSWorld (Computer Use)
92.8% GPQA Diamond
73.3% ARC-AGI-2
83.0% GDPval Knowledge Work

GPT-5.4 Features & Capabilities

The first OpenAI model combining reasoning, coding, and native computer use in one release

Native Computer Use

GPT-5.4 achieves 75.0% on OSWorld, surpassing the human baseline of 72.4%. It operates desktops through Playwright code and screenshot-based mouse and keyboard commands.

Advanced Reasoning

GPT-5.4 scores 73.3% on ARC-AGI-2 (up from 52.9% in GPT-5.2) and 92.8% on GPQA Diamond: a genuine reasoning advance, not just a tool-use wrapper.

Frontier Coding

GPT-5.4 combines the coding strengths of GPT-5.3-Codex with broader capabilities. It scores 57.7% on SWE-Bench Pro and 75.1% on Terminal-Bench 2.0, with up to 1.5x faster token velocity in /fast mode.

Tool Search (47% Token Reduction)

The new tool search feature loads tool definitions on demand instead of upfront. It reduces total token usage by 47% on the MCP Atlas benchmark while maintaining the same accuracy.

1M Token Context

GPT-5.4 supports up to 1 million tokens of context in Codex, enabling agents to plan, execute, and verify tasks across long horizons. Standard window is 272K tokens.

Knowledge Work Leader

GPT-5.4 scores 83.0% on GDPval (up from 70.9% in GPT-5.2) and 87.3% on IB Modeling Tasks, and it produces 33% fewer false claims than GPT-5.2, the best factual accuracy of any model OpenAI has released.

GPT-5.4 Benchmark Results

State-of-the-art performance across reasoning, coding, computer use, and knowledge work

Reasoning & Science

ARC-AGI-2 (Verified)
Abstract reasoning — up from 52.9% in GPT-5.2
73.3%
GPQA Diamond
PhD-level scientific knowledge
92.8%
Humanity's Last Exam (with tools)
Academic reasoning — up from 45.5% in GPT-5.2
53.8%
FrontierMath Tier 1-3
Advanced mathematical reasoning
62.4%

Coding & Engineering

SWE-Bench Pro (Public)
Real-world software engineering tasks
57.7%
Terminal-Bench 2.0
CLI and terminal task completion
75.1%
BrowseComp
Web browsing and research tasks
71.2%
Toolathlon
Multi-step tool use with real APIs
68.9%

Computer Use & Vision

OSWorld-Verified (desktop)
Surpasses human baseline of 72.4%
75.0%
WebArena-Verified (browser)
Browser navigation and task completion
82.3%
Online-Mind2Web (screenshots)
Screenshot-based web interaction
69.1%
MMMU Pro (no tools)
Multimodal understanding
78.4%

Knowledge Work

GDPval (wins or ties)
44 occupations, 9 industries — up from 70.9%
83.0%
IB Modeling Tasks
Investment banking spreadsheet tasks
87.3%
Presentation Preference
Human raters preferred GPT-5.4 presentations
71%
False Claims Reduction
Fewer factual errors vs GPT-5.2
-33%

GPT-5.4 Full Benchmark Comparison

GPT-5.4 vs GPT-5.2 — complete performance data

Benchmark | GPT-5.4 | GPT-5.2
OSWorld (Computer Use) | 75.0% | 47.3%
ARC-AGI-2 | 73.3% | 52.9%
GPQA Diamond | 92.8% | 88.1%
GDPval | 83.0% | 70.9%
HLE (with tools) | 53.8% | 45.5%
SWE-Bench Pro | 57.7% | 43.2%
Terminal-Bench 2.0 | 75.1% | 61.4%
IB Modeling Tasks | 87.3% | 74.1%

Source: OpenAI official release, March 5, 2026
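To make the size of each improvement easier to compare, the short sketch below computes the percentage-point gain of GPT-5.4 over GPT-5.2 for every row of the comparison table. The scores are copied from the table; the names `SCORES` and `point_gains` are illustrative only.

```python
# Scores in percent, copied from the comparison table: (GPT-5.4, GPT-5.2).
SCORES = {
    "OSWorld (Computer Use)": (75.0, 47.3),
    "ARC-AGI-2": (73.3, 52.9),
    "GPQA Diamond": (92.8, 88.1),
    "GDPval": (83.0, 70.9),
    "HLE (with tools)": (53.8, 45.5),
    "SWE-Bench Pro": (57.7, 43.2),
    "Terminal-Bench 2.0": (75.1, 61.4),
    "IB Modeling Tasks": (87.3, 74.1),
}


def point_gains(scores):
    """Return {benchmark: gain in percentage points}, largest gain first."""
    gains = {name: round(new - old, 1) for name, (new, old) in scores.items()}
    return dict(sorted(gains.items(), key=lambda kv: kv[1], reverse=True))


if __name__ == "__main__":
    for name, gain in point_gains(SCORES).items():
        print(f"{name:<24} +{gain:.1f} pts")
```

By this measure the largest jump is on OSWorld (27.7 points) and the smallest is on GPQA Diamond (4.7 points).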

GPT-5.4 Pricing

API pricing for GPT-5.4 and GPT-5.4 Pro

Model | Input | Cached Input | Output
GPT-5.4 | $2.50/1M | $0.25/1M | $15/1M
GPT-5.4 Pro | $30/1M | not listed | $180/1M

Batch and Flex processing available at half the standard rate. Priority processing at 2x. GPT-5.2 retires June 5, 2026.
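As a rough illustration of how these rates combine, here is a minimal cost estimator. The per-token rates and the Batch, Flex, and Priority multipliers come from this page. Everything else is an assumption made for the sketch: the names `Rates`, `RATES`, and `estimate_cost`, the idea that a tier multiplier applies uniformly to input, cached input, and output, and the fallback that charges cached tokens at the full input rate when no cached rate is listed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rates:
    """USD per 1M tokens, as listed in the pricing table."""
    input: float
    cached_input: Optional[float]  # None where the table lists no cached rate
    output: float


RATES = {
    "gpt-5.4": Rates(input=2.50, cached_input=0.25, output=15.00),
    "gpt-5.4-pro": Rates(input=30.00, cached_input=None, output=180.00),
}

# From the text: Batch and Flex at half the standard rate, Priority at 2x.
TIER_MULTIPLIER = {"standard": 1.0, "batch": 0.5, "flex": 0.5, "priority": 2.0}


def estimate_cost(model, input_tokens, output_tokens, cached_tokens=0, tier="standard"):
    """Estimate the cost of one request in USD.

    cached_tokens is the portion of input_tokens served from cache.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    rates = RATES[model]
    # Assumption: with no listed cached rate, cached tokens cost the full input rate.
    cached_rate = rates.cached_input if rates.cached_input is not None else rates.input
    fresh_tokens = input_tokens - cached_tokens
    usd = (
        fresh_tokens * rates.input
        + cached_tokens * cached_rate
        + output_tokens * rates.output
    ) / 1_000_000
    # Assumption: the tier multiplier scales the whole bill uniformly.
    return usd * TIER_MULTIPLIER[tier]


if __name__ == "__main__":
    cost = estimate_cost("gpt-5.4", 100_000, 5_000, cached_tokens=40_000)
    print(f"${cost:.4f}")
```

For example, a GPT-5.4 request with 100,000 input tokens (40,000 of them cached) and 5,000 output tokens comes to about $0.235 at the standard rate under these assumptions, and half that in the batch tier.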

GPT-5.4 FAQ

Frequently asked questions about GPT-5.4

What is GPT-5.4?

GPT-5.4 is OpenAI's most capable frontier model, released March 5, 2026. It combines reasoning, coding, and native computer use in a single model — the first time OpenAI has unified these capabilities in one release.

How does GPT-5.4 compare to GPT-5.2?

GPT-5.4 significantly outperforms GPT-5.2: ARC-AGI-2 jumps from 52.9% to 73.3%, GDPval from 70.9% to 83.0%, OSWorld from 47.3% to 75.0% (surpassing human performance), and false claims are reduced by 33%.

What is GPT-5.4 computer use?

GPT-5.4 can operate computers natively via Playwright code and screenshot-based mouse/keyboard commands. It achieves 75.0% on OSWorld, surpassing the human baseline of 72.4% — making it the strongest model for desktop automation.

What is GPT-5.4 pricing?

GPT-5.4 API pricing: $2.50/1M input tokens, $0.25/1M cached input, $15/1M output. GPT-5.4 Pro: $30/1M input, $180/1M output. Batch and Flex processing available at half rate. GPT-5.2 retires June 5, 2026.

What is tool search in GPT-5.4?

Tool search is a new feature that loads tool definitions on demand instead of including all definitions upfront. On 250 tasks with 36 MCP servers enabled, it reduced total token usage by 47% while maintaining the same accuracy — a major cost saving for enterprise agentic workflows.
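The general idea of loading definitions on demand can be shown with a toy model. The sketch below is not OpenAI's implementation or API: the `Tool` class, the keyword-matching `search_tools` function, the three sample tools, and the word-count stand-in for a tokenizer are all hypothetical. It only shows why sending a short index plus the matching definitions can cost fewer tokens than sending every full definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    summary: str      # one-line description, always sent
    definition: str   # full schema text, sent only when the tool is selected


def rough_tokens(text):
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def upfront_cost(tools):
    """Baseline: every full definition is placed in the prompt."""
    return sum(rough_tokens(t.definition) for t in tools)


def search_tools(tools, query):
    """Hypothetical search: keep tools whose name or summary shares a word with the query."""
    words = set(query.lower().split())
    matches = []
    for t in tools:
        keywords = set(f"{t.name} {t.summary}".lower().replace("_", " ").split())
        if words & keywords:
            matches.append(t)
    return matches


def on_demand_cost(tools, query):
    """On demand: a short index of all tools, plus full definitions for matches only."""
    index = sum(rough_tokens(f"{t.name}: {t.summary}") for t in tools)
    loaded = sum(rough_tokens(t.definition) for t in search_tools(tools, query))
    return index + loaded


# Hypothetical sample tools, for illustration only.
TOOLS = [
    Tool("get_weather", "look up current weather",
         "get_weather(city: string, units: string) returns temperature humidity "
         "wind and a short forecast for the given city"),
    Tool("send_email", "send an email message",
         "send_email(to: string, subject: string, body: string, cc: list) "
         "delivers a message and returns a delivery receipt id"),
    Tool("query_database", "run a sql query",
         "query_database(sql: string, timeout_s: number, max_rows: number) "
         "executes read only sql and returns rows as json"),
]

if __name__ == "__main__":
    query = "weather in paris"
    print("upfront:", upfront_cost(TOOLS))
    print("on demand:", on_demand_cost(TOOLS, query))
```

With only three small tools the saving is modest and says nothing about the 47% figure reported above. The gap widens as the number of tools and the size of each definition grow.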

What context window does GPT-5.4 support?

GPT-5.4 supports a standard 272K token context window, with 1M token context available in Codex (billed at 2x the normal rate beyond 272K). It is the first OpenAI model to support context lengths beyond 256K tokens.
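The phrase "billed at 2x the normal rate beyond 272K" can be read two ways: either only the tokens past 272K are doubled, or the whole request is doubled once it crosses 272K. The text does not say which applies, so the sketch below implements both readings behind a flag. It uses the GPT-5.4 standard input rate of $2.50 per 1M tokens; the function name `input_cost` and the flag `surcharge_whole_request` are illustrative only.

```python
STANDARD_WINDOW = 272_000       # tokens, standard context window from the text
MAX_WINDOW = 1_000_000          # tokens, extended context in Codex from the text
INPUT_RATE = 2.50               # USD per 1M input tokens (GPT-5.4 standard rate)
LONG_CONTEXT_MULTIPLIER = 2.0   # "billed at 2x the normal rate beyond 272K"


def input_cost(input_tokens, surcharge_whole_request=False):
    """Estimate input cost in USD under one of two readings of the 2x rule.

    surcharge_whole_request=False: only tokens past 272K are billed at 2x.
    surcharge_whole_request=True: the entire request is billed at 2x once it
    exceeds 272K. Which reading matches actual billing is an assumption.
    """
    if not 0 <= input_tokens <= MAX_WINDOW:
        raise ValueError("input_tokens must be between 0 and 1,000,000")
    if input_tokens <= STANDARD_WINDOW:
        return input_tokens * INPUT_RATE / 1_000_000
    if surcharge_whole_request:
        return input_tokens * INPUT_RATE * LONG_CONTEXT_MULTIPLIER / 1_000_000
    extra = input_tokens - STANDARD_WINDOW
    return (
        STANDARD_WINDOW * INPUT_RATE
        + extra * INPUT_RATE * LONG_CONTEXT_MULTIPLIER
    ) / 1_000_000


if __name__ == "__main__":
    print(f"${input_cost(500_000):.2f}")
    print(f"${input_cost(500_000, surcharge_whole_request=True):.2f}")
```

For a 500,000-token input, the first reading gives about $1.82 and the second gives $2.50, so the interpretation matters when budgeting long-context runs.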

How does GPT-5.4 compare to Gemini 3.1 Pro?

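GPT-5.4 leads on computer use (75.0% on OSWorld, with no equivalent score for Gemini 3.1 Pro) and knowledge work (83.0% on GDPval). Gemini 3.1 Pro leads on ARC-AGI-2 (77.1% vs 73.3%) and GPQA Diamond (94.3% vs 92.8%). On coding, Gemini 3.1 Pro is listed at 80.6% on SWE-Bench against GPT-5.4's 57.7% on SWE-Bench Pro; these appear to be different benchmark variants, so the two figures may not be directly comparable. Both are frontier models with different strengths.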

Is GPT-5.4 available in ChatGPT?

Yes. GPT-5.4 Thinking is available to Plus, Team, and Pro subscribers in ChatGPT, replacing GPT-5.2 Thinking. GPT-5.4 Pro is available to Pro and Enterprise plans. Enterprise and Edu admins can enable early access via admin settings.

About GPT-5.4

GPT-5.4 is OpenAI's flagship reasoning model, released on March 5, 2026. It is the first mainline model to combine reasoning, coding (building on GPT-5.3-Codex), and native computer use in a single release. GPT-5.4 surpasses human performance on OSWorld desktop navigation, reduces false claims by 33% compared with GPT-5.2, and introduces tool search, which cut token usage by 47% on the MCP Atlas benchmark. It is available as gpt-5.4 and gpt-5.4-pro via the API.

Important Notice: Gemini3.us is an independent enthusiast community and developer platform. We are not affiliated with, endorsed by, or officially connected to OpenAI. We provide paid access to OpenAI's official API services to support our infrastructure and operations.

Try GPT-5.4 Now

Experience OpenAI's most capable model — native computer use, 1M context, and frontier reasoning