GPT-5.3-Codex: OpenAI's Self-Improving Coding Agent

February 5, 2026
Tags: openai, gpt-5, codex, agentic-ai, coding-agents, cybersecurity

Release

OpenAI released GPT-5.3-Codex on February 5, 2026. It combines the frontier coding performance of GPT-5.2-Codex with the reasoning and professional knowledge capabilities of GPT-5.2, while running 25% faster.

The model is available in the Codex app and via the OpenAI API. Pricing has not been disclosed.

The First Self-Improving Model

OpenAI describes GPT-5.3-Codex as its first model that was instrumental in creating itself. The Codex team used early versions to:

Debug its own training runs
Manage its own deployment
Diagnose test results and evaluations
Optimize the inference harness
Identify context rendering bugs and low cache hit rates
Dynamically scale GPU clusters during launch
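
OpenAI has not published the tooling behind these tasks. As a rough illustration of what "identifying low cache hit rates" can mean in practice, here is a minimal sketch that computes a token-level prompt cache hit rate from request logs and flags it when it drops below a threshold. The `RequestLog` type, its field names, and the 0.5 default threshold are all hypothetical and are not taken from OpenAI's harness.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class RequestLog:
    """Hypothetical per-request record; field names are assumptions."""
    prompt_tokens: int   # total prompt tokens sent with the request
    cached_tokens: int   # prompt tokens that were served from the cache


def cache_hit_rate(logs: Iterable[RequestLog]) -> float:
    """Return the fraction of prompt tokens served from cache (0.0 if no tokens)."""
    logs = list(logs)
    total = sum(r.prompt_tokens for r in logs)
    if total == 0:
        return 0.0
    cached = sum(r.cached_tokens for r in logs)
    return cached / total


def is_low_hit_rate(logs: Iterable[RequestLog], threshold: float = 0.5) -> bool:
    """Flag a batch of requests whose hit rate falls below the threshold."""
    return cache_hit_rate(logs) < threshold


if __name__ == "__main__":
    sample = [RequestLog(1000, 800), RequestLog(1000, 200)]
    print(f"hit rate: {cache_hit_rate(sample):.2f}")
    print(f"low at 0.6 threshold: {is_low_hit_rate(sample, threshold=0.6)}")
```

Measuring the rate per token rather than per request is one possible design choice: it weights long prompts more heavily, which is usually where a cache miss costs the most.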

OpenAI reports that the model “accelerated its own development” and that many of its researchers and engineers describe their job as “fundamentally different from what it was just two months ago.”

Benchmarks

Benchmark	Result	Notes
SWE-Bench Pro	State-of-the-art	Real-world software engineering across 4 languages
Terminal-Bench 2.0	State-of-the-art	Terminal skills for coding agents, fewer tokens than prior models
OSWorld	Strong performance	Agentic computer-use tasks in visual desktop environment
GDPval	Matches GPT-5.2	Professional knowledge work across 44 occupations

SWE-Bench Pro is more contamination-resistant, challenging, and industry-relevant than SWE-bench Verified. It spans Python, JavaScript, TypeScript, and Go.

Agentic Capabilities

GPT-5.3-Codex goes beyond code generation. OpenAI describes it as “an agent that can do nearly anything developers and professionals can do on a computer.”

Software lifecycle support:

Debugging
Deploying
Monitoring
Writing PRDs
Editing copy
User research
Tests and metrics

Beyond software:

Building slide decks
Analyzing data in spreadsheets
Creating functional games and apps over multi-day, multi-million-token runs

OpenAI demonstrated two games that GPT-5.3-Codex built autonomously, a racing game (v2) and a diving game. The model iterated on them in response to generic follow-up prompts such as “fix the bug” or “improve the game.”

Interactive Steering

The Codex app now supports real-time interaction. Instead of waiting for final output, users can:

Ask questions as the model works
Discuss approaches mid-execution
Steer toward solutions without losing context
Receive frequent progress updates

This is enabled via Settings > General > Follow-up behavior in the Codex app.

Cybersecurity Classification

GPT-5.3-Codex is the first model OpenAI classifies as High capability for cybersecurity-related tasks under its Preparedness Framework. It is also the first model OpenAI has directly trained to identify software vulnerabilities.

OpenAI states that it does not have “definitive evidence it can automate cyber attacks end-to-end,” but is taking a precautionary approach with its “most comprehensive cybersecurity safety stack to date.” This includes safety mitigations and enhanced monitoring.

Context

Release timeline for the GPT-5 series, ending with this release:

GPT-5 — August 7, 2025
GPT-5.1 — November 12, 2025
GPT-5.2 — December 11, 2025
GPT-5.2-Codex — December 2025 (exact date unconfirmed)
GPT-5.3-Codex — February 5, 2026

The original Codex was released in August 2021 and deprecated in March 2023. OpenAI relaunched Codex as an agentic app in 2025.

Timeline Update

This release has been added to the whatis ai timeline.

