GPT-5.3-Codex: OpenAI's Self-Improving Coding Agent

Release

OpenAI released GPT-5.3-Codex on February 5, 2026. It combines the frontier coding performance of GPT-5.2-Codex with the reasoning and professional knowledge capabilities of GPT-5.2, while running 25% faster.

The model is available in the Codex app and via the OpenAI API. Pricing has not been disclosed.
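
As a rough illustration of what API access might look like, the sketch below builds (but does not send) a request to OpenAI's Responses API endpoint using only the Python standard library. The model identifier `gpt-5.3-codex` is an assumption based on the product name, not a confirmed API ID, and `build_request` is a hypothetical helper written for this example. Check the current API reference before relying on either.

```python
import json
import os
import urllib.request

# Assumption: the API model ID mirrors the product name; confirm it in the model list.
ASSUMED_MODEL_ID = "gpt-5.3-codex"
RESPONSES_URL = "https://api.openai.com/v1/responses"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (without sending) a POST request for the Responses API."""
    payload = {"model": ASSUMED_MODEL_ID, "input": prompt}
    return urllib.request.Request(
        RESPONSES_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Read the key from the environment; fall back to a placeholder for a dry run.
    key = os.environ.get("OPENAI_API_KEY", "sk-placeholder")
    req = build_request("Fix the failing test in utils.py", key)
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # Sending would be urllib.request.urlopen(req); that needs a valid key and network access.
```

Keeping request construction separate from sending makes the payload easy to inspect, and the assumed model ID lives in one constant that can be swapped once the real identifier is confirmed.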

The First Self-Improving Model

OpenAI describes GPT-5.3-Codex as its first model that was instrumental in creating itself. The Codex team used early versions to:

  • Debug its own training runs
  • Manage its own deployment
  • Diagnose test results and evaluations
  • Optimize the inference harness
  • Identify context rendering bugs and low cache hit rates
  • Dynamically scale GPU clusters during launch

OpenAI reports that the model “accelerated its own development” and that researchers and engineers describe their job as “fundamentally different from what it was just two months ago.”

Benchmarks

Benchmark | Result | Notes
SWE-Bench Pro | State-of-the-art | Real-world software engineering across 4 languages
Terminal-Bench 2.0 | State-of-the-art | Terminal skills for coding agents; uses fewer tokens than prior models
OSWorld | Strong performance | Agentic computer-use tasks in a visual desktop environment
GDPval | Matches GPT-5.2 | Professional knowledge work across 44 occupations

OpenAI describes SWE-Bench Pro as more contamination-resistant, challenging, and industry-relevant than SWE-bench Verified. It spans four languages: Python, JavaScript, TypeScript, and Go.

Agentic Capabilities

GPT-5.3-Codex goes beyond code generation. OpenAI describes it as “an agent that can do nearly anything developers and professionals can do on a computer.”

Software lifecycle support:

  • Debugging
  • Deploying
  • Monitoring
  • Writing PRDs
  • Editing copy
  • User research
  • Tests and metrics

Beyond software:

  • Building slide decks
  • Analyzing data in spreadsheets
  • Creating functional games and apps over multi-day, multi-million-token runs

OpenAI demonstrated two games built autonomously by GPT-5.3-Codex: a racing game (v2) and a diving game, using generic follow-up prompts like “fix the bug” or “improve the game.”

Interactive Steering

The Codex app now supports real-time interaction. Instead of waiting for final output, users can:

  • Ask questions as the model works
  • Discuss approaches mid-execution
  • Steer toward solutions without losing context
  • Receive frequent progress updates

This is enabled via Settings > General > Follow-up behavior in the Codex app.

Cybersecurity Classification

GPT-5.3-Codex is the first model OpenAI classifies as High capability for cybersecurity-related tasks under its Preparedness Framework. It is also the first model OpenAI has directly trained to identify software vulnerabilities.

OpenAI states that it does not have “definitive evidence it can automate cyber attacks end-to-end,” but is taking a precautionary approach with its “most comprehensive cybersecurity safety stack to date.” This includes safety mitigations and enhanced monitoring.

Context

This release follows:

  • GPT-5 — August 7, 2025
  • GPT-5.1 — November 12, 2025
  • GPT-5.2 — December 11, 2025
  • GPT-5.2-Codex — December 2025 (exact date unconfirmed)
  • GPT-5.3-Codex — February 5, 2026

The original Codex was released in August 2021 and deprecated in March 2023. OpenAI relaunched Codex as an agentic app in 2025.
