OpenAI | AGI Progress Tracker

Major

GPT-5.4 mini and nano Released

2026-03-17

OpenAI released GPT-5.4 mini and nano, their most capable small models yet optimized for coding and subagents. GPT-5.4 mini significantly improves over GPT-5 mini across coding, reasoning, multimodal understanding, and tool use while running more than 2x faster. GPT-5.4 nano is the smallest, cheapest version for tasks where speed and cost matter most. These models are designed for high-volume workloads where latency directly shapes product experience.

GPT-5.4 mini: 2x faster than GPT-5 mini
54.4% on SWE-Bench Pro (mini)
400K context window (mini)
GPT-5.4 nano: $0.20 per 1M input tokens
Optimized for subagents and coding workflows

openaimodel-releasegptllmefficiency

Sources

Introducing GPT-5.4 mini and nano

Notable

GPT-4o Retired in ChatGPT

2026-02-13

OpenAI retired GPT-4o and several related models from ChatGPT, while keeping them available in the API. The change marked a cleanup of the ChatGPT model lineup as newer GPT-5-family models became the default.

GPT-4o removed from ChatGPT
GPT-4.1 and GPT-4.1 mini retired too
GPT-5 variants became the default
Models remained available in the API

openaimilestonegptllmproduct-launch

Sources

Retiring GPT-4o and other ChatGPT models

Notable

GPT-5.2 Instant Update

2026-02-10

OpenAI shipped a GPT-5.2 Instant update that improved response style and quality in ChatGPT and the API. The update made answers more measured, grounded, and contextually appropriate for advice-seeking and how-to questions.

Improved response style and quality
More measured, grounded tone
Better advice-seeking and how-to answers
Rolled out in ChatGPT and the API

openaimodel-releasegptllmreasoning

Sources

GPT-5.2 Instant Update (February 10, 2026)

Major

Introducing GPT-5.3-Codex

2026-02-05

OpenAI launched GPT-5.3-Codex, its most capable agentic coding model yet. The release combined the Codex and GPT-5 training stacks to improve code generation, reasoning, and steerable long-running coding workflows.

Combined Codex and GPT-5 training stacks
Improved code generation and reasoning
Built for steerable coding workflows
Faster than prior Codex generations
A step toward general-purpose coding agents

openaimodel-releasecodexcodingllm

Sources

Introducing GPT-5.3-Codex

Major

Introducing AgentKit

2025-10-06

OpenAI launched AgentKit, a complete set of tools for building, deploying, and optimizing agents. The release bundled visual workflow building, connector management, embedded chat experiences, and expanded evaluation features for production agent workflows.

Agent Builder for visual workflow design
Connector Registry for data and tool governance
ChatKit for embedded agent experiences
Expanded eval and grading capabilities
Built to speed up production agent development

openaiproduct-launchagentcodingllm

Sources

Introducing AgentKit

Major

GPT-5 Released

2025-08-07

OpenAI released GPT-5, its smartest and most useful model yet with built-in thinking. The launch emphasized stronger coding, reasoning, writing, health, and multimodal performance, plus a unified system that decides when to answer quickly or think longer.

Unified system with built-in thinking
Stronger coding and reasoning performance
Improved writing and health use cases
Multimodal support across common tasks
Available to all users with tiered access

openaimodel-releasegptllmmultimodal

Sources

Introducing GPT-5

Major

OpenAI o3 and o4-mini Released

2025-04-16

OpenAI released o3 and o4-mini, its latest reasoning models designed to think longer before responding. The launch emphasized stronger visual reasoning, agentic tool use, and major gains in coding, math, and science tasks.

Latest o-series reasoning models
Thinking with images and tools
Strong coding, math, and science performance
Agentic tool use improvements
Codex CLI launched alongside the models

openaimodel-releaseo3reasoningmultimodal

Sources

Introducing OpenAI o3 and o4-mini

Major

OpenAI o3-mini Released

2025-01-31

OpenAI released o3-mini, a cost-efficient reasoning model optimized for coding, math, and science. It offered adjustable reasoning effort levels and was made available to free ChatGPT users, democratizing access to reasoning capabilities while maintaining strong performance.

Cost-efficient reasoning model
Adjustable reasoning effort
Free tier availability
Optimized for STEM tasks
Lower latency than o1

openaimodel-releaseo3reasoningllm

Sources

OpenAI o3-mini

Major

OpenAI o3 ARC-AGI Breakthrough

2024-12-20

OpenAI announced o3, achieving a historic breakthrough on the ARC-AGI (Abstraction and Reasoning Corpus for Artificial General Intelligence) benchmark. In low-compute mode, o3 scored 75.7%, surpassing human-level performance (85% threshold for AGI). In high-compute mode with extended thinking time, o3 reached 87.5%, becoming the first AI system to exceed human baseline performance on this challenging abstract reasoning benchmark designed specifically to test general intelligence capabilities.

75.7% on ARC-AGI (low-compute mode)
87.5% on ARC-AGI (high-compute mode)
First AI to surpass human baseline (85%)
Historic AGI milestone
Demonstrates novel reasoning capabilities

openaimodel-releaseo3reasoningarc-agibenchmark

Sources

Major

OpenAI o1 and o1 Pro Released

2024-12-05

OpenAI released the full o1 model and o1 Pro Mode, featuring enhanced reasoning capabilities and extended thinking time. o1 demonstrated strong performance on math, science, and coding tasks, including continued improvements on the ARC-AGI abstract reasoning benchmark. The release also brought o1-mini to all ChatGPT users, democratizing access to reasoning models.

Full o1 model released
o1 Pro Mode with extended thinking
Enhanced reasoning capabilities
Improved ARC-AGI performance
o1-mini available to all ChatGPT users
Significant improvements on STEM tasks

openaimodel-releaseo1reasoningllmarc-agi

Sources

OpenAI o1 System Card

Major

OpenAI o1 Preview Released

2024-09-12

OpenAI introduced the o1 series (o1-preview and o1-mini), featuring new large-scale reinforcement learning algorithms that train models to perform complex reasoning. o1 spent more time thinking before responding, achieving PhD-level accuracy on challenging reasoning tasks. On the ARC-AGI benchmark, o1 achieved a breakthrough score of approximately 21%, significantly advancing AI performance on abstract reasoning tasks.

Chain-of-thought reasoning
Trained with reinforcement learning
PhD-level accuracy on complex tasks
83% on International Math Olympiad qualifying exam
~21% on ARC-AGI benchmark
Ranked in 89th percentile on Codeforces

openaimodel-releaseo1reasoningllmarc-agi

Sources

Learning to Reason with LLMs

Major

OpenAI GPT-4o Released

2024-05-13

OpenAI released GPT-4o ('o' for omni), their flagship model with native multimodal capabilities across text, audio, and vision. It featured response times as fast as humans (232ms average for audio), 50+ language support, and was made available to free ChatGPT users.

Native multimodal (text, audio, vision)
232ms average audio response time
50+ language support
Free tier availability
Improved vision and non-English performance

openaimodel-releasegptmultimodalllm

Sources

GPT-4o Announcement

Major

OpenAI Sora Video Generation Model

2024-02-15

OpenAI unveiled Sora, a text-to-video AI model capable of generating high-quality videos up to 60 seconds long. Sora demonstrated understanding of physics, complex camera movements, and consistent character appearances across frames. It represented a major breakthrough in AI video generation, producing cinematic-quality outputs from text prompts alone.

Up to 60-second video generation
Complex camera movements and physics
Consistent characters and scenes
Multiple shots in single generation
Cinematic quality outputs

openaimodel-releasevideo-generationsoragenerative-aimultimodal

Sources

Sora: Creating Video from Text

Major

OpenAI GPT-4 Turbo Released

2023-11-06

OpenAI released GPT-4 Turbo with a 128K context window (4× larger than previous GPT-4), knowledge cutoff updated to April 2023, and significantly lower pricing. It also introduced JSON mode and reproducible outputs, making it more practical for production applications.

128K context window (4× increase)
Knowledge updated to April 2023
3× cheaper input, 2× cheaper output
JSON mode and reproducible outputs
Better instruction following

openaimodel-releasegptllmlong-context

Sources

GPT-4 Turbo

Major

OpenAI GPTs and Assistants API

2023-11-06

At DevDay, OpenAI announced GPTs (custom versions of ChatGPT), the Assistants API, GPT-4 Turbo with 128K context, and multimodal capabilities. This enabled developers to build AI agents with persistent threads, built-in retrieval, and code interpreter capabilities.

GPTs (custom ChatGPT versions)
Assistants API for agents
GPT-4 Turbo (128K context)
Multimodal vision capabilities
GPT Store announced

openaiproduct-launchgptagentapi-release

Sources

OpenAI DevDay Announcements

Major

ChatGPT Reaches 100 Million Weekly Users

2023-10-12

OpenAI announced that ChatGPT had reached 100 million weekly active users, less than a year after its launch. This milestone cemented ChatGPT as one of the fastest-growing consumer applications in history and demonstrated massive mainstream AI adoption.

100 million weekly active users
Fastest-growing consumer app
Less than 1 year since launch
Mainstream AI adoption achieved
1 million+ API developers

openaiproduct-launchchatgptmilestone

Sources

OpenAI DevDay Announcements

Major

GPT-4 with Vision (GPT-4V)

2023-09-25

OpenAI released GPT-4 with vision capabilities, allowing the model to understand and reason about images. It could analyze photographs, charts, diagrams, and documents, marking a major step toward truly multimodal AI systems.

Image understanding capabilities
Analyze charts and diagrams
Read text from images (OCR)
Visual reasoning and description
Available in ChatGPT and API

openaimodel-releasegptvisionmultimodal

Sources

GPT-4V(ision) System Card

Major

OpenAI DALL-E 3 Released

2023-09-20

OpenAI released DALL-E 3, integrated natively into ChatGPT. The model significantly improved image generation quality and understanding of complex prompts. It could render text within images and follow complex multi-paragraph prompts with high fidelity.

Native ChatGPT integration
Significantly improved quality
Text rendering in images
Follows complex multi-paragraph prompts
Built-in safety mitigations

openaimodel-releasedalleimage-generationmultimodal

Sources

DALL-E 3

Landmark

GPT-4 Released

2023-03-14

OpenAI released GPT-4, a large multimodal model accepting image and text inputs. It demonstrated human-level performance on professional exams (bar exam, SAT) and significantly improved reasoning capabilities. The model was trained with RLHF and constitutional AI techniques for safety.

Multimodal (text + images)
Human-level exam performance
8K and 32K context windows
Training cost >$100M
82% less restricted content vs GPT-3.5

openaimodel-releasegptllmmultimodal

Sources

Notable

ChatGPT Plus Launched

2023-02-01

OpenAI launched ChatGPT Plus, a $20/month subscription offering faster response times, priority access during peak times, and early access to new features. This marked the beginning of monetization for consumer AI products at scale.

$20/month subscription
Faster response times
Priority access during peak hours
Early access to new features
First major consumer AI monetization

openaiproduct-launchchatgptgptmonetization

Sources

ChatGPT Plus

Landmark

ChatGPT Launch

2022-11-30

OpenAI launched ChatGPT, a conversational AI based on GPT-3.5 fine-tuned with RLHF. The product gained 1 million users in 5 days and 100 million in 2 months, becoming the fastest-growing consumer application in history and bringing AI into mainstream consciousness.

1 million users in 5 days
100 million users in 2 months
Fastest-growing consumer app ever
RLHF for safety and alignment
Brought AI to mainstream awareness

openaiproduct-launchchatgptgptllm

Sources

ChatGPT Blog Post

Major

DALL-E 2 Released

2022-04-06

OpenAI released DALL-E 2, a significant upgrade that generated more realistic and accurate images with 4× greater resolution than the original. The model used a CLIP-based architecture and diffusion techniques. It became available to the public via API and web interface, making AI image generation accessible to millions of users.

4× higher resolution than DALL-E 1
CLIP-based diffusion architecture
Public API and web interface
Realistic photorealistic images
Inpainting and outpainting capabilities

openaimodel-releasedalleimage-generationmultimodal

Sources

DALL-E 2 Announcement

Major

Training language models to follow instructions with human feedback

2022-03-04

OpenAI introduced InstructGPT using RLHF (Reinforcement Learning from Human Feedback). The paper showed that 1.3B parameter InstructGPT outperformed 175B GPT-3 on following instructions, demonstrating that alignment with human preferences matters more than scale. This became the foundation for ChatGPT and modern alignment techniques.

RLHF alignment methodology
1.3B model outperforms 175B GPT-3 on instruction-following
Preference-based training
Foundation for ChatGPT
Alignment > Scale principle

research-paperrlhfinstructgptalignment

Sources

InstructGPT Paper (NeurIPS 2022)

Notable

OpenAI Codex API Released

2021-08-10

OpenAI released the Codex API, providing access to the model powering GitHub Copilot. The model was trained on 54 million GitHub repositories and could interpret natural language and execute over a dozen programming languages.

API access to Copilot model
Trained on 54M GitHub repos
12B parameters
Support for 12+ programming languages
Natural language to code translation

openaiapi-releasecodingcodex

Sources

OpenAI Codex API

Landmark

DALL-E: Creating Images from Text

2021-01-05

OpenAI introduced DALL-E, a 12-billion parameter version of GPT-3 trained to generate images from text descriptions. It could create novel combinations of concepts, transform existing images, and even render text. This marked a major milestone in multimodal AI.

12B parameter transformer model
Text-to-image generation
Zero-shot image generation
Combining concepts in novel ways
Foundation for modern image generation

openairesearch-paperdalleimage-generationmultimodal

Sources

DALL-E Paper

Landmark

GPT-3: Language Models are Few-Shot Learners

2020-05-28

OpenAI released GPT-3 with 175 billion parameters, demonstrating remarkable few-shot and zero-shot learning capabilities. The model could perform tasks it wasn't explicitly trained on, simply by providing examples in the prompt. This demonstrated the power of scale in language models.

175B parameters (100x GPT-2)
Few-shot learning without fine-tuning
Zero-shot capabilities across tasks
Trained on 300B tokens
API waitlist opened immediately

openairesearch-papergptllmmodel-release

Sources

Major

GPT-2 Full Model Released

2019-02-14

OpenAI released the full GPT-2 model with 1.5 billion parameters after a staged release process and partnership with researchers. The model demonstrated impressive text generation capabilities but also raised ongoing discussions about AI safety and dual-use concerns.

Full 1.5B parameter model released
Staged release with safety research
Impressive zero-shot capabilities
No fine-tuning needed for many tasks
Established safety release practices

openaimodel-releasegptllmopen-source

Sources

GPT-2: 1.5B Release

Notable

GPT-2 Preview Released

2018-11-14

OpenAI released a small preview of GPT-2 but withheld the full model citing concerns about potential misuse for generating fake news. The decision sparked debate about AI safety and responsible release practices in the research community.

1.5B parameters (full model withheld)
Concerns about fake news generation
Staged release approach
Sparked AI safety debate
Full model released later in 2019

openairesearch-papergptllm

Sources

Better Language Models and Their Implications

Major

GPT-1: Improving Language Understanding by Generative Pre-Training

2018-06-11

OpenAI introduced GPT-1 (Generative Pre-trained Transformer), demonstrating that unsupervised pre-training on large text corpora followed by supervised fine-tuning could achieve state-of-the-art results on various NLP benchmarks. This established the pre-training paradigm.

First GPT model - 117M parameters
Unsupervised pre-training + supervised fine-tuning
Transformer decoder architecture
Trained on BookCorpus (7,000 books)
Established pre-training paradigm for NLP

openairesearch-papergptllmtransformer

Sources

GPT-1 Paper

Notable

OpenAI Universe Released

2016-12-05

OpenAI released Universe, a platform for measuring and training AI agents across diverse environments including games, web interfaces, and applications. It provided a single interface to thousands of environments using Virtual Network Computing (VNC).

Unified interface for AI training environments
Support for games, web, and apps
Used VNC for environment control
Aimed to accelerate reinforcement learning research
Later evolved into more focused projects

openaiopen-sourceproduct-launchreinforcement-learningagent

Sources

OpenAI Universe Blog Post