Anthropic | AGI Progress Tracker

Major

Claude Sonnet 4.6 Released

2026-02-17

Anthropic released Claude Sonnet 4.6, a full upgrade across coding, computer use, long-context reasoning, agent planning, knowledge work, and design. The model features a 1M-token context window in beta and approaches Opus-level intelligence at a lower price point. In Claude Code testing, users preferred Sonnet 4.6 over Sonnet 4.5 roughly 70% of the time and over Opus 4.5 59% of the time, citing less overengineering and better instruction following.

Approaches Opus-level intelligence
1M token context window (beta)
Significantly improved computer use skills
Better instruction following and consistency
Pricing remains $3/$15 per million tokens

anthropic, model-release, claude, llm, reasoning, coding

Sources

Introducing Claude Sonnet 4.6

Major

Claude Opus 4.6 Released

2026-02-05

Anthropic released Claude Opus 4.6, its most intelligent model, with industry-leading performance on agentic coding, computer use, tool use, search, and finance. The model achieved state-of-the-art scores on Terminal-Bench 2.0 and Humanity's Last Exam, with improved coding skills, better code review and debugging, and a 1M-token context window in beta. It outperformed GPT-5.2 by 144 Elo points on GDPval-AA, which covers economically valuable knowledge work tasks.

State-of-the-art on Terminal-Bench 2.0
Highest score on Humanity's Last Exam
1M token context window (beta)
Improved agentic coding and computer use
Outperforms GPT-5.2 on knowledge work tasks

anthropic, model-release, claude, llm, reasoning, agent

Sources

Introducing Claude Opus 4.6

Notable

Anthropic Launches Labs

2026-01-13

Anthropic launched Labs as a team focused on incubating experimental products at the frontier of Claude's capabilities. The announcement highlighted a faster path from research previews into products that can scale reliably for customers.

Incubates experimental frontier products
Builds on Claude Code and MCP momentum
Supports research previews and product scaling
Focuses on agentic desktop experiences

anthropic, milestone, agent, product-launch, claude

Sources

Introducing Labs

Major

Claude Opus 4.5 Released

2025-11-24

Anthropic released Claude Opus 4.5, positioning it as a major step forward for coding, agents, and computer use. The release emphasized stronger real-world software engineering performance, better everyday productivity, and wider availability across Anthropic's products and partner platforms.

Best-in-class coding and agent performance
Strong real-world software engineering results
Improved deep research and spreadsheet workflows
Available on apps, API, and cloud platforms
Lower pricing than prior Opus-tier releases

anthropic, model-release, claude, llm, coding

Sources

Introducing Claude Opus 4.5

Major

Claude Sonnet 4.5 Released

2025-09-29

Anthropic released Claude Sonnet 4.5, describing it as the best coding model in the world and the strongest model for building complex agents. The release added major upgrades to Claude Code, the API, and the Claude app experience.

Best coding model in the world
Strongest model for complex agents
Major Claude Code improvements
Context editing and memory tool in the API
Checkpoints and native VS Code extension

anthropic, model-release, claude, coding, agent

Sources

Introducing Claude Sonnet 4.5

Major

Claude 3.7 Sonnet Released

2025-02-25

Anthropic released Claude 3.7 Sonnet, its first hybrid reasoning model, which can produce both quick responses and extended thinking for complex tasks. It achieved state-of-the-art performance on SWE-bench Verified (62.3%) and featured enhanced agentic coding capabilities.

First hybrid reasoning model
Standard and extended thinking modes
62.3% on SWE-bench Verified
State-of-the-art agentic coding
2x improvement in coding tasks

anthropic, model-release, claude, llm, coding, reasoning

Sources

Claude 3.7 Sonnet and Claude Code

Major

Claude Code Released

2025-02-25

Anthropic released Claude Code, an agentic coding tool that works directly in the terminal. It can search and read code, edit files, run tests, and use command-line tools. Integrated with Claude 3.7 Sonnet, it achieved state-of-the-art performance on SWE-bench Verified.

Terminal-integrated agentic coding
Search, read, and edit code
Run tests and command-line tools
State-of-the-art on SWE-bench
Natural language project management

anthropic, product-launch, claude, coding, agent

Sources

Claude Code

Major

Claude 3.5 Sonnet with Computer Use

2024-10-22

Anthropic released an upgraded Claude 3.5 Sonnet with "computer use" capabilities, allowing the AI to interact directly with computers by moving the mouse, clicking buttons, and typing text. This enabled AI agents to perform complex multi-step tasks autonomously.

Direct computer interaction capability
Can move mouse and type
Performs complex multi-step tasks
Computer use API in public beta
First frontier model with computer use

anthropic, model-release, claude, agent, computer-use

Sources

Claude 3.5 Sonnet and Computer Use

Major

Claude 3.5 Sonnet Released

2024-06-20

Anthropic released Claude 3.5 Sonnet, which outperformed Claude 3 Opus on many benchmarks while being faster and cheaper. It achieved 92% on the HumanEval coding benchmark and demonstrated significant improvements in reasoning, coding, and following complex instructions.

Outperformed Claude 3 Opus
92% on HumanEval coding benchmark
2x faster than Claude 3 Opus
Better at following complex instructions
Artifacts feature for creating interactive content

anthropic, model-release, claude, llm, coding

Sources

Claude 3.5 Sonnet

Major

Claude 3 Model Family Released

2024-03-04

Anthropic released the Claude 3 family (Haiku, Sonnet, Opus), featuring near-human comprehension levels and vision capabilities. Claude 3 Opus set new benchmarks, outperforming GPT-4 on many tasks. All models featured improved speed, accuracy, and 200K context windows.

Three models: Haiku, Sonnet, Opus
Near-human comprehension on tasks
Vision and multimodal capabilities
200K token context window
Outperformed GPT-4 on many benchmarks

anthropic, model-release, claude, llm, multimodal

Sources

Claude 3 Model Family

Major

Claude 2 Released

2023-07-11

Anthropic released Claude 2, featuring a 100K-token context window, improved reasoning and coding capabilities, and strong safety measures. It achieved 71.2% on the Codex HumanEval coding benchmark and reflected Anthropic's focus on helpful, harmless, and honest AI.

100K token context window
71.2% on Codex HumanEval
Improved reasoning and coding
Safety via Constitutional AI
Public API availability

anthropic, model-release, claude, llm, long-context

Sources

Claude 2 Announcement