Anthropic · 2025 · 05 · 22 · Model · ~1 min read

Anthropic launched Claude 4 (Sonnet 4 + Opus 4)

Claude 4 is Anthropic's next major model family: Sonnet 4 (the daily driver) and Opus 4 (the heavy hitter, back after more than a year without an Opus-tier update). The biggest jumps are on long, agent-style coding tasks. In Anthropic's demo, Opus 4 ran a coding task autonomously for several hours.

What's actually new

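Opus 4 is back. The Opus tier had gone without an update since Claude 3 Opus in early 2024; the 3.5 and 3.7 generations shipped without one. At launch, Opus 4 was the strongest general-purpose model on most benchmarks.
Long-running agent tasks. In the demo, Opus 4 stayed on a single coding task for hours, then handed off its work cleanly.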
Better at sticking to instructions in long sessions — drift over multi-hour conversations dropped substantially.

If you want more

Worth knowing · ~30s

Opus 4 is expensive. At $15 per million input tokens and $75 per million output tokens, long agent runs cost real money.
Launch-day 'best model on the planet' claims faded fast. OpenAI shipped GPT-5 in August 2025, under three months later.
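
To make the pricing concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the two per-million-token prices quoted above. The function name and the example run (200 turns, 50k tokens in and 2k tokens out per turn) are hypothetical, and the sketch assumes flat rates with no discounts of any kind.

```python
# Prices quoted above for Opus 4, in USD per million tokens.
INPUT_USD_PER_MTOK = 15.0
OUTPUT_USD_PER_MTOK = 75.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate spend in USD, assuming flat per-token rates."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


# Hypothetical agent run: 200 turns, each sending 50k tokens of
# context and receiving 2k tokens back.
turns = 200
run_cost = estimate_cost_usd(turns * 50_000, turns * 2_000)
print(f"Estimated cost: ${run_cost:.2f}")
```

Under these assumed numbers, most of the bill comes from context that is re-sent on every turn rather than from the output, which is why multi-hour runs add up quickly.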

Who should care · ~20s

Developers shipping AI agents with Anthropic. Anyone evaluating coding assistants seriously. Researchers tracking long-context agent reliability. Teams that paused on agent products waiting for something more reliable.

What to do about it · ~20s

If you've been holding off building an agent because models lose context after an hour, try Opus 4 on your task. Set a budget cap before you start.
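
One way to set a budget cap is a small guard that tallies spend after each model call and stops the loop once a limit is passed. The Python sketch below is a minimal illustration, not any vendor's API: `BudgetCap`, `BudgetExceeded`, and the per-step token counts are all hypothetical, and the default prices are the Opus 4 figures quoted in this article.

```python
class BudgetExceeded(Exception):
    """Raised when cumulative spend passes the cap."""


class BudgetCap:
    """Hypothetical helper that tracks spend against a USD limit."""

    def __init__(self, cap_usd: float,
                 input_usd_per_mtok: float = 15.0,
                 output_usd_per_mtok: float = 75.0) -> None:
        self.cap_usd = cap_usd
        self.input_usd_per_mtok = input_usd_per_mtok
        self.output_usd_per_mtok = output_usd_per_mtok
        self.spent_usd = 0.0

    def record(self, input_tokens: int, output_tokens: int) -> float:
        """Add one call's cost; return remaining budget or raise."""
        self.spent_usd += (
            input_tokens / 1_000_000 * self.input_usd_per_mtok
            + output_tokens / 1_000_000 * self.output_usd_per_mtok
        )
        if self.spent_usd > self.cap_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.2f} of ${self.cap_usd:.2f} cap"
            )
        return self.cap_usd - self.spent_usd


budget = BudgetCap(cap_usd=5.00)
completed_steps = 0
try:
    for _ in range(100):
        # Stand-in for one model call. A real agent loop would pass
        # the token counts reported for that call instead.
        budget.record(input_tokens=40_000, output_tokens=1_500)
        completed_steps += 1
except BudgetExceeded as exc:
    print(f"Stopped after {completed_steps} steps: {exc}")
```

Because the check runs after each call is recorded, spend can overshoot the cap by up to one step, so set the cap slightly below the amount you are actually willing to spend.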

Honest take · ~45s

Claude 4 was the moment AI agents felt genuinely usable for multi-hour work, not just 5-minute demos. Opus 4's return ended the awkward 'Anthropic's top model is missing' year. The 'agent' wave that everyone had been promising for two years finally had a workhorse model behind it. The question turned from 'can the AI run for hours?' to 'how do we trust what it did?' — a much better problem to have.

Sources

Anthropic: Introducing Claude 4 (vendor)
Simon Willison: Claude 4 first impressions (third party)
Artificial Analysis: Claude 4 testing (benchmark)

Last verified · 2026 · 05 · 05 · Found a fact wrong? corrections@aguidetocloud.com