Claude Sonnet 5 Is Here: Everything You Need to Know (2026)
Anthropic launched Claude Sonnet 5 on June 30, 2026 — 1M context, 85.2% on SWE-bench, and launch pricing of $2/$10 per million tokens. Full breakdown here.
Short Answer
Anthropic launched Claude Sonnet 5 on June 30, 2026, its most agentic mid-tier model yet. It ships with a 1 million token context window, scores 85.2% on SWE-bench Verified, and launches at a discounted $2 input and $10 output per million tokens through August 31, 2026. It is available today on Claude.ai, the API, Claude Code, Bedrock, Vertex, Azure, Snowflake, and GitHub Copilot.
What Claude Sonnet 5 Actually Is
Claude Sonnet 5 is Anthropic's newest Sonnet-class model, the workhorse tier that sits between the budget Haiku models and the frontier Opus line. The headline is not raw intelligence; it is agency. Sonnet 5 is built to plan multi-step work, drive a browser or terminal on its own, recover from its own errors, and complete long tasks without a human supervising every turn.
That positioning matters because the way people use models has shifted. In 2024 you asked a model a question and read the answer. In 2026 you hand a model a goal, such as "refactor this service," "review these 40 contracts," or "book and reconcile these expenses," and expect it to execute. Sonnet 5 is Anthropic's bet that the mid-tier is where most of that agentic work will actually run, because it is cheap enough to leave running.
Anthropic explicitly calls it the new default for most teams, reserving Opus 4.8 for the 10 to 15% of tasks that need the deepest reasoning. For anyone tracking the Claude lineup, this is the most consequential release since the 5.0 generation began.
The Specs That Matter
| Spec | Claude Sonnet 5 |
|---|---|
| Model ID | claude-sonnet-5 |
| Release date | June 30, 2026 |
| Input context | 1,000,000 tokens |
| Max output | 64,000 tokens |
| Launch price (input) | $2 / million tokens |
| Launch price (output) | $10 / million tokens |
| Standard price (after Aug 31) | $3 / $15 per million |
| Batch API | 50% off on-demand |
| Prompt caching | Up to 90% off cached input |
The 1 million token context window is the same class as the frontier models, so you are not trading away context to save money. The 64,000 token maximum output is generous enough for full documents, large diffs, or long structured reports in a single call. Together, these specs make Sonnet 5 suitable for whole-codebase and whole-document work, not just short prompts.
Benchmarks: How Good Is It?
Sonnet 5's benchmark story is consistent: big jumps on agentic tasks, solid gains on pure coding.
| Benchmark | Sonnet 5 | Sonnet 4.6 | What it measures |
|---|---|---|---|
| SWE-bench Verified | 85.2% | 79.6% | Real GitHub issue fixes |
| SWE-bench Pro | 63.2% | 58.1% | Harder, multi-file fixes |
| Terminal-Bench 2.1 | 80.4% | 67.0% | Command-line task completion |
| BrowseComp (single agent) | 84.7% | ~60% | Web research and navigation |
| OSWorld-Verified | 81.2% | ~70% | Computer and desktop control |
| BigLaw Bench | 91.3% | — | Law-firm-grade legal tasks |
The gains of 13 to 25 points on Terminal-Bench, BrowseComp, and OSWorld are the real story. Those are the benchmarks that predict whether an autonomous agent will actually finish a job instead of getting stuck halfway. On raw software engineering, the 85.2% SWE-bench Verified score edges out even Opus 4.8 at 80.8%, while trailing the very top competitor models by a few points.
Why the agentic scores matter most
A model that scores well on single-shot coding but stalls on multi-step execution makes a poor agent. Sonnet 5's profile is the opposite: it is strong on single tasks and dramatically better at chaining them together. The 91.3% BigLaw Bench result also signals genuine capability on high-stakes professional work, which is why legal and financial firms were named among the first adopters.
Pricing and the August 31 Deadline
The pricing is the second headline. At launch, Sonnet 5 runs at $2 input and $10 output per million tokens, roughly a 33% discount off its standard rate. That promotional pricing ends August 31, 2026, after which it rises to $3 and $15.
For context, Opus 4.8 costs $5 input and $25 output, so Sonnet 5 delivers competitive, and in some cases higher, coding scores at roughly 40% of the price. If you are budgeting an agentic workload, this is the window to lock in usage patterns and benchmark your real costs before the promo expires. The full breakdown, including batch and caching math, is in our Claude Sonnet 5 pricing guide.
Where You Can Use It Today
Sonnet 5 launched everywhere at once, a deliberate move to make it the model you reach for by default wherever you already work.
- Claude.ai on web, iOS, and Android, including the free tier. It is the default model for most users.
- Claude API with model ID claude-sonnet-5.
- Claude Code, Anthropic's agentic coding CLI.
- Cloud platforms: AWS Bedrock, Google Cloud Vertex AI, Microsoft Azure AI Foundry, and Snowflake Cortex AI.
- GitHub Copilot, generally available from launch day.
This simultaneous, multi-platform release means enterprises can adopt Sonnet 5 through whatever billing and governance relationship they already have, removing a common barrier to trying a new model.
How to Read the Benchmarks: A Methodology Deep-Dive
Benchmark numbers are only useful if you understand what they actually measure, and each of Sonnet 5's headline scores maps to a different real-world capability. Treating them as a single "intelligence" figure is the most common mistake teams make when choosing a model.
SWE-bench Verified and SWE-bench Pro
SWE-bench Verified draws from real GitHub issues in open-source Python projects. The model is given a repository and an issue, and must produce a patch that makes the project's hidden test suite pass. It is a pass-or-fail measure of whether the model can fix a genuine bug in an existing codebase, which is why it is the most-cited coding benchmark. Sonnet 5's 85.2% means it resolves roughly six of every seven such issues unaided. SWE-bench Pro raises the difficulty with larger, multi-file changes and more subtle cross-dependencies, which is why every model scores lower on it; the 63.2% versus Opus 4.8's 69.2% is the clearest place Opus still leads.
Terminal-Bench, BrowseComp, and OSWorld
These three are the agentic benchmarks, and they matter most for anyone building autonomous workflows. Terminal-Bench measures whether a model can complete real command-line tasks, running commands, reading output, and recovering from errors. BrowseComp measures web research: navigating pages, following links, and extracting the right answer from the open web. OSWorld measures control of a real desktop environment, clicking, typing, and operating applications. A model can be a strong single-shot coder yet a poor agent if it stalls on these; Sonnet 5's 80.4%, 84.7%, and 81.2% are what qualify it as an agentic model rather than just a chat model.
Why a percentage-point gap may not matter
A 3 to 5 point benchmark difference sounds decisive but often is not, because benchmark suites contain a long tail of ambiguous or near-impossible cases that any model gets wrong. The practical question is not "which model has the highest number" but "which model reliably handles the specific kind of task I run." That is why the guidance throughout this cluster is to benchmark on your own workload rather than defer entirely to public leaderboards.
Enterprise Adoption: Three Case Studies
Anthropic and its partners highlighted concrete early use across three sectors, and each illustrates a different reason Sonnet 5 landed quickly.
Legal: Harvey
Legal AI platform Harvey offered Sonnet 5 on launch day, backed by the 91.3% BigLaw Bench score, a benchmark built specifically around law-firm tasks such as contract analysis, litigation document review, and transactional workflows. For a legal-tech product, accuracy on domain-specific work is the entire value proposition, and a model that surpasses previous Sonnet and Opus versions on legal tasks is an immediate upgrade. The lesson for other buyers: when a model posts a domain-specific benchmark above 90%, adoption in that domain tends to be fast because the quality bar is already met.
Finance: Norges Bank Investment Management
One of the world's largest institutional investors uses Claude for macro financial analysis, synthesizing market research and generating investment-grade analysis. Finance rewards a model that can hold enormous context, a full set of filings, reports, and data, and reason across all of it consistently. Sonnet 5's 1 million token window is a direct fit, and its low price makes running analysis at scale economical rather than a luxury.
Government: California
California announced a statewide deployment with a 50% discount and free workforce-training access for state agencies, one of the largest single public-sector AI adoptions of 2026. Government adoption signals two things to private buyers: that the model has cleared serious procurement and safety review, and that pricing at scale is negotiable. For risk-averse enterprises, a large public-sector deployment is a meaningful trust signal.
The common thread
Coding strength drew similar praise from tooling companies; a Cursor co-founder noted that Sonnet 5 "stays on plan, follows conventions, and ships clean multi-step changes," exactly the behavior that separates a usable coding agent from a frustrating one. Across legal, finance, government, and developer tooling, the pattern is the same: Sonnet 5 is capable enough for high-stakes work and cheap enough to deploy widely, which is precisely the combination that drives fast adoption.
Should You Switch?
If you are on Sonnet 4.6, the answer is almost certainly yes. Sonnet 5 is better on every benchmark and, during the promo, cheaper than 4.6's standard rate. There is no reason to stay.
If you are on Opus 4.8 for cost-sensitive, high-volume agentic work, test Sonnet 5 first, since you may recover most of the quality at a fraction of the cost. Keep Opus for the reasoning-heavy minority of tasks. Our model selection guide walks through exactly where each tier wins, and if you are coming from Claude 4.x generally and want the upgrade checklist, see the Claude 5.0 migration guide.
The Bottom Line
Claude Sonnet 5 is the clearest sign yet that the industry has moved from chat to agents. It combines frontier-class context, top-tier coding scores, standout agentic benchmarks, and aggressive pricing into a model designed to be left running on real work. For most teams and individual builders, it is the new default, and the launch-pricing window makes right now the ideal time to adopt it. If you are a non-technical user comparing the chat products rather than the APIs, start with Claude vs ChatGPT for non-coders.
Frequently Asked Questions
When was Claude Sonnet 5 released?
Anthropic released Claude Sonnet 5 on June 30, 2026, with same-day availability across Claude.ai, the API, Claude Code, and all major cloud platforms including Bedrock, Vertex, Azure, and Snowflake, plus GitHub Copilot. It is the default model for most Claude.ai users, so most people accessed it automatically on launch day.
How much does Claude Sonnet 5 cost?
Launch pricing is $2 per million input tokens and $10 per million output tokens through August 31, 2026, rising to $3 and $15 afterward. Batch processing halves those rates and prompt caching can save up to 90% on repeated input. Chat users on Claude.ai pay a flat subscription instead, from free up to $200 per month.
Is Claude Sonnet 5 better than Sonnet 4.6?
Yes, on every published benchmark, with the largest gains on agentic browser and terminal tasks where it improves 13 to 25 points. It scores 85.2% on SWE-bench Verified versus 79.6%, and 80.4% on Terminal-Bench versus 67.0%. During launch pricing it is also cheaper than Sonnet 4.6's standard rate, making the upgrade a straightforward decision.
What is the context window?
One million input tokens with up to 64,000 output tokens per response. That is large enough to load an entire mid-size codebase, a full set of contracts, or a long research corpus into a single prompt, which is why Sonnet 5 suits whole-codebase and whole-document work rather than only short interactions.
Is it free?
Yes, free-tier Claude.ai users can access Sonnet 5 with usage limits, and it is the default model for most users. Pro at $20 per month raises those limits, and Max plans at $100 and $200 per month suit heavy daily and agentic use. Per-token API pricing only applies when you build custom applications.
Should I upgrade?
From Sonnet 4.6, yes, immediately, since Sonnet 5 is strictly better and cheaper during the promo. From Opus 4.8, test Sonnet 5 for cost-sensitive agentic work, where it recovers most of the quality at around 40% of the price, and keep Opus configured as a fallback for the hardest reasoning-heavy tasks.
Ready to Start Practicing?
300+ scenario-based practice questions covering all 5 CCA domains. Detailed explanations for every answer.
Free CCA Study Kit
Get domain cheat sheets, anti-pattern flashcards, and weekly exam tips. No spam, unsubscribe anytime.