Claude Sonnet 5 Pricing Guide 2026: Costs, Discounts & Calculator

Short Answer

Claude Sonnet 5 costs $2 per million input tokens and $10 per million output tokens at launch, through August 31, 2026. After that it rises to $3 and $15, a 50% increase. The Batch API halves those rates, and prompt caching can cut cached input by up to 90%. It is roughly 60% cheaper than Opus 4.8 at launch pricing (40% cheaper at standard rates) for comparable coding quality, making it the value pick of the Claude lineup.

The Pricing Table

Understanding Claude Sonnet 5 pricing starts with the base rates and the two discounts that stack on top.

Rate	Launch (through Aug 31)	Standard (from Sept 1)
Input (per M tokens)	$2	$3
Output (per M tokens)	$10	$15
Batch API input	$1	$1.50
Batch API output	$5	$7.50
Cached input (read)	~$0.20	~$0.30

The Batch API gives 50% off for work that can tolerate a delay, and prompt caching gives up to 90% off input tokens that repeat across calls. At scale, these two discounts matter far more than the base rate, and structuring your workload to use them is the core of cost control.
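
The derived rows of the table follow from the base rates and two multipliers. As a minimal sketch, assuming the Batch API is a flat 0.5x on both input and output and a cache read bills at about 0.1x of the input rate, the table can be regenerated like this. The names BASE_RATES, BATCH_MULTIPLIER, CACHE_READ_MULTIPLIER, and effective_rates are illustrative and not part of any Anthropic SDK.

```python
# Base prices in USD per million tokens, copied from the table above.
BASE_RATES = {
    "launch": {"input": 2.00, "output": 10.00},
    "standard": {"input": 3.00, "output": 15.00},
}

# Assumed multipliers: Batch API at 50% off, cache reads at roughly 0.1x input.
BATCH_MULTIPLIER = 0.5
CACHE_READ_MULTIPLIER = 0.1


def effective_rates(period: str) -> dict[str, float]:
    """Return every row of the pricing table for one pricing period."""
    base = BASE_RATES[period]
    return {
        "input": base["input"],
        "output": base["output"],
        "batch_input": round(base["input"] * BATCH_MULTIPLIER, 2),
        "batch_output": round(base["output"] * BATCH_MULTIPLIER, 2),
        "cached_input": round(base["input"] * CACHE_READ_MULTIPLIER, 2),
    }


if __name__ == "__main__":
    for period in BASE_RATES:
        print(period, effective_rates(period))
```

Keeping the multipliers as named constants makes it easy to rerun the numbers if the actual discount terms differ from the approximations used here.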

The August 31 Deadline Is the Real Story

The single most important fact for budgeting is that launch pricing ends August 31, 2026. From September 1, Sonnet 5 costs 50% more on both input and output.

This is not a reason to panic-migrate, but it is a reason to act deliberately.

Benchmark your real workload now, at launch pricing, so you know your true token consumption per task.

Model your September cost at $3/$15 before you commit to an architecture.

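Build caching in early. If 90% of your input is a stable system prompt and context read from cache at roughly 0.1x, your blended input rate at standard pricing works out to about $0.57 per million tokens, well under the uncached launch rate of $2, so caching can more than offset the September rise on the input side.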

Teams that do this quietly lock in efficient patterns before the deadline; teams that ignore it get a surprise on their September invoice.
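
One rough way to model the September question is to ask what share of your input needs to be a cache read before the standard input rate costs no more than the uncached launch rate. The sketch below assumes cached reads bill at about 0.1x, ignores any cache-write surcharge, and looks only at input tokens. The function names are hypothetical.

```python
LAUNCH_INPUT = 2.00      # USD per million input tokens through Aug 31, 2026
STANDARD_INPUT = 3.00    # USD per million input tokens from Sept 1, 2026
CACHE_READ_MULTIPLIER = 0.1  # assumed: cached reads bill at roughly 0.1x


def blended_input_rate(base_rate: float, cached_fraction: float) -> float:
    """Average USD per million input tokens when part of the input is a cache read."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    fresh = 1.0 - cached_fraction
    return base_rate * (fresh + cached_fraction * CACHE_READ_MULTIPLIER)


def break_even_cached_fraction(old_rate: float, new_rate: float) -> float:
    """Cached fraction at which the new blended rate equals the old uncached rate."""
    # Solve new_rate * (1 - f * (1 - multiplier)) = old_rate for f.
    return (1.0 - old_rate / new_rate) / (1.0 - CACHE_READ_MULTIPLIER)


if __name__ == "__main__":
    f = break_even_cached_fraction(LAUNCH_INPUT, STANDARD_INPUT)
    print(f"Break-even cached fraction: {f:.1%}")
    rate = blended_input_rate(STANDARD_INPUT, 0.9)
    print(f"Standard rate with 90% cached: ${rate:.2f} per million")
```

Under these assumptions, caching a little over a third of your input tokens is enough to cancel the input-side increase. Output tokens still bill at the higher rate, so the break-even for the whole invoice depends on your input-to-output mix.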

Cost Scenarios: Do the Math Once

Scenario A: Solo developer, coding agent

At 100,000 input plus 20,000 output tokens per day:

Launch: (0.1 x $2) + (0.02 x $10) = $0.40 per day, about $12 per month
Standard: (0.1 x $3) + (0.02 x $15) = $0.60 per day, about $18 per month

Effectively free. Individual developers should not think twice about the per-token cost.

Scenario B: Small team or product feature

At 200 million input plus 40 million output tokens per month:

Model	Monthly cost
Sonnet 5 (launch)	200 x $2 + 40 x $10 = $800
Sonnet 5 (standard)	200 x $3 + 40 x $15 = $1,200
Opus 4.8	200 x $5 + 40 x $25 = $2,000

Choosing Sonnet 5 over Opus 4.8 here saves $1,200 per month at launch pricing, or $800 per month at standard rates, at the same volume. See the full model comparison.

Scenario C: High-volume batch processing

At 500 million input plus 50 million output tokens per month, via the Batch API:

Batch launch: (500 x $1) + (50 x $5) = $750 per month
On-demand launch: (500 x $2) + (50 x $10) = $1,500 per month

The Batch API alone saves $750 per month at this scale. Add caching on the input side and the effective cost drops further.
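
The three scenarios above can be reproduced with a few lines of arithmetic. This is a sketch, not an official calculator: the keys in RATES are labels chosen for this example rather than API model identifiers, and the batch discount is assumed to be a flat 50% on the whole request.

```python
# USD per million tokens (input, output), as quoted in this guide.
RATES = {
    "sonnet-5-launch": (2.00, 10.00),
    "sonnet-5-standard": (3.00, 15.00),
    "opus-4.8": (5.00, 25.00),
}


def cost_usd(input_millions: float, output_millions: float,
             rate_key: str, batch: bool = False) -> float:
    """Cost in USD for a volume given in millions of tokens."""
    input_rate, output_rate = RATES[rate_key]
    total = input_millions * input_rate + output_millions * output_rate
    # Assumed: the Batch API applies a flat 50% discount to the whole request.
    return round(total * (0.5 if batch else 1.0), 2)


if __name__ == "__main__":
    # Scenario A: 100,000 input and 20,000 output tokens per day.
    print("A, per day:", cost_usd(0.1, 0.02, "sonnet-5-launch"))
    # Scenario B: 200M input and 40M output tokens per month.
    for key in RATES:
        print("B,", key, cost_usd(200, 40, key))
    # Scenario C: 500M input and 50M output tokens per month.
    print("C, batch:", cost_usd(500, 50, "sonnet-5-launch", batch=True))
    print("C, on-demand:", cost_usd(500, 50, "sonnet-5-launch"))
```

Swapping in your own measured monthly volumes is the quickest way to see which scenario your workload resembles.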

How to Actually Lower Your Bill

Three levers, in order of impact for agentic workloads.

1. Prompt caching, the biggest lever

Agents reuse the same long system prompt and context on every step. Cache it and you pay roughly 0.1x for that portion. For a chatty multi-step agent, this can cut total input cost by 70 to 90%. If you do one thing to control costs, make your stable context cacheable.
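
To see where a figure in the 70 to 90% range can come from, here is a toy model of a multi-step agent run. It assumes each step resends the whole conversation so far, that everything sent on the previous step is read from cache at about 0.1x on the next, and that cache-write surcharges and cache expiry can be ignored. The function and parameter names are made up for illustration.

```python
CACHE_READ_MULTIPLIER = 0.1  # assumed: cached reads bill at roughly 0.1x


def agent_input_savings(system_tokens: int, tokens_added_per_step: int,
                        steps: int) -> float:
    """Fraction of input cost saved by caching across one multi-step agent run."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    total_tokens = 0.0        # input tokens sent, summed over all steps
    billed_equivalent = 0.0   # same total, with cached tokens weighted at 0.1x
    previous_prefix = 0       # tokens already cached from the prior step
    for step in range(steps):
        prompt = system_tokens + step * tokens_added_per_step
        fresh = prompt - previous_prefix
        total_tokens += prompt
        billed_equivalent += fresh + previous_prefix * CACHE_READ_MULTIPLIER
        previous_prefix = prompt
    return 1.0 - billed_equivalent / total_tokens


if __name__ == "__main__":
    # Hypothetical run: 20,000-token system prompt, 1,500 new tokens per step.
    saved = agent_input_savings(20_000, 1_500, 30)
    print(f"Input cost saved by caching: {saved:.1%}")
```

For this hypothetical 30-step run the saving comes out at about 85% of input cost. Shorter runs or smaller system prompts save less, because the uncached first step is a larger share of the total.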

2. The Batch API

Any workload that does not need a real-time response, such as overnight document processing, bulk classification, or evaluation runs, should go through Batch for an automatic 50% cut. Many teams leave this money on the table simply because they default to on-demand.

3. Model routing

Send simple sub-tasks to Haiku 4.5 at $1/$5 and only escalate to Sonnet 5 when needed. Our cost optimization guide covers routing patterns in depth, including how to decide the escalation threshold.
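
As a back-of-the-envelope illustration of routing, the sketch below blends the two price points by the share of token volume sent to Haiku 4.5. It assumes routed tasks have the same input-to-output mix as the rest of the workload and does not model the extra tokens spent when a task is escalated and retried. The function name is hypothetical.

```python
# USD per million tokens (input, output), as quoted in this guide.
HAIKU_45 = (1.00, 5.00)
SONNET_5_LAUNCH = (2.00, 10.00)


def routed_monthly_cost(input_millions: float, output_millions: float,
                        haiku_share: float) -> float:
    """Monthly USD cost when a share of the token volume is routed to Haiku 4.5."""
    if not 0.0 <= haiku_share <= 1.0:
        raise ValueError("haiku_share must be between 0 and 1")

    def cost(rates: tuple[float, float], share: float) -> float:
        return share * (input_millions * rates[0] + output_millions * rates[1])

    haiku_cost = cost(HAIKU_45, haiku_share)
    sonnet_cost = cost(SONNET_5_LAUNCH, 1.0 - haiku_share)
    return round(haiku_cost + sonnet_cost, 2)


if __name__ == "__main__":
    # Scenario B volumes: 200M input and 40M output tokens per month.
    for share in (0.0, 0.25, 0.5):
        print(f"{share:.0%} to Haiku:", routed_monthly_cost(200, 40, share))
```

At the Scenario B volumes, routing half of the traffic to Haiku 4.5 takes the launch-priced bill from $800 to $600 per month under these assumptions.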

What About Chat Subscriptions?

The per-token pricing above is for the API. If you use Sonnet 5 through Claude.ai as a chat assistant, you pay a flat subscription instead:

Plan	Price	Best for
Free	$0	Light use, trying it out
Pro	$20/month	Daily professional use
Max	$100 and $200/month	Heavy and agentic use

Most individual professionals never touch per-token pricing; they simply pay the subscription. Per-token costs matter when you build custom applications on the API.

Is Sonnet 5 Worth It vs Competitors?

Against frontier competitor models priced around $5/$30, Sonnet 5 at $2/$10 is roughly 60% cheaper on input and 67% cheaper on output, and it produces code that developers widely describe as cleaner and better-commented even where it trails by a few benchmark points. For price-to-performance in 2026, it is hard to beat. The full launch context is in everything you need to know about Claude Sonnet 5.

The Bottom Line

Claude Sonnet 5 is priced to be the default. The base rates are already low, the launch discount makes the next two months cheaper still, and batch plus caching can cut real costs by more than half again. The action for any team is clear: benchmark now at launch pricing, build caching in from the start, and route trivial work to Haiku, so your September bill reflects an efficient architecture rather than a naive one.

Frequently Asked Questions

How much does Claude Sonnet 5 cost per million tokens?

$2 input and $10 output at launch, through August 31, 2026, then $3 and $15 standard, a 50% increase. Batch processing halves those rates and prompt caching can cut repeated input by up to 90%. So while the headline number is $2/$10, the effective cost for a well-structured high-volume workload can be considerably lower once discounts are applied.

When does launch pricing end?

August 31, 2026. Standard pricing of $3/$15 per million tokens applies from September 1, with no automatic grandfathering. The practical move is to benchmark your real workload now at launch pricing, model your September cost in advance, and build prompt caching in early, since caching can more than offset the price rise for workloads with a large stable context.

How do I reduce costs?

Three levers: prompt caching for up to 90% off repeated input, the Batch API for 50% off non-urgent work, and routing simple tasks to Haiku 4.5. Caching is the biggest lever for agents because they resend the same context every step. Structuring your long system prompt and context to be cacheable is the single most effective cost optimization for high-volume agentic use.

Is it cheaper than Opus 4.8?

Yes: about 60% cheaper on both input and output at launch pricing, and 40% cheaper at standard rates. Sonnet 5 is $2/$10 at launch, or $3/$15 standard, versus Opus 4.8 at $5/$25, while delivering comparable or better coding scores for most tasks. That combination of lower price and strong quality is exactly why Anthropic positions Sonnet 5 as the new default, with Opus reserved for the hardest reasoning work.

What does a coding agent cost?

A developer at about 100,000 input and 20,000 output tokens per day pays roughly $0.40 daily at launch pricing, essentially free. A small team at 200 million input and 40 million output tokens per month pays about $800 monthly at launch, before batch and caching discounts. Those discounts can cut the real figure substantially for workloads with heavy context reuse.

Does chat use cost per token?

No. Per-token pricing applies only to the API. On Claude.ai, chat is a flat subscription: free with limits, Pro at $20 per month, and Max at $100 and $200 per month. Most individual professionals only pay the subscription and never see a per-token bill. Per-token costs come into play when you build custom applications on the Claude API.