Claude Sonnet 5 Pricing Guide 2026: Costs, Discounts & Calculator

Claude Sonnet 5 pricing explained: $2/$10 launch rate through Aug 31, batch and caching discounts, and real cost scenarios so you can budget before the promo ends.

Short Answer

Claude Sonnet 5 costs $2 per million input tokens and $10 per million output tokens at launch, through August 31, 2026. After that it rises to $3 and $15, a 50% increase. The Batch API halves those rates, and prompt caching can cut the cost of cached input by up to 90%. At launch pricing it is 60% cheaper than Opus 4.8 ($5/$25) for comparable coding quality, making it the value pick of the Claude lineup.

The Pricing Table

Understanding Claude Sonnet 5 pricing starts with the base rates and the two discounts that stack on top.

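| Rate | Launch (through Aug 31) | Standard (from Sept 1) |
|---|---|---|
| Input (per M tokens) | $2 | $3 |
| Output (per M tokens) | $10 | $15 |
| Batch API input (per M tokens) | $1 | $1.50 |
| Batch API output (per M tokens) | $5 | $7.50 |
| Cached input, read (per M tokens) | ~$0.20 | ~$0.30 |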

The Batch API gives 50% off for work that can tolerate a delay, and prompt caching gives up to 90% off input tokens that repeat across calls. At scale, these two discounts matter far more than the base rate, and structuring your workload to use them is the core of cost control.
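
As a rough illustration of how the table turns into a bill, the sketch below encodes the base rates and computes a monthly cost. `RATES`, `BATCH_DISCOUNT`, and `monthly_cost` are hypothetical names chosen for this example, not part of any SDK. It assumes the batch discount is a flat 50% on both input and output, and it ignores caching.

```python
# Per-million-token base rates from the table above, in USD.
RATES = {
    "launch": {"input": 2.00, "output": 10.00},
    "standard": {"input": 3.00, "output": 15.00},
}

BATCH_DISCOUNT = 0.5  # assumption: flat 50% off input and output


def monthly_cost(input_mtok, output_mtok, period="launch", batch=False):
    """Return the USD cost for volumes given in millions of tokens."""
    rate = RATES[period]
    cost = input_mtok * rate["input"] + output_mtok * rate["output"]
    return cost * BATCH_DISCOUNT if batch else cost


# Illustrative workload: 10M input and 2M output tokens per month.
print(monthly_cost(10, 2))                    # on-demand, launch rates
print(monthly_cost(10, 2, period="standard")) # on-demand, standard rates
print(monthly_cost(10, 2, batch=True))        # Batch API, launch rates
```

Keeping the rates in one dictionary makes it easy to re-run the same workload against the September prices.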

The August 31 Deadline Is the Real Story

The single most important fact for budgeting is that launch pricing ends August 31, 2026. From September 1, Sonnet 5 costs 50% more on both input and output.

This is not a reason to panic-migrate, but it is a reason to act deliberately.

  • Benchmark your real workload now, at launch pricing, so you know your true token consumption per task.
  • Model your September cost at $3/$15 before you commit to an architecture.
  • Build caching in early, because if 90% of your input is a stable system prompt and context, caching can more than offset the September price rise.

Teams that do this quietly lock in efficient patterns before the deadline; teams that ignore it get a surprise on their September invoice.
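
The modeling step above can be sketched in a few lines. This example assumes cache reads cost 0.1x the base input rate (consistent with the ~$0.20 and ~$0.30 figures in the pricing table) and ignores any surcharge for writing to the cache. `effective_cost` and its parameters are illustrative names, and the 200M/40M token workload is only an example.

```python
def effective_cost(input_mtok, output_mtok, in_rate, out_rate, cached_fraction=0.0):
    """USD cost when `cached_fraction` of input tokens are cache reads.

    Assumption: cache reads are billed at 0.1x the base input rate and
    cache writes carry no extra charge (a simplification).
    """
    cached = input_mtok * cached_fraction
    fresh = input_mtok - cached
    input_cost = fresh * in_rate + cached * in_rate * 0.1
    return input_cost + output_mtok * out_rate


# Example workload: 200M input and 40M output tokens per month.
launch_uncached = effective_cost(200, 40, 2, 10)
standard_uncached = effective_cost(200, 40, 3, 15)
standard_cached = effective_cost(200, 40, 3, 15, cached_fraction=0.9)

increase = (standard_uncached - launch_uncached) / launch_uncached
print(f"Launch, no caching:   ${launch_uncached:,.2f}")
print(f"Standard, no caching: ${standard_uncached:,.2f} ({increase:.0%} higher)")
print(f"Standard, 90% cached: ${standard_cached:,.2f}")
```

Under these assumptions, the standard-rate bill with 90% of input cached (about $714) comes in below the uncached launch-rate bill ($800). Output tokens are unaffected by caching, so the result depends on how input-heavy the workload is.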

    Cost Scenarios: Do the Math Once

    Scenario A: Solo developer, coding agent

    At 100,000 input plus 20,000 output tokens per day:

    • Launch: (0.1 x $2) + (0.02 x $10) = $0.40 per day, about $12 per month
    • Standard: (0.1 x $3) + (0.02 x $15) = $0.60 per day, about $18 per month

    Effectively free. Individual developers should not think twice about the per-token cost.

    Scenario B: Small team or product feature

    At 200 million input plus 40 million output tokens per month:

    | Model | Monthly cost |
    |---|---|
    | Sonnet 5 (launch) | 200 x $2 + 40 x $10 = $800 |
    | Sonnet 5 (standard) | 200 x $3 + 40 x $15 = $1,200 |
    | Opus 4.8 | 200 x $5 + 40 x $25 = $2,000 |

    Choosing Sonnet 5 over Opus 4.8 here saves $1,200 per month at launch pricing, or $800 per month at standard pricing, at the same volume. See the full model comparison.
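
The Scenario B comparison can be reproduced with a short script, which also makes it easy to swap in your own volumes. The `PRICES` dictionary simply restates the rates quoted in this guide; the variable names are illustrative.

```python
# Per-million-token (input, output) rates quoted in this guide, in USD.
PRICES = {
    "Sonnet 5 (launch)": (2, 10),
    "Sonnet 5 (standard)": (3, 15),
    "Opus 4.8": (5, 25),
}

INPUT_MTOK, OUTPUT_MTOK = 200, 40  # Scenario B monthly volume, in millions

costs = {
    model: INPUT_MTOK * in_rate + OUTPUT_MTOK * out_rate
    for model, (in_rate, out_rate) in PRICES.items()
}

opus = costs["Opus 4.8"]
for model, cost in costs.items():
    saving = opus - cost
    print(f"{model:<20} ${cost:>6,}  saves ${saving:,} ({saving / opus:.0%}) vs Opus 4.8")
```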

    Scenario C: High-volume batch processing

    At 500 million input plus 50 million output tokens per month, via the Batch API:

    • Batch launch: (500 x $1) + (50 x $5) = $750 per month
    • On-demand launch: (500 x $2) + (50 x $10) = $1,500 per month

    The Batch API alone saves $750 per month at this scale. Add caching on the input side and the effective cost drops further.
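
To see how caching might compound with the batch discount in Scenario C, here is a minimal sketch. It assumes the two discounts stack multiplicatively (50% batch discount, then 0.1x for cache reads) and ignores any cache-write surcharge; check your actual invoice rules before relying on it. `batch_cost` is a hypothetical helper, and the 80% cached share is only an example.

```python
def batch_cost(input_mtok, output_mtok, in_rate, out_rate, cached_fraction=0.0):
    """Batch API cost in USD for volumes given in millions of tokens.

    Assumptions: a 50% batch discount on input and output, cache reads
    at 0.1x the (already discounted) input rate, and no cache-write fee.
    """
    batch_in, batch_out = in_rate * 0.5, out_rate * 0.5
    cached = input_mtok * cached_fraction
    input_cost = (input_mtok - cached) * batch_in + cached * batch_in * 0.1
    return input_cost + output_mtok * batch_out


# Scenario C volume at launch rates.
no_cache = batch_cost(500, 50, 2, 10)
with_cache = batch_cost(500, 50, 2, 10, cached_fraction=0.8)
print(f"Batch, no caching:        ${no_cache:,.2f}")
print(f"Batch, 80% input cached:  ${with_cache:,.2f}")
```

Under these assumptions the $750 batch figure falls to roughly $390 when 80% of the input is served from cache.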

    How to Actually Lower Your Bill

    Three levers, in order of impact for agentic workloads.

    1. Prompt caching, the biggest lever

    Agents reuse the same long system prompt and context on every step. Cache it and you pay roughly 0.1x the base input rate for that portion. For a chatty multi-step agent, where most of each request is repeated context, this can cut total input cost by 70 to 90%. If you do one thing to control costs, make your stable context cacheable.
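
The relationship between how much of your input is cached and how much you save is linear, as this small sketch shows. It assumes cache reads cost 0.1x the base input rate and ignores cache-write charges; `input_savings` is an illustrative name.

```python
def input_savings(cached_fraction, cache_read_multiplier=0.1):
    """Fraction of input cost saved when part of the input is read from cache."""
    return cached_fraction * (1 - cache_read_multiplier)


for fraction in (0.5, 0.8, 0.9, 1.0):
    print(f"{fraction:.0%} cached -> {input_savings(fraction):.0%} off input cost")
```

Under this model, a 70 to 90% reduction in input cost corresponds to roughly 78% to 100% of input tokens being cache reads.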

    2. The Batch API

    Any workload that does not need a real-time response, such as overnight document processing, bulk classification, or evaluation runs, should go through Batch for an automatic 50% cut. Many teams leave this money on the table simply because they default to on-demand.

    3. Model routing

    Send simple sub-tasks to Haiku 4.5 at $1/$5 and only escalate to Sonnet 5 when needed. Our cost optimization guide covers routing patterns in depth, including how to decide the escalation threshold.
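
A back-of-the-envelope view of what routing could save, as a sketch only. It assumes token volume splits proportionally between the two models and ignores the cost of the routing decision itself and of any retries after escalation. `routed_cost` and `haiku_share` are hypothetical names, and the 200M/40M volume is an example.

```python
HAIKU = (1, 5)    # Haiku 4.5 rates quoted above, USD per million tokens
SONNET = (2, 10)  # Sonnet 5 launch rates


def routed_cost(input_mtok, output_mtok, haiku_share):
    """Blended USD cost when `haiku_share` of token volume goes to Haiku."""
    def cost(rates, share):
        return share * (input_mtok * rates[0] + output_mtok * rates[1])

    return cost(HAIKU, haiku_share) + cost(SONNET, 1 - haiku_share)


# Example: 200M input and 40M output tokens per month.
print(routed_cost(200, 40, haiku_share=0.0))  # everything on Sonnet 5
print(routed_cost(200, 40, haiku_share=0.5))  # half the volume on Haiku 4.5
```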

    What About Chat Subscriptions?

    The per-token pricing above is for the API. If you use Sonnet 5 through Claude.ai as a chat assistant, you pay a flat subscription instead:

    | Plan | Price | Best for |
    |---|---|---|
    | Free | $0 | Light use, trying it out |
    | Pro | $20/month | Daily professional use |
    | Max | $100 or $200/month | Heavy and agentic use |

    Most individual professionals never touch per-token pricing; they simply pay the subscription. Per-token costs matter when you build custom applications on the API.

    Is Sonnet 5 Worth It vs Competitors?

    Against frontier competitor models priced around $5/$30, Sonnet 5 at $2/$10 is 60% cheaper on input and about 67% cheaper on output, and it produces code that developers widely describe as cleaner and better-commented even where it trails by a few benchmark points. For price-to-performance in 2026, it is hard to beat. The full launch context is in everything you need to know about Claude Sonnet 5.

    The Bottom Line

    Claude Sonnet 5 is priced to be the default. The base rates are already low, the launch discount makes usage through August 31, 2026 cheaper still, and batch plus caching can cut real costs by more than half again. The action for any team is clear: benchmark now at launch pricing, build caching in from the start, and route trivial work to Haiku, so your September bill reflects an efficient architecture rather than a naive one.

    Frequently Asked Questions

    How much does Claude Sonnet 5 cost per million tokens?

    $2 input and $10 output at launch, through August 31, 2026, then $3 and $15 standard, a 50% increase. Batch processing halves those rates and prompt caching can cut repeated input by up to 90%. So while the headline number is $2/$10, the effective cost for a well-structured high-volume workload can be considerably lower once discounts are applied.

    When does launch pricing end?

    August 31, 2026. Standard pricing of $3/$15 per million tokens applies from September 1, with no automatic grandfathering. The practical move is to benchmark your real workload now at launch pricing, model your September cost in advance, and build prompt caching in early, since caching can more than offset the price rise for workloads with a large stable context.

    How do I reduce costs?

    Three levers: prompt caching for up to 90% off repeated input, the Batch API for 50% off non-urgent work, and routing simple tasks to Haiku 4.5. Caching is the biggest lever for agents because they resend the same context every step. Structuring your long system prompt and context to be cacheable is the single most effective cost optimization for high-volume agentic use.

    Is it cheaper than Opus 4.8?

    Yes. At launch pricing, Sonnet 5 ($2/$10) is 60% cheaper than Opus 4.8 ($5/$25) on both input and output; at the standard $3/$15 rate the gap narrows to 40%. Sonnet 5 delivers comparable or better coding scores for most tasks. That combination of lower price and strong quality is exactly why Anthropic positions Sonnet 5 as the new default, with Opus reserved for the hardest reasoning work.

    What does a coding agent cost?

    A developer at about 100,000 input and 20,000 output tokens per day pays roughly $0.40 daily at launch pricing, essentially free. A small team at 200 million input and 40 million output tokens per month pays about $800 monthly at launch, before batch and caching discounts. Those discounts can cut the real figure substantially for workloads with heavy context reuse.

    Does chat use cost per token?

    No. Per-token pricing applies only to the API. On Claude.ai, chat is a flat subscription: free with limits, Pro at $20 per month, and Max at $100 or $200 per month. Most individual professionals only pay the subscription and never see a per-token bill. Per-token costs come into play when you build custom applications on the Claude API.