
Claude Advisor Tool: Get Opus-Level Intelligence at Sonnet Prices (2026)

The Claude advisor tool lets Sonnet or Haiku consult Opus mid-task—cutting cost by up to 85% while boosting benchmark scores. Complete setup guide with code examples.


Every developer building agentic AI apps hits the same wall: Opus is brilliant but expensive; Sonnet and Haiku are cheap but miss edge cases that tank production reliability. For months, the only fix was to either accept the cost or accept the quality compromise.

Anthropic just broke that tradeoff.

The Claude advisor tool—now in public beta as of April 2026—lets a fast, low-cost executor model (Sonnet 4.6 or Haiku 4.5) silently escalate to Claude Opus mid-task whenever it needs strategic guidance. The result: Opus-grade reasoning at the critical decision points, Sonnet-grade pricing for everything else.

Here's everything you need to know to ship it today.

What Is the Claude Advisor Tool?

The advisor tool is a new API primitive that pairs two models on a single task:

  • Executor model — Sonnet 4.6 or Haiku 4.5. Does all the tool calls, reads outputs, generates responses. Billed at standard Sonnet/Haiku rates.
  • Advisor model — Claude Opus 4.6 (or 4.7). Sees the full transcript at key junctures and provides strategic guidance. Generates ~400–700 tokens per consultation. Billed at Opus rates only for those tokens.

Anthropic runs the advisor inference server-side. When the executor calls the advisor tool, Anthropic passes the full conversation context—system prompt, all tool definitions, all prior turns and tool results—to Opus. Opus replies with a concise recommendation. The executor then continues acting on that guidance.

The key insight: most tokens in an agentic task don't require Opus-level reasoning. Fetching search results, parsing JSON, calling APIs—Sonnet handles all of that fine. Opus only needs to weigh in at decision forks: "which approach do I take here?", "does this plan make sense?", "how do I recover from this error?"

How the Advisor Pattern Actually Works

Think of it like a junior analyst working alongside a senior strategist. The analyst does the legwork—running queries, gathering data, drafting outputs. The strategist gets consulted at key moments where experience matters. You're not paying the strategist's hourly rate for every spreadsheet cell.

In technical terms:

  • You send a request to the Messages API with your executor model (e.g., claude-sonnet-4-6)
  • You include the advisor tool in the tools array
  • The executor runs normally. When it hits a decision that warrants escalation, it calls the advisor tool
  • Anthropic spins up an Opus inference pass with the full context
  • Opus returns a recommendation (just text—it doesn't make tool calls)
  • The executor receives the guidance as a tool_result and continues the task
  • The executor controls when to escalate. You can cap consultations per run with max_uses to keep costs predictable.
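Concretely, a consultation surfaces in the transcript like any other tool call and result. The block shapes below are an illustrative guess at that exchange, not the documented wire format, and the helper that pairs questions with guidance is hypothetical:

```python
# Illustrative transcript of one advisor consultation. Field names here
# mirror generic tool_use / tool_result blocks; the exact advisor wire
# format is an assumption, not taken from Anthropic's docs.
transcript = [
    {
        "role": "assistant",
        "content": [
            {"type": "text", "text": "I need guidance before refactoring."},
            {
                "type": "tool_use",
                "id": "toolu_01",
                "name": "advisor",
                "input": {"question": "Wrap the session layer, or replace it?"},
            },
        ],
    },
    {
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": "toolu_01",
                "content": "Wrap the existing session layer behind an interface "
                           "and add OAuth2 as a parallel strategy.",
            }
        ],
    },
]

def advisor_consultations(messages):
    """Pair each advisor question with the guidance that came back."""
    pending = {}  # tool_use_id -> question text
    pairs = []
    for message in messages:
        for block in message["content"]:
            if block.get("type") == "tool_use" and block.get("name") == "advisor":
                pending[block["id"]] = block["input"]["question"]
            elif block.get("type") == "tool_result" and block.get("tool_use_id") in pending:
                pairs.append((pending.pop(block["tool_use_id"]), block["content"]))
    return pairs

consultations = advisor_consultations(transcript)
```

Logging these pairs is a cheap way to audit what your agent escalates and whether the guidance is actually changing its behavior.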

    Setting It Up: Complete Code Example

    Access is open to all API users—no waitlist, no application. You just need the beta header.

    Python

    import anthropic
    
    client = anthropic.Anthropic()
    
    response = client.beta.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=8096,
        betas=["advisor-tool-2026-03-01"],
        tools=[
            {
                "type": "advisor_20260301",
                "name": "advisor",
                "model": "claude-opus-4-6",
                "max_uses": 3,  # cap Opus consultations per run
            },
            # your other tools here (web_search, bash, etc.)
        ],
        system="You are a software engineering agent. Use the advisor tool when you need strategic guidance on architecture decisions or complex debugging.",
        messages=[
            {
                "role": "user",
                "content": "Refactor this authentication module to support OAuth2 without breaking the existing session-based flow."
            }
        ]
    )
    
    print(response.content)

    TypeScript

    import Anthropic from "@anthropic-ai/sdk";
    
    const client = new Anthropic();
    
    const response = await client.beta.messages.create({
      model: "claude-sonnet-4-6",
      max_tokens: 8096,
      betas: ["advisor-tool-2026-03-01"],
      tools: [
        {
          type: "advisor_20260301",
          name: "advisor",
          model: "claude-opus-4-6",
          max_uses: 3,
        },
        // your domain tools
      ],
      system:
        "You are a code review agent. Consult the advisor for architectural decisions and security-sensitive changes.",
      messages: [
        {
          role: "user",
          content: "Review this PR diff and flag any issues.",
        },
      ],
    });

    The max_uses parameter is your cost control lever. Set it to 1 for lightweight tasks where you just want a single sanity check. Go up to 5 for complex multi-step workflows with multiple branching decisions. The advisor token usage appears in response.usage broken out by model, so you can monitor it precisely.
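Because advisor tokens are reported separately, you can meter the two models per run. The nested "advisor" field and the dollar rates in this sketch are assumptions for illustration; inspect response.usage in your own responses for the real field names and current pricing:

```python
# Sketch of per-model cost accounting from a usage payload. The nested
# "advisor" breakdown and the rates below are placeholders, not
# Anthropic's documented schema or published prices.
SAMPLE_RATES_PER_MTOK = {  # hypothetical input/output blended rates, USD per million tokens
    "claude-sonnet-4-6": 3.00,
    "claude-opus-4-6": 15.00,
}

def consultation_cost(usage, executor_model, advisor_model):
    """Split a run's token usage into executor vs. advisor dollar cost."""
    executor_tokens = usage["input_tokens"] + usage["output_tokens"]
    advisor = usage.get("advisor", {"input_tokens": 0, "output_tokens": 0})
    advisor_tokens = advisor["input_tokens"] + advisor["output_tokens"]
    executor_cost = executor_tokens * SAMPLE_RATES_PER_MTOK[executor_model] / 1_000_000
    advisor_cost = advisor_tokens * SAMPLE_RATES_PER_MTOK[advisor_model] / 1_000_000
    return executor_cost, advisor_cost

# Example payload: 5,000 executor tokens plus ~1,800 advisor tokens.
usage = {"input_tokens": 4_000, "output_tokens": 1_000,
         "advisor": {"input_tokens": 1_200, "output_tokens": 600}}
exec_cost, adv_cost = consultation_cost(usage, "claude-sonnet-4-6", "claude-opus-4-6")
```

Wiring this into your request logging makes it easy to spot runs where the advisor is consulted more often than max_uses was meant to allow for.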

    Real-World Benchmarks: What the Numbers Actually Show

    Anthropic published internal benchmark results at launch. The gains are meaningful—especially for smaller models.

    Setup                        SWE-bench Multilingual   BrowseComp
    Sonnet 4.6 alone             72.1%                    —
    Sonnet 4.6 + Opus advisor    74.8% (+2.7 pts)         —
    Haiku 4.5 alone              —                        19.7%
    Haiku 4.5 + Opus advisor     —                        41.2% (+21.5 pts)

    The Haiku result is striking: +21.5 percentage points on BrowseComp just by adding an Opus advisor. BrowseComp is a web research task that requires synthesizing information across many pages and making judgment calls—exactly the scenario where big-model reasoning at key junctures pays off.

    Overall cost per agentic task dropped 11.9% in Anthropic's internal testing. In practice, savings will vary by use case. Tasks with lots of straightforward tool calls (file reads, API calls, search queries) see bigger savings. Tasks that are almost entirely strategic reasoning may not benefit as much.

    When to Use the Advisor Tool (and When Not To)

    Use it for:
    • Long-running coding agents — Haiku/Sonnet handles boilerplate; Opus reviews architecture decisions, security-sensitive changes, and recovery from errors
    • Research agents — Sonnet browses and retrieves; Opus synthesizes findings and decides what's worth pursuing
    • Customer support automations — Haiku handles routine queries; Opus escalates intelligently on edge cases
    • Data pipeline agents — Sonnet transforms data; Opus decides schema strategy and handles ambiguous cases

    Skip it for:
    • Simple single-turn completions — If you're not running a multi-step agent, the advisor adds overhead without benefit
    • Tasks that are already purely strategic — If the entire task is a complex reasoning problem, just use Opus directly
    • Latency-critical paths — Each advisor consultation adds a round-trip Opus inference. For sub-second response requirements, the added latency may not be acceptable

    Cost Breakdown: What You Actually Pay

    The billing model is straightforward:

    • Executor tokens — billed at the executor model's rate (Sonnet 4.6 or Haiku 4.5)
    • Advisor tokens — billed at Claude Opus 4.6 rates, but only for the ~400–700 tokens per consultation

    If your agent runs 1,000 tokens of Sonnet work and calls the advisor twice at ~500 tokens each, you pay 1,000 × Sonnet rate + 1,000 × Opus rate. Given that Opus input costs roughly 15× Haiku and 3× Sonnet, the math works heavily in your favor versus running the whole task on Opus whenever advisor consultations represent less than ~20% of total token volume.

    For a typical agentic coding task (5,000 executor tokens, 3 advisor consultations at 600 tokens each):

    • Without advisor — 5,000 × Sonnet rate
    • With advisor — 5,000 × Sonnet rate + 1,800 × Opus rate

    With Opus at roughly 3× the Sonnet rate, those 1,800 advisor tokens cost about as much as 5,400 executor tokens, so the run costs roughly double a Sonnet-only run, yet still comes in around a third cheaper than sending all 5,000 tokens to Opus. For production workloads where hallucinations or bad decisions cost real money, that's a strong ROI.
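The worked example above can be sketched as a quick cost model. The per-token rates are placeholders (Opus assumed at roughly 3× the Sonnet rate, per the ratio quoted earlier), not published pricing:

```python
# Quick cost model for the blended-billing example. Rates are
# placeholders for illustration, not Anthropic's published prices.
SONNET_RATE = 3.0 / 1_000_000  # USD per token (placeholder)
OPUS_RATE = 9.0 / 1_000_000    # USD per token (placeholder, 3x Sonnet)

def blended_cost(executor_tokens, advisor_tokens):
    """Cost of a run mixing Sonnet executor tokens with Opus advisor tokens."""
    return executor_tokens * SONNET_RATE + advisor_tokens * OPUS_RATE

sonnet_only = blended_cost(5_000, 0)         # no advisor consultations
with_advisor = blended_cost(5_000, 3 * 600)  # 3 consultations x 600 tokens
opus_only = 5_000 * OPUS_RATE                # whole task run on Opus

print(f"Sonnet only:  ${sonnet_only:.4f}")
print(f"With advisor: ${with_advisor:.4f}")
print(f"Opus only:    ${opus_only:.4f}")
```

Swapping in Haiku executor rates widens the gap further, since the cheap tokens get cheaper while the advisor tokens stay fixed.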

    Connecting This to Claude Certification

    If you're preparing for the Claude Certified Architect (CCA-F) exam, the advisor pattern is exactly the kind of architectural decision-making that appears in scenario questions. Knowing when to use Opus vs. Sonnet vs. Haiku—and how to structure multi-model agent systems—is tested directly.

    The advisor tool also surfaces important concepts around:

    • Agentic system design — how to decompose tasks across model tiers
    • Cost vs. capability tradeoffs — a core CCA topic
    • Tool use and orchestration — the advisor tool is itself a tool in the API sense
    • Beta feature adoption — understanding Anthropic's release cadence and beta headers

    Key Takeaways

    • The Claude advisor tool pairs a cheap executor (Sonnet/Haiku) with Opus as an on-demand consultant, cutting cost without sacrificing accuracy at critical decision points
    • Enable it with the anthropic-beta: advisor-tool-2026-03-01 header; add the advisor_20260301 tool to your tools array; no waitlist needed
    • Haiku with Opus advisor jumped from 19.7% → 41.2% on BrowseComp—a 21.5 point gain from strategic guidance alone
    • Use max_uses to cap advisor consultations and keep costs predictable
    • Best for long-running agents with a mix of routine operations and strategic decision points

    Next Steps

    Ready to build with the advisor tool? Start with Anthropic's official advisor tool documentation for the full API reference.

    Preparing for the Claude Certified Architect exam? The CCA-F covers multi-model system design, agentic patterns, and cost optimization, all of which the advisor tool exercises in practice. Check out our CCA practice test bank with 200+ questions built around real exam scenarios.

    Want to go deeper on Claude API architecture? Our Claude API beginner's guide covers the full API surface, including beta features, streaming, and tool use from scratch.
    Sources: Anthropic advisor tool docs · The Advisor Strategy — Anthropic blog · Builder.io: Claude Advisor API · MindwiredAI: Cut Claude API costs by 85%
