
Claude Advisor Tool: Get Opus-Level Intelligence at Sonnet Prices (2026)

The Claude advisor tool lets Sonnet or Haiku consult Opus mid-task—cutting cost by up to 85% while boosting benchmark scores. Complete setup guide with code examples.


Every developer building agentic AI apps hits the same wall: Opus is brilliant but expensive; Sonnet and Haiku are cheap but miss edge cases that tank production reliability. For months, the only fix was to either accept the cost or accept the quality compromise.

Anthropic just broke that tradeoff.

The Claude advisor tool—now in public beta as of April 2026—lets a fast, low-cost executor model (Sonnet 4.6 or Haiku 4.5) silently escalate to Claude Opus mid-task whenever it needs strategic guidance. The result: Opus-grade reasoning at the critical decision points, Sonnet-grade pricing for everything else.

Here's everything you need to know to ship it today.

What Is the Claude Advisor Tool?

The advisor tool is a new API primitive that pairs two models on a single task:

  • Executor model — Sonnet 4.6 or Haiku 4.5. Does all the tool calls, reads outputs, generates responses. Billed at standard Sonnet/Haiku rates.
  • Advisor model — Claude Opus 4.6 (or 4.7). Sees the full transcript at key junctures and provides strategic guidance. Generates ~400–700 tokens per consultation. Billed at Opus rates only for those tokens.

Anthropic runs the advisor inference server-side. When the executor calls the advisor tool, Anthropic passes the full conversation context—system prompt, all tool definitions, all prior turns and tool results—to Opus. Opus replies with a concise recommendation. The executor then continues acting on that guidance.

The key insight: most tokens in an agentic task don't require Opus-level reasoning. Fetching search results, parsing JSON, calling APIs—Sonnet handles all of that fine. Opus only needs to weigh in at decision forks: "which approach do I take here?", "does this plan make sense?", "how do I recover from this error?"

How the Advisor Pattern Actually Works

Think of it like a junior analyst working alongside a senior strategist. The analyst does the legwork—running queries, gathering data, drafting outputs. The strategist gets consulted at key moments where experience matters. You're not paying the strategist's hourly rate for every spreadsheet cell.

In technical terms:

  • You send a request to the Messages API with your executor model (e.g., claude-sonnet-4-6)
  • You include the advisor tool in the tools array
  • The executor runs normally. When it hits a decision that warrants escalation, it calls the advisor tool
  • Anthropic spins up an Opus inference pass with the full context
  • Opus returns a recommendation (just text—it doesn't make tool calls)
  • The executor receives the guidance as a tool_result and continues the task
  • The executor controls when to escalate. You can cap consultations per run with max_uses to keep costs predictable.
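Concretely, a consultation surfaces in the transcript like any other tool call and result. The block shapes below are an illustrative guess at that exchange, not the documented wire format, and the helper that pairs questions with guidance is hypothetical:

```python
# Illustrative transcript of one advisor consultation. Field names here
# mirror generic tool_use / tool_result blocks; the exact advisor wire
# format is an assumption, not taken from Anthropic's docs.
transcript = [
    {
        "role": "assistant",
        "content": [
            {"type": "text", "text": "I need guidance before refactoring."},
            {
                "type": "tool_use",
                "id": "toolu_01",
                "name": "advisor",
                "input": {"question": "Wrap the session layer, or replace it?"},
            },
        ],
    },
    {
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": "toolu_01",
                "content": "Wrap the existing session layer behind an interface "
                           "and add OAuth2 as a parallel strategy.",
            }
        ],
    },
]

def advisor_consultations(messages):
    """Pair each advisor question with the guidance that came back."""
    pending = {}  # tool_use_id -> question text
    pairs = []
    for message in messages:
        for block in message["content"]:
            if block.get("type") == "tool_use" and block.get("name") == "advisor":
                pending[block["id"]] = block["input"]["question"]
            elif block.get("type") == "tool_result" and block.get("tool_use_id") in pending:
                pairs.append((pending.pop(block["tool_use_id"]), block["content"]))
    return pairs

consultations = advisor_consultations(transcript)
```

Logging these pairs is a cheap way to audit what your agent escalates and whether the guidance is actually changing its behavior.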

    Setting It Up: Complete Code Example

    Access is open to all API users—no waitlist, no application. You just need the beta header.

    Python

    import anthropic
    
    client = anthropic.Anthropic()
    
    response = client.beta.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=8096,
        betas=["advisor-tool-2026-03-01"],
        tools=[
            {
                "type": "advisor_20260301",
                "name": "advisor",
                "model": "claude-opus-4-6",
                "max_uses": 3,  # cap Opus consultations per run
            },
            # your other tools here (web_search, bash, etc.)
        ],
        system="You are a software engineering agent. Use the advisor tool when you need strategic guidance on architecture decisions or complex debugging.",
        messages=[
            {
                "role": "user",
                "content": "Refactor this authentication module to support OAuth2 without breaking the existing session-based flow."
            }
        ]
    )
    
    print(response.content)

    TypeScript

    import Anthropic from "@anthropic-ai/sdk";
    
    const client = new Anthropic();
    
    const response = await client.beta.messages.create({
      model: "claude-sonnet-4-6",
      max_tokens: 8096,
      betas: ["advisor-tool-2026-03-01"],
      tools: [
        {
          type: "advisor_20260301",
          name: "advisor",
          model: "claude-opus-4-6",
          max_uses: 3,
        },
        // your domain tools
      ],
      system:
        "You are a code review agent. Consult the advisor for architectural decisions and security-sensitive changes.",
      messages: [
        {
          role: "user",
          content: "Review this PR diff and flag any issues.",
        },
      ],
    });

    The max_uses parameter is your cost control lever. Set it to 1 for lightweight tasks where you just want a single sanity check. Go up to 5 for complex multi-step workflows with multiple branching decisions. The advisor token usage appears in response.usage broken out by model, so you can monitor it precisely.
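Because advisor tokens are reported separately, you can meter the two models per run. The nested "advisor" field and the dollar rates in this sketch are assumptions for illustration; inspect response.usage in your own responses for the real field names and current pricing:

```python
# Sketch of per-model cost accounting from a usage payload. The nested
# "advisor" breakdown and the rates below are placeholders, not
# Anthropic's documented schema or published prices.
SAMPLE_RATES_PER_MTOK = {  # hypothetical input/output blended rates, USD per million tokens
    "claude-sonnet-4-6": 3.00,
    "claude-opus-4-6": 15.00,
}

def consultation_cost(usage, executor_model, advisor_model):
    """Split a run's token usage into executor vs. advisor dollar cost."""
    executor_tokens = usage["input_tokens"] + usage["output_tokens"]
    advisor = usage.get("advisor", {"input_tokens": 0, "output_tokens": 0})
    advisor_tokens = advisor["input_tokens"] + advisor["output_tokens"]
    executor_cost = executor_tokens * SAMPLE_RATES_PER_MTOK[executor_model] / 1_000_000
    advisor_cost = advisor_tokens * SAMPLE_RATES_PER_MTOK[advisor_model] / 1_000_000
    return executor_cost, advisor_cost

# Example payload: 5,000 executor tokens plus ~1,800 advisor tokens.
usage = {"input_tokens": 4_000, "output_tokens": 1_000,
         "advisor": {"input_tokens": 1_200, "output_tokens": 600}}
exec_cost, adv_cost = consultation_cost(usage, "claude-sonnet-4-6", "claude-opus-4-6")
```

Wiring this into your request logging makes it easy to spot runs where the advisor is consulted more often than max_uses was meant to allow for.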

    Real-World Benchmarks: What the Numbers Actually Show

    Anthropic published internal benchmark results at launch. The gains are meaningful—especially for smaller models.

    Setup                        SWE-bench Multilingual   BrowseComp
    Sonnet 4.6 alone             72.1%                    —
    Sonnet 4.6 + Opus advisor    74.8% (+2.7 pts)         —
    Haiku 4.5 alone              —                        19.7%
    Haiku 4.5 + Opus advisor     —                        41.2% (+21.5 pts)

    The Haiku result is striking: +21.5 percentage points on BrowseComp just by adding an Opus advisor. BrowseComp is a web research task that requires synthesizing information across many pages and making judgment calls—exactly the scenario where big-model reasoning at key junctures pays off.

    Overall cost per agentic task dropped 11.9% in Anthropic's internal testing. In practice, savings will vary by use case. Tasks with lots of straightforward tool calls (file reads, API calls, search queries) see bigger savings. Tasks that are almost entirely strategic reasoning may not benefit as much.

    When to Use the Advisor Tool (and When Not To)

    Use it for:
    • Long-running coding agents — Haiku/Sonnet handles boilerplate; Opus reviews architecture decisions, security-sensitive changes, and recovery from errors
    • Research agents — Sonnet browses and retrieves; Opus synthesizes findings and decides what's worth pursuing
    • Customer support automations — Haiku handles routine queries; Opus escalates intelligently on edge cases
    • Data pipeline agents — Sonnet transforms data; Opus decides schema strategy and handles ambiguous cases

    Skip it for:
    • Simple single-turn completions — If you're not running a multi-step agent, the advisor adds overhead without benefit
    • Tasks that are already purely strategic — If the entire task is a complex reasoning problem, just use Opus directly
    • Latency-critical paths — Each advisor consultation adds a round-trip Opus inference. For sub-second response requirements, the added latency may not be acceptable

    Cost Breakdown: What You Actually Pay

    The billing model is straightforward:

    • Executor tokens — billed at the executor model's rate (Sonnet 4.6 or Haiku 4.5)
    • Advisor tokens — billed at Claude Opus 4.6 rates, but only for the ~400–700 tokens per consultation

    If your agent runs 1,000 tokens of Sonnet work and calls the advisor twice at ~500 tokens each, you pay 1,000 × Sonnet rate + 1,000 × Opus rate. Given that Opus input costs roughly 15× Haiku and 3× Sonnet, the math works heavily in your favor versus running the whole task on Opus whenever advisor consultations represent less than ~20% of total token volume.

    For a typical agentic coding task (5,000 executor tokens, 3 advisor consultations at 600 tokens each):

    • Without advisor — 5,000 × Sonnet rate
    • With advisor — 5,000 × Sonnet rate + 1,800 × Opus rate

    With Opus at roughly 3× the Sonnet rate, those 1,800 advisor tokens cost about as much as 5,400 executor tokens, so the run costs roughly double a Sonnet-only run, yet still comes in around a third cheaper than sending all 5,000 tokens to Opus. For production workloads where hallucinations or bad decisions cost real money, that's a strong ROI.
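The worked example above can be sketched as a quick cost model. The per-token rates are placeholders (Opus assumed at roughly 3× the Sonnet rate, per the ratio quoted earlier), not published pricing:

```python
# Quick cost model for the blended-billing example. Rates are
# placeholders for illustration, not Anthropic's published prices.
SONNET_RATE = 3.0 / 1_000_000  # USD per token (placeholder)
OPUS_RATE = 9.0 / 1_000_000    # USD per token (placeholder, 3x Sonnet)

def blended_cost(executor_tokens, advisor_tokens):
    """Cost of a run mixing Sonnet executor tokens with Opus advisor tokens."""
    return executor_tokens * SONNET_RATE + advisor_tokens * OPUS_RATE

sonnet_only = blended_cost(5_000, 0)         # no advisor consultations
with_advisor = blended_cost(5_000, 3 * 600)  # 3 consultations x 600 tokens
opus_only = 5_000 * OPUS_RATE                # whole task run on Opus

print(f"Sonnet only:  ${sonnet_only:.4f}")
print(f"With advisor: ${with_advisor:.4f}")
print(f"Opus only:    ${opus_only:.4f}")
```

Swapping in Haiku executor rates widens the gap further, since the cheap tokens get cheaper while the advisor tokens stay fixed.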

    Connecting This to Claude Certification

    If you're preparing for the Claude Certified Architect (CCA-F) exam, the advisor pattern is exactly the kind of architectural decision-making that appears in scenario questions. Knowing when to use Opus vs. Sonnet vs. Haiku—and how to structure multi-model agent systems—is tested directly.

    The advisor tool also surfaces important concepts around:

    • Agentic system design — how to decompose tasks across model tiers
    • Cost vs. capability tradeoffs — a core CCA topic
    • Tool use and orchestration — the advisor tool is itself a tool in the API sense
    • Beta feature adoption — understanding Anthropic's release cadence and beta headers

    Key Takeaways

    • The Claude advisor tool pairs a cheap executor (Sonnet/Haiku) with Opus as an on-demand consultant, cutting cost without sacrificing accuracy at critical decision points
    • Enable it with the anthropic-beta: advisor-tool-2026-03-01 header; add the advisor_20260301 tool to your tools array; no waitlist needed
    • Haiku with Opus advisor jumped from 19.7% → 41.2% on BrowseComp—a 21.5 point gain from strategic guidance alone
    • Use max_uses to cap advisor consultations and keep costs predictable
    • Best for long-running agents with a mix of routine operations and strategic decision points

    Next Steps

    Ready to build with the advisor tool? Start with Anthropic's official advisor tool documentation for the full API reference.

    Preparing for the Claude Certified Architect exam? The CCA-F covers multi-model system design, agentic patterns, and cost optimization, all of which the advisor tool exercises in practice. Check out our CCA practice test bank with 200+ questions built around real exam scenarios.

    Want to go deeper on Claude API architecture? Our Claude API beginner's guide covers the full API surface, including beta features, streaming, and tool use from scratch.
    Sources: Anthropic advisor tool docs · The Advisor Strategy — Anthropic blog · Builder.io: Claude Advisor API · MindwiredAI: Cut Claude API costs by 85%
