There's a frustrating moment every AI coding assistant user knows. The good models are expensive, and the cheaper ones stall out on harder problems. Anthropic just tackled that dilemma head-on.
Claude Code's /advisor command lets the execution model (Sonnet) automatically consult a more capable model (Opus) whenever it hits a wall. Most tokens are handled by the cheaper model — the expensive one only steps in for the hard judgment calls — so you get better results without blowing your budget. On SWE-bench, Sonnet alone scored 72.1%; Sonnet + Opus Advisor hit 74.8%, while actually cutting per-task cost by 11.9%.
What Is It?
/advisor is a model pairing pattern that Anthropic officially released on the Claude platform on April 9, 2026. The traditional approach had the expensive model (Opus) issuing instructions from the top while the cheaper model (Sonnet/Haiku) handled execution — the classic subagent pattern. /advisor flips that completely.
Here's how it works. A fast, affordable model like Sonnet 4.6 or Haiku 4.5 leads the task from start to finish — reading files, writing code, running tests. All of that stays with the execution model. When it runs into a decision it can't confidently make on its own, it automatically reaches out to Opus 4.6 for advice.
Key Takeaway: Opus sees the full conversation context, then sends back a short strategic recommendation — roughly 400–700 tokens. It can't write code, modify files, or call external tools. Its only job is to think.
In Claude Code, just type /advisor, select Opus 4.6, and you're done. No extra configuration, no parameter tuning — one line and it's active. On the API side, you add a single advisor_20260301 type to your tools array.
What's most impressive is that advice accumulates across the session context. When the Advisor gets called again later, it references its earlier recommendations too. If the execution model gets a result that contradicts the advice, it explicitly flags it: "The Advisor said X, but the test came back Y — should I check in again?"
What Changes?
Here's how the old multi-model pattern stacks up against /advisor.
| Traditional Subagent Pattern | /advisor Pattern | |
|---|---|---|
| Structure | Expensive model directs from above; cheap model executes | Cheap model leads; expensive model consults only when needed |
| Cost structure | Most cost concentrated on the orchestrator model | Execution model rates dominate; Advisor uses only a small token count |
| Context sharing | Developer manages context passing manually | Handled automatically server-side |
| Orchestration | Requires frameworks like LangGraph or AutoGen | One line of API config |
| Advisor tool access | Worker can use tools | Advisor has no tool access — reasoning only |
| Best fit | Tasks where every turn is complex | Tasks that are mostly mechanical with occasional hard decisions |
What Do the Benchmarks Show?
Let's look at the numbers.
Haiku's jump is the real story here. Going from 19.7% to 41.2% on BrowseComp isn't a minor improvement — it means Advisor is genuinely closing a fundamental capability gap in complex web research tasks. Eve Legal, a legal document extraction service, reportedly combined Haiku + Opus Advisor to hit frontier-model quality at one-fifth the cost of running Opus alone.
Getting Started
- Start directly in Claude Code
Type/advisorin your terminal → select Opus 4.6 → done. If you're already using Sonnet 4.6 as your main model, give it a try right now. - Connect via the API
Add the beta headeranthropic-beta: advisor-tool-2026-03-01to your Messages API request, then include{"type": "advisor_20260301", "name": "advisor", "model": "claude-opus-4-6"}in your tools array. - Set up cost controls
Use themax_usesparameter to cap Advisor calls per request. For coding tasks, 2–3 calls is typically enough to cover the key decision points. - Pick your model pairing
Best value: Sonnet + Opus Advisor. Ultra-budget: Haiku + Opus Advisor (great for high-volume processing). If Opus is already your main model, combining it with a subagent workflow makes more sense. - Use prompting to guide call timing
Add explicit instructions to your system prompt — something like "check with the advisor before doing substantive work" or "double-check with the advisor before finishing." It makes Advisor call timing noticeably more precise.
Pro Tip: Advisor calls show up as a separate line item in your token ledger, so you can track whether Advisor is pulling its weight on a project-by-project basis. And if you use /compact to compress the conversation, prior Advisor recommendations are preserved — making it a useful knowledge layer that compounds over long sessions.
Deep Dive Resources
Heads Up: Advisor output doesn't stream, so there's a brief pause while sub-reasoning runs. There's also the risk of the main model being overconfident — if it doesn't think it needs help, it won't call the Advisor. Explicit system prompt instructions help compensate.
The real potential of Advisor isn't just as a standalone feature — it's infrastructure for a future model strategy. If a next-generation model (like Mythos) arrives that's far more expensive than Opus, running it as your main model would be prohibitive. But set it as an Advisor, and you can tap its intelligence only for critical decisions while keeping costs under control.
According to Anthropic's official documentation, setting Sonnet to medium effort with an Opus Advisor attached delivers intelligence comparable to default-effort Sonnet alone — but at lower cost. For maximum intelligence, the default effort + Advisor combo is the strongest option available.
Subagent vs. Advisor — which should you use? If Sonnet is your main model, Advisor is the right call. If Opus is your main model, subagents (with independent context) are the better fit. Here's the key reason: Advisor shares the full transcript, so if the main model has framed the problem wrong, the Advisor falls into the same trap. Subagents start with a clean slate, which means they can offer a genuinely fresh perspective.




