The AI coding agent ranking just changed. In April 2026, Claude Opus 4.7 hit 64.3% on SWE-bench Pro — beating both GPT-5.4 (57.7%) and Gemini 3.1 Pro (54.2%). And now it's live on Amazon Bedrock. This isn't just "another access channel" — something specifically changed when it arrived on Bedrock.

Quick Summary
#1 on SWE-bench Pro Adaptive Thinking introduced temperature param removed Bedrock enterprise infra Start in 3 lines of code

What's the 64.3%?

There's a benchmark called SWE-bench. It measures how well AI can resolve bugs and feature requests pulled from real GitHub open-source repos. SWE-bench Pro is the hardest version — it uses actual production issues from major open-source projects. It's the most realistic indicator of "how useful is this coding agent in the real world."

Opus 4.7 scored 64.3%. That's up from 53.4% in Opus 4.6 — a 10.9-point improvement. It leads GPT-5.4 (57.7%) by 6.6 points and Gemini 3.1 Pro (54.2%) by over 10 points. If you're building or using a coding agent, this gap is genuinely noticeable in practice.

64.3%
SWE-bench Pro (Opus 4.7)
87.6%
SWE-bench Verified
77.3%
MCP-Atlas tool use (best-in-class)

It's not just coding. On MCP-Atlas — which measures how well an AI handles external tools — Opus 4.7 hit 77.3%, ahead of GPT-5.4 (75.3%) and Gemini (73.9%). That's the key metric for building multi-agent workflows. The one regression: BrowseComp (web research) dropped to 79.3% from 83.7% in 4.6. The team focused on coding and tool use, and made a trade-off on web search.

Vision got a major upgrade too. Max image resolution jumped to 2,576 pixels on the long edge — more than 3x previous models. That matters for UI screenshot analysis, complex diagram parsing, and dense document processing. CharXiv visual reasoning jumped 13 points to 82.1% (from 69.1%).

What actually changed?

The biggest technical change in Opus 4.7 is Adaptive Thinking. Up through Opus 4.6, you had to manually set thinking.type: "enabled" and budget_tokens — telling the model "think for up to 1,000 tokens on this task" or "use 5,000 tokens here." Developer-tuned, every call. In 4.7, that's gone.

With 4.7, it's just thinking.type: "adaptive". The model judges task complexity itself and allocates reasoning tokens automatically. Simple questions get minimal compute; complex refactoring gets deep thinking. No more budget_tokens tuning — it's fully automatic.

Opus 4.6 Opus 4.7
Reasoning setup thinking.type: "enabled" + manual budget_tokens Just thinking.type: "adaptive"
temperature/top_p Adjustable Not supported — remove from requests
SWE-bench Pro 53.4% 64.3% (+10.9pts)
Image resolution Previous level Up to 2,576px on long edge (3x+)
Prompt cache TTL 5 minutes 5 min · 1 hour (your choice)
Visual reasoning (CharXiv) 69.1% 82.1% (+13pts)

Migration warning from 4.6

Plugging Opus 4.6 code directly into 4.7 will throw a 400 error. You need to change thinking.type to "adaptive" and completely remove temperature, top_p, and top_k parameters. budget_tokens is also gone — Adaptive Thinking handles this automatically.

Pricing is unchanged: $5/M input tokens, $25/M output tokens — same as Opus 4.6. One thing to note: a new tokenizer means the same content may generate 1.0–1.35x more tokens than before. Actual costs may increase slightly, heads up.

The Quick Start: Bedrock in 5 Steps

  1. Set up AWS account + Bedrock API key
    Generate a long-term API key in the Amazon Bedrock console. Set it as the AWS_BEARER_TOKEN_BEDROCK environment variable.
  2. Install the SDK
    For the Messages API: pip install -U "anthropic[bedrock]". For Converse/Invoke: pip install boto3. Pick one.
  3. Send your first request
    Model ID is anthropic.claude-opus-4-7, region defaults to us-east-1. For thinking, use only {"type": "adaptive"} — using enabled or budget_tokens throws a 400.
  4. Optimize costs with prompt caching
    Set cache checkpoints for repeated system prompts or documents (min 4,096 tokens). Choose 5-min or 1-hour TTL. Big savings on repeated calls.
  5. Reduce latency with Geo inference
    From Asia, use jp.anthropic.claude-opus-4-7 (Tokyo/Osaka routing) or global.anthropic.claude-opus-4-7 for automatic optimal region.

Bedrock's enterprise edge

Bedrock's next-generation inference engine prevents operator access to customer data. If you're already running VPC, IAM, and CloudWatch in AWS, you get enterprise-grade data isolation with no extra security setup.

If You Want to Dig Deeper

Introducing Claude Opus 4.7 — Anthropic The official launch post. Covers Adaptive Thinking design principles, safety evaluations, and cross-platform availability. anthropic.com

Claude Opus 4.7 in Amazon Bedrock — AWS Blog Official Bedrock launch post with Playground walkthrough, API code samples, and regional availability details. aws.amazon.com

Claude Opus 4.7 Benchmarks Explained — Vellum AI Deep-dive on MCP-Atlas, OSWorld, CharXiv, and side-by-side comparisons with GPT-5.4 and Gemini 3.1 Pro. vellum.ai

Amazon Bedrock Model Card — AWS Docs Adaptive Thinking migration guide, prompt caching setup, service tiers, and per-region routing specs in one place. docs.aws.amazon.com

Claude Opus 4.7 vs GPT-5.5 — DataCamp Side-by-side on coding, reasoning, and pricing. Includes areas where GPT-5.5 still leads (Terminal-Bench). datacamp.com