One of the first questions you face when using Claude is: "Should I use Opus, Sonnet, or Haiku?"

The three models differ by up to 5x in API pricing, with clear performance differences. But the most expensive model isn't always the best choice. Haiku may be more practical for some tasks, while others genuinely require Opus.

This article compares the latest pricing, performance, and speed as of April 2026, with per-task cost estimates to help you make the right choice.

1. The Three Models at a Glance

Claude's three models: Opus (top performance), Sonnet (balanced), Haiku (fast and affordable)
| Model | Position | Released | In a nutshell |
|---|---|---|---|
| Opus 4.6 | Flagship | Feb 2026 | Most intelligent. For agents and complex coding |
| Sonnet 4.6 | Balanced | Feb 2026 | Best speed-intelligence balance. Ideal for daily use |
| Haiku 4.5 | Fast & affordable | Oct 2025 | Fastest. For high-volume and real-time tasks |

The names reflect literary length. An opus (a major work) represents the deepest thinking, a sonnet (a 14-line poem) offers balanced depth, and a haiku (a 3-line poem) delivers quick, concise responses.

2. API Pricing Comparison

Claude Opus, Sonnet, and Haiku API pricing comparison: input, output, batch, and cache prices

Standard Pricing (per Million Tokens)

| Model | Input | Output | Batch Input | Batch Output | Cache Hit |
|---|---|---|---|---|---|
| Opus 4.6 | $5 | $25 | $2.50 | $12.50 | $0.50 |
| Sonnet 4.6 | $3 | $15 | $1.50 | $7.50 | $0.30 |
| Haiku 4.5 | $1 | $5 | $0.50 | $2.50 | $0.10 |

The gap between the most expensive rate (Opus output at $25/MTok) and the cheapest (Haiku output at $5/MTok) is 5x. That said, Opus 4.6 output costs one-third of what its predecessor charged (Opus 4.1 was $75/MTok).

Cost Reduction Tips

The Batch API cuts costs in half, and cache hits reduce input costs to 1/10th. Combining both can achieve up to 95% cost savings. If you're doing high-volume processing, explore these options first.
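As a rough illustration, the discounts can be modeled in a few lines of Python using the Opus 4.6 rates from the table above. Note one assumption: whether the batch discount also applies to cache-hit tokens isn't stated here, so this sketch bills cached input at the flat cache-hit rate.

```python
# Rough cost model using the Opus 4.6 rates from the pricing table above.
STANDARD_INPUT = 5.00    # $/MTok
STANDARD_OUTPUT = 25.00  # $/MTok
BATCH_DISCOUNT = 0.5     # Batch API halves input and output rates
CACHE_HIT_INPUT = 0.50   # cache hits cost 1/10th of standard input

def cost(input_mtok: float, output_mtok: float,
         cached_fraction: float = 0.0, batch: bool = False) -> float:
    """Estimated cost in dollars for a workload measured in millions of tokens.

    Assumption: cached input is billed at the flat cache-hit rate, with the
    batch discount applied only to uncached input and to output.
    """
    in_rate, out_rate = STANDARD_INPUT, STANDARD_OUTPUT
    if batch:
        in_rate *= BATCH_DISCOUNT
        out_rate *= BATCH_DISCOUNT
    input_cost = (input_mtok * (1 - cached_fraction) * in_rate
                  + input_mtok * cached_fraction * CACHE_HIT_INPUT)
    return input_cost + output_mtok * out_rate

baseline = cost(10, 1)                                    # no savings: $75.00
optimized = cost(10, 1, cached_fraction=0.9, batch=True)  # $19.50
print(f"baseline ${baseline:.2f}, optimized ${optimized:.2f}")
```

Even this conservative model cuts the bill by about three-quarters; the headline 95% figure applies to heavily cached, batch-friendly workloads.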

What Is a "Token"?

API pricing is based on "tokens." In English, roughly 1 word ≈ 1.3 tokens. One million tokens is approximately 750,000 words — about 10 average-length novels.
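For quick budgeting, the ~1.3 tokens-per-word ratio above can be turned into a back-of-the-envelope estimator. This is a heuristic only, not a real tokenizer; exact counts require the provider's tokenizer.

```python
def estimate_tokens(text: str, tokens_per_word: float = 1.3) -> int:
    """Rough English token estimate: word count times ~1.3.

    Heuristic only; actual tokenization varies with vocabulary and formatting.
    """
    return round(len(text.split()) * tokens_per_word)

sample = "one two three four five six seven eight nine ten"
print(estimate_tokens(sample))  # 10 words -> ~13 tokens
```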

3. Subscription Plans

Monthly subscriptions offer a fundamentally different pricing structure compared to API pay-as-you-go.

| Plan | Price | Available Models | Default |
|---|---|---|---|
| Free | $0 | Sonnet 4.5 only | Sonnet 4.5 |
| Pro | $20/mo | All models | Sonnet 4.6 |
| Max 5x | $100/mo | All models | Opus 4.6 |
| Max 20x | $200/mo | All models | Opus 4.6 |

Subscriptions aren't "unlimited" but have usage caps. Still, they're 15–30x cheaper than API pricing. One user reported consuming 10 billion tokens over 8 months — at API rates that would have cost $15,000+, but their Max subscription cost roughly $800.

Switching Models in Claude Code

In Claude Code, you can switch models at launch with `claude --model opus` or `claude --model sonnet`, or mid-session with `/model sonnet`. Pro defaults to Sonnet, Max defaults to Opus. For more details on Claude Code itself, see Claude Chat vs. Cowork vs. Code.

4. Performance Benchmarks

| Benchmark | What It Measures | Opus 4.6 | Sonnet 4.6 | Gap |
|---|---|---|---|---|
| SWE-bench Verified | Coding ability | 80.8% | 79.6% | Only 1.2pts |
| GPQA Diamond | Scientific reasoning | 91.3% | 74.1% | 17.2pts |
| OSWorld-Verified | GUI automation | 72.7% | 72.5% | Nearly equal |
| Math | Mathematical problems | 89% | | |

The standout finding: the coding performance gap is only 1.2 points. Sonnet 4.6 is the first Sonnet in Claude history to match the previous generation's Opus in coding benchmarks.

However, scientific reasoning (GPQA Diamond) shows a 17.2-point gap, making Opus clearly superior for academic analysis and complex logical reasoning.

Official benchmarks for Haiku 4.5 are limited, but Anthropic positions it as having "near-frontier intelligence." For straightforward tasks, it's expected to approach Sonnet-level accuracy.

5. Speed and Context Windows

| Model | Speed (approx.) | Context Window | Max Output |
|---|---|---|---|
| Opus 4.6 | ~20–30 tok/sec | 1M tokens | 128K tokens |
| Sonnet 4.6 | ~40–60 tok/sec | 1M tokens | 64K tokens |
| Haiku 4.5 | 2–5x faster than Sonnet | 200K tokens | 64K tokens |

Haiku's greatest strength is speed. It has the shortest time-to-first-token (TTFT), making it ideal for real-time chatbots and autocomplete features.

Opus offers a 1 million token context window (roughly 10–20 novels) for tasks like processing entire codebases at once. Its 128K token max output is double that of Sonnet/Haiku, suited for generating long documents in one pass.

6. Cost Estimates by Use Case

Here's what typical tasks cost at standard API rates (no caching or batching).

Scenario 1: Generate a 2,000-word article

Input: ~1,000 tokens, Output: ~2,700 tokens

| Model | Input Cost | Output Cost | Total |
|---|---|---|---|
| Opus 4.6 | $0.005 | $0.068 | ~$0.07 |
| Sonnet 4.6 | $0.003 | $0.041 | ~$0.04 |
| Haiku 4.5 | $0.001 | $0.014 | ~$0.015 |

Scenario 2: Read a code file and refactor it

Input: ~10,000 tokens (code + instructions), Output: ~5,000 tokens

| Model | Input Cost | Output Cost | Total |
|---|---|---|---|
| Opus 4.6 | $0.05 | $0.125 | ~$0.18 |
| Sonnet 4.6 | $0.03 | $0.075 | ~$0.11 |
| Haiku 4.5 | $0.01 | $0.025 | ~$0.04 |

Scenario 3: Chatbot handling 1,000 queries/day

200 input tokens + 300 output tokens per query × 1,000

| Model | Daily Cost | Monthly (30 days) |
|---|---|---|
| Opus 4.6 | $8.50 | $255 |
| Sonnet 4.6 | $5.10 | $153 |
| Haiku 4.5 | $1.70 | $51 |
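All three scenarios are simple arithmetic on the standard rates from section 2. A small helper (with those rates hard-coded) makes it easy to re-run the numbers with your own token counts:

```python
# Standard API rates from section 2, in dollars per million tokens.
PRICES = {
    "opus-4.6":   {"input": 5.0, "output": 25.0},
    "sonnet-4.6": {"input": 3.0, "output": 15.0},
    "haiku-4.5":  {"input": 1.0, "output": 5.0},
}

def task_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars for one task at standard (non-batch, non-cached) rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Scenario 3: 1,000 queries/day at 200 input + 300 output tokens each
daily = task_cost("opus-4.6", 200 * 1000, 300 * 1000)
print(f"Opus daily: ${daily:.2f}")  # $8.50, matching the table above
```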

For high-volume scenarios like chatbots, the monthly gap between Haiku and Opus exceeds $200. A practical approach is to use Haiku as the default and route only complex queries to Sonnet or Opus.
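That routing idea can be sketched as a simple dispatcher. The keyword list and length cutoff below are illustrative placeholders, not tuned values; production systems often use a cheap classifier (or Haiku itself) to make this decision.

```python
def pick_model(query: str) -> str:
    """Route a query to a model tier using crude heuristics.

    The marker words and 100-word cutoff are illustrative assumptions only.
    """
    complex_markers = ("prove", "refactor", "architecture", "analyze", "derive")
    text = query.lower()
    if any(marker in text for marker in complex_markers):
        return "opus-4.6"      # hard reasoning or large coding tasks
    if len(text.split()) > 100:
        return "sonnet-4.6"    # long but routine requests
    return "haiku-4.5"         # default: fast and cheap

print(pick_model("What are your store hours?"))          # haiku-4.5
print(pick_model("Refactor this module into services"))  # opus-4.6
```

With Haiku as the default, the cheap path handles the bulk of traffic while the expensive models see only the queries that need them.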

7. Which Model Should You Choose?

Model selection flowchart: choosing Opus, Sonnet, or Haiku based on task complexity and volume
| Use Case | Recommended | Why |
|---|---|---|
| Daily coding & writing | Sonnet 4.6 | 98% of Opus coding quality, 40% cheaper, 2x faster |
| Large-scale refactoring | Opus 4.6 | 1M context window and 128K output shine here |
| Academic analysis | Opus 4.6 | 17-point gap in GPQA. Deep reasoning can't be substituted |
| Chatbots & support | Haiku 4.5 | Fastest + cheapest. Perfect for standard responses |
| Batch processing | Haiku 4.5 | 1/5 the cost, handles volume efficiently |
| Claude Code development | Sonnet 4.6 | Pro plan is sufficient. Switch to Opus only for complex architecture |

Practical Advice

When in doubt, start with Sonnet. It handles most tasks well. Only upgrade to Opus when Sonnet's output quality falls short, and downgrade to Haiku for simple, repetitive tasks. This tiered approach gives you the best cost-performance ratio.

FAQ

How big is the coding performance gap between Opus and Sonnet?

On SWE-bench Verified (a coding benchmark), Opus 4.6 scores 80.8% and Sonnet 4.6 scores 79.6% — a gap of only 1.2 points. For everyday coding, the difference is barely noticeable. Given the cost difference ($25 vs $15/MTok for output), Sonnet offers better value. However, Opus has a clear edge for large-scale architecture design and complex reasoning tasks.

Is a subscription or API pay-as-you-go cheaper?

For regular use, subscriptions are dramatically cheaper — roughly 15–30x more cost-effective than API pricing. The usage included in the Pro plan ($20/month) would cost $180+ per month at API rates. API pricing only makes sense for very infrequent use or specific batch processing scenarios. For a comparison with ChatGPT pricing, see Claude vs ChatGPT Pricing Comparison.

How "smart" is Haiku 4.5?

Anthropic describes it as having "near-frontier intelligence." While official benchmarks are limited, it's expected to approach Sonnet-level accuracy for straightforward tasks like content classification, summarization, and Q&A. For complex reasoning or long code generation, the gap with Sonnet/Opus becomes apparent. At 1/5 the cost, it excels where "good enough quality at massive scale" is the priority.

Is Opus 4.6 cheaper than previous Opus models?

Yes, significantly. Opus 4.1 charged $75/MTok for output, while Opus 4.6 charges $25/MTok — a 3x price reduction with improved performance. The context window also expanded from 200K to 1 million tokens (5x increase), making the value proposition substantially better.