One of the first questions you face when using Claude is: "Should I use Opus, Sonnet, or Haiku?"
The three models differ by up to 5x in API pricing, and their capabilities differ accordingly. But the most expensive model isn't always the best choice: Haiku is often the more practical pick for simple tasks, while other tasks genuinely require Opus.
This article compares the latest pricing, performance, and speed as of April 2026, with per-task cost estimates to help you make the right choice.
1. The Three Models at a Glance
| Model | Position | Released | In a nutshell |
|---|---|---|---|
| Opus 4.6 | Flagship | Feb 2026 | Most intelligent. For agents and complex coding |
| Sonnet 4.6 | Balanced | Feb 2026 | Best speed-intelligence balance. Ideal for daily use |
| Haiku 4.5 | Fast & Affordable | Oct 2025 | Fastest. For high-volume and real-time tasks |
The names reflect literary length. An opus (a major work) represents the deepest thinking, a sonnet (a 14-line poem) offers balanced depth, and a haiku (a 3-line poem) delivers quick, concise responses.
2. API Pricing Comparison
Standard Pricing (per Million Tokens)
| Model | Input | Output | Batch Input | Batch Output | Cache Hit |
|---|---|---|---|---|---|
| Opus 4.6 | $5 | $25 | $2.50 | $12.50 | $0.50 |
| Sonnet 4.6 | $3 | $15 | $1.50 | $7.50 | $0.30 |
| Haiku 4.5 | $1 | $5 | $0.50 | $2.50 | $0.10 |
The gap between the most expensive (Opus output at $25/MTok) and cheapest (Haiku output at $5/MTok) is 5x. However, Opus 4.6 is actually 3x cheaper than its predecessor (Opus 4.1 was $75/MTok).
Cost Reduction Tips
The Batch API cuts costs in half, and cache hits reduce input costs to 1/10th. Combining both can achieve up to 95% cost savings. If you're doing high-volume processing, explore these options first.
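The arithmetic behind that 95% figure can be sketched in a few lines. This assumes the two discounts stack multiplicatively (half price from batching, then 1/10th for cache hits), which matches the figures above but is illustrative, not an official calculator:

```python
# Effective Opus 4.6 input price per million tokens (MTok),
# combining the Batch API and prompt-cache discounts described above.
STANDARD_INPUT = 5.00        # $/MTok, Opus 4.6 standard input
BATCH_FACTOR = 0.50          # Batch API halves the price
CACHE_HIT_FACTOR = 0.10      # cache hits cost 1/10th of standard input

batch_price = STANDARD_INPUT * BATCH_FACTOR            # $2.50/MTok
cache_hit_price = STANDARD_INPUT * CACHE_HIT_FACTOR    # $0.50/MTok

# Combining both: a cached token read inside a batch job
combined = STANDARD_INPUT * BATCH_FACTOR * CACHE_HIT_FACTOR  # $0.25/MTok
savings = 1 - combined / STANDARD_INPUT                      # 0.95

print(f"Combined price: ${combined:.2f}/MTok ({savings:.0%} savings)")
```

The same factors apply to the other models' input rates, since the discounts are proportional rather than flat.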
What Is a "Token"?
API pricing is based on "tokens." In English, roughly 1 word ≈ 1.3 tokens. One million tokens is approximately 750,000 words — about 10 average-length novels.
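The rule of thumb above is easy to turn into a back-of-the-envelope converter. The 1.3 tokens-per-word ratio is an approximation (actual tokenization varies with vocabulary and formatting):

```python
# Rough word/token conversion using the 1 word ≈ 1.3 tokens rule of thumb.
TOKENS_PER_WORD = 1.3

def words_to_tokens(words: int) -> int:
    """Estimate the token count for a given word count."""
    return round(words * TOKENS_PER_WORD)

def tokens_to_words(tokens: int) -> int:
    """Estimate the word count a token budget covers."""
    return round(tokens / TOKENS_PER_WORD)

print(tokens_to_words(1_000_000))  # → 769231, i.e. roughly 750,000 words
```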
3. Subscription Plans
Monthly subscriptions offer a fundamentally different pricing structure compared to API pay-as-you-go.
| Plan | Price | Available Models | Default |
|---|---|---|---|
| Free | $0 | Sonnet 4.5 only | Sonnet 4.5 |
| Pro | $20/mo | All models | Sonnet 4.6 |
| Max 5x | $100/mo | All models | Opus 4.6 |
| Max 20x | $200/mo | All models | Opus 4.6 |
Subscriptions aren't "unlimited" but have usage caps. Still, they're 15–30x cheaper than API pricing. One user reported consuming 10 billion tokens over 8 months — at API rates that would have cost $15,000+, but their Max subscription cost roughly $800.
Switching Models in Claude Code
In Claude Code, you can switch models at launch with claude --model opus or claude --model sonnet, or mid-session with /model sonnet. Pro defaults to Sonnet, Max defaults to Opus. For more details on Claude Code itself, see Claude Chat vs. Cowork vs. Code.
4. Performance Benchmarks
| Benchmark | What It Measures | Opus 4.6 | Sonnet 4.6 | Gap |
|---|---|---|---|---|
| SWE-bench Verified | Coding ability | 80.8% | 79.6% | Only 1.2pts |
| GPQA Diamond | Scientific reasoning | 91.3% | 74.1% | 17.2pts |
| OSWorld-Verified | GUI automation | 72.7% | 72.5% | Nearly equal |
| Math | Mathematical problems | — | 89% | — |
The standout finding: the coding performance gap is only 1.2 points. Sonnet 4.6 is the first Sonnet in Claude history to match the previous generation's Opus in coding benchmarks.
However, scientific reasoning (GPQA Diamond) shows a 17.2-point gap, making Opus clearly superior for academic analysis and complex logical reasoning.
Official benchmarks for Haiku 4.5 are limited, but Anthropic positions it as having "near-frontier intelligence." For straightforward tasks, it's expected to approach Sonnet-level accuracy.
5. Speed and Context Windows
| Model | Speed (approx.) | Context Window | Max Output |
|---|---|---|---|
| Opus 4.6 | ~20–30 tok/sec | 1M tokens | 128K tokens |
| Sonnet 4.6 | ~40–60 tok/sec | 1M tokens | 64K tokens |
| Haiku 4.5 | 2–5x faster than Sonnet | 200K tokens | 64K tokens |
Haiku's greatest strength is speed. It has the shortest time-to-first-token (TTFT), making it ideal for real-time chatbots and autocomplete features.
Opus offers a 1 million token context window (roughly 750,000 words, or about 10 average-length novels) for tasks like processing an entire codebase at once. Its 128K token max output is double that of Sonnet and Haiku, suited to generating long documents in a single pass.
6. Cost Estimates by Use Case
Here's what typical tasks cost at standard API rates (no caching or batching).
Scenario 1: Generate a 2,000-word article
Input: ~1,000 tokens, Output: ~2,700 tokens
| Model | Input Cost | Output Cost | Total |
|---|---|---|---|
| Opus 4.6 | $0.005 | $0.068 | ~$0.07 |
| Sonnet 4.6 | $0.003 | $0.041 | ~$0.04 |
| Haiku 4.5 | $0.001 | $0.014 | ~$0.02 |
Scenario 2: Read a code file and refactor it
Input: ~10,000 tokens (code + instructions), Output: ~5,000 tokens
| Model | Input Cost | Output Cost | Total |
|---|---|---|---|
| Opus 4.6 | $0.05 | $0.125 | ~$0.18 |
| Sonnet 4.6 | $0.03 | $0.075 | ~$0.11 |
| Haiku 4.5 | $0.01 | $0.025 | ~$0.04 |
Scenario 3: Chatbot handling 1,000 queries/day
200 input tokens + 300 output tokens per query × 1,000
| Model | Daily Cost | Monthly (30 days) |
|---|---|---|
| Opus 4.6 | $8.50 | $255 |
| Sonnet 4.6 | $5.10 | $153 |
| Haiku 4.5 | $1.70 | $51 |
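The figures in these tables follow directly from the standard rates in section 2. A minimal sketch that reproduces the chatbot scenario (the prices and token counts come from this article; the function itself is illustrative):

```python
# Recompute the chatbot scenario from the standard pricing table.
# Values are (input $/MTok, output $/MTok).
PRICES = {
    "Opus 4.6":   (5.0, 25.0),
    "Sonnet 4.6": (3.0, 15.0),
    "Haiku 4.5":  (1.0, 5.0),
}

def daily_cost(model: str, queries: int, in_tok: int, out_tok: int) -> float:
    """Daily cost in dollars for `queries` requests of the given size."""
    price_in, price_out = PRICES[model]
    return (queries * in_tok * price_in + queries * out_tok * price_out) / 1_000_000

for model in PRICES:
    d = daily_cost(model, queries=1000, in_tok=200, out_tok=300)
    print(f"{model}: ${d:.2f}/day, ${d * 30:.0f}/month")
```

Running this reproduces the table above: $8.50, $5.10, and $1.70 per day respectively. The same function covers the other scenarios by swapping in their token counts.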
For high-volume scenarios like chatbots, the monthly gap between Haiku and Opus exceeds $200. A practical approach is to use Haiku as the default and route only complex queries to Sonnet or Opus.
7. Which Model Should You Choose?
| Use Case | Recommended | Why |
|---|---|---|
| Daily coding & writing | Sonnet 4.6 | 98% of Opus coding quality, 40% cheaper, 2x faster |
| Large-scale refactoring | Opus 4.6 | 1M context window and 128K output shine here |
| Academic analysis | Opus 4.6 | 17-point gap in GPQA. Deep reasoning can't be substituted |
| Chatbots & support | Haiku 4.5 | Fastest + cheapest. Perfect for standard responses |
| Batch processing | Haiku 4.5 | 1/5 the cost, handles volume efficiently |
| Claude Code development | Sonnet 4.6 | Pro plan is sufficient. Switch to Opus only for complex architecture |
Practical Advice
When in doubt, start with Sonnet. It handles most tasks well. Only upgrade to Opus when Sonnet's output quality falls short, and downgrade to Haiku for simple, repetitive tasks. This tiered approach gives you the best cost-performance ratio.
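The tiered approach can be automated with a simple router. The heuristic and keyword markers below are entirely hypothetical, made up for illustration; real routing would use your own complexity signals (task type, input length, past failure rates):

```python
# Hypothetical model router: pick a tier per request based on a crude
# complexity heuristic. Markers and thresholds are illustrative only.
def pick_model(prompt: str) -> str:
    hard_markers = ("architecture", "prove", "refactor the entire", "analyze")
    if any(marker in prompt.lower() for marker in hard_markers):
        return "opus"      # deep reasoning or large-scale work
    if len(prompt.split()) > 100:
        return "sonnet"    # substantial but routine task
    return "haiku"         # short, routine query: fastest and cheapest

print(pick_model("What are your opening hours?"))                # haiku
print(pick_model("Prove this invariant holds for the parser."))  # opus
```

With the scenario-3 numbers above, routing even 80% of queries to Haiku instead of Opus would cut the monthly bill by well over half.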
FAQ
How big is the coding performance gap between Opus and Sonnet?
On SWE-bench Verified (a coding benchmark), Opus 4.6 scores 80.8% and Sonnet 4.6 scores 79.6% — a gap of only 1.2 points. For everyday coding, the difference is barely noticeable. Given the cost difference ($25 vs $15/MTok for output), Sonnet offers better value. However, Opus has a clear edge for large-scale architecture design and complex reasoning tasks.
Is a subscription or API pay-as-you-go cheaper?
For regular use, subscriptions are dramatically cheaper, roughly 15–30x more cost-effective than API pay-as-you-go. Usage that fits within the Pro plan ($20/month) would cost $180 or more per month at API rates. API pricing only makes sense for very infrequent use or specific batch-processing scenarios. For a comparison with ChatGPT pricing, see Claude vs ChatGPT Pricing Comparison.
How "smart" is Haiku 4.5?
Anthropic describes it as having "near-frontier intelligence." While official benchmarks are limited, it's expected to approach Sonnet-level accuracy for straightforward tasks like content classification, summarization, and Q&A. For complex reasoning or long code generation, the gap with Sonnet/Opus becomes apparent. At 1/5 the cost, it excels where "good enough quality at massive scale" is the priority.
Is Opus 4.6 cheaper than previous Opus models?
Yes, significantly. Opus 4.1 charged $75/MTok for output, while Opus 4.6 charges $25/MTok — a 3x price reduction with improved performance. The context window also expanded from 200K to 1 million tokens (5x increase), making the value proposition substantially better.