Complete pricing breakdown for all 13 Anthropic API models. Compare input and output costs per million tokens for Claude Opus 4, Claude Sonnet 4, Claude Haiku 3.5, and every other model. Includes a cost calculator and side-by-side comparison with OpenAI and Google.
| Model | Input $/1M | Output $/1M |
|---|---|---|
| Claude 3 Haiku | $0.250 | $1.25 |
| Claude 3.5 Haiku | $0.800 | $4.00 |
| Claude Haiku 4.5 | $1.00 | $5.00 |
| Claude Sonnet 4.6 | $3.00 | $15.00 |
| Claude Sonnet 4.5 | $3.00 | $15.00 |
| Claude Sonnet 4 | $3.00 | $15.00 |
| Claude 3.7 Sonnet | $3.00 | $15.00 |
| Claude 3.7 Sonnet (thinking) | $3.00 | $15.00 |
| Claude Opus 4.6 | $5.00 | $25.00 |
| Claude Opus 4.5 | $5.00 | $25.00 |
| Claude 3.5 Sonnet | $6.00 | $30.00 |
| Claude Opus 4.1 | $15.00 | $75.00 |
| Claude Opus 4 | $15.00 | $75.00 |
| Model | Input $/1M | Output $/1M |
|---|---|---|
| Claude 3 Haiku | $0.250 | $1.25 |
| Claude 3.5 Haiku | $0.800 | $4.00 |
| Model | Input $/1M | Output $/1M |
|---|---|---|
| Claude Haiku 4.5 | $1.00 | $5.00 |
| Model | Input $/1M | Output $/1M |
|---|---|---|
| Claude Sonnet 4.6 | $3.00 | $15.00 |
| Claude Sonnet 4.5 | $3.00 | $15.00 |
| Claude Sonnet 4 | $3.00 | $15.00 |
| Claude 3.7 Sonnet | $3.00 | $15.00 |
| Claude 3.7 Sonnet (thinking) | $3.00 | $15.00 |
| Claude Opus 4.6 | $5.00 | $25.00 |
| Claude Opus 4.5 | $5.00 | $25.00 |
| Claude 3.5 Sonnet | $6.00 | $30.00 |
| Claude Opus 4.1 | $15.00 | $75.00 |
| Claude Opus 4 | $15.00 | $75.00 |
See how Anthropic API pricing stacks up against OpenAI (GPT) and Google (Gemini) models. All prices in USD per million tokens.
| Model | In | Out |
|---|---|---|
| Claude 3 Haiku | $0.250 | $1.25 |
| Claude 3.5 Haiku | $0.800 | $4.00 |
| Claude Haiku 4.5 | $1.00 | $5.00 |
| Claude Sonnet 4.6 | $3.00 | $15.00 |
| Claude Sonnet 4.5 | $3.00 | $15.00 |
| Claude Sonnet 4 | $3.00 | $15.00 |
| Claude 3.7 Sonnet | $3.00 | $15.00 |
| Claude 3.7 Sonnet (thinking) | $3.00 | $15.00 |
| Model | In | Out |
|---|---|---|
| gpt-oss-120b (free) | Free | Free |
| gpt-oss-20b (free) | Free | Free |
| Sora | Free | Free |
| gpt-oss-20b | $0.030 | $0.140 |
| gpt-oss-120b | $0.039 | $0.190 |
| gpt-oss-safeguard-20b | $0.075 | $0.300 |
| GPT-5 Nano | $0.050 | $0.400 |
| GPT-4.1 Nano | $0.100 | $0.400 |
| Model | In | Out |
|---|---|---|
| Gemma 3n 2B (free) | Free | Free |
| Gemma 3n 4B (free) | Free | Free |
| Gemma 3 4B (free) | Free | Free |
| Gemma 3 12B (free) | Free | Free |
| Gemma 3 27B (free) | Free | Free |
| Gemma 3n 4B | $0.020 | $0.040 |
| Gemma 3 4B | $0.040 | $0.080 |
| Gemma 2 9B | $0.030 | $0.090 |
Estimated daily and monthly costs for common usage patterns. Assumes an average of ~1,000 input tokens and ~500 output tokens per request.
| Model | $/1M In | $/1M Out |
|---|---|---|
| Claude 3 Haiku | $0.250 | $1.25 |
| Claude Sonnet 4.6 | $3.00 | $15.00 |
| Claude 3.7 Sonnet | $3.00 | $15.00 |
| Claude Opus 4.5 | $5.00 | $25.00 |
| Claude Opus 4 | $15.00 | $75.00 |
Note: Actual costs vary with prompt length, response length, prompt caching, and batch processing. Anthropic offers prompt caching with up to 90% input cost reduction for repeated prefixes. Try the interactive calculator for custom estimates.
Anthropic charges per token, not per request. A token is roughly 3/4 of a word. The sentence "Hello, how are you?" is about 6 tokens. Prices are quoted per million tokens. Input tokens (your prompts) are cheaper than output tokens (the model's response) because output generation requires more computation.
Claude Opus is Anthropic's most powerful model for complex reasoning and coding. Claude Sonnet offers the best balance of performance and cost for most production use cases. Claude Haiku is the fastest and most affordable, ideal for high-volume workloads like classification, extraction, and simple Q&A.
Claude's extended thinking feature allows models to reason step-by-step before responding. This uses additional output tokens for the thinking process, which increases cost per request. However, it significantly improves accuracy on complex math, coding, and analysis tasks, often reducing the number of retries needed.
Use prompt caching to reduce input costs by up to 90% for repeated system prompts and context. Start with Claude Haiku for simple tasks and only upgrade to Sonnet or Opus when the task demands it. Use shorter prompts and set appropriate max_tokens limits to control output costs.
Browse pricing for coding, image generation, and more.
Compare with GPT-4o, o3, and all OpenAI model costs.
Compare with Gemini 2.5 Pro, Flash, and all Google models.
Find the most affordable models across all providers.
Claude Opus 4 pricing is $5.00/1M input tokens and $25.00/1M output tokens. Opus is Anthropic's most capable model, designed for complex reasoning, coding, and analysis tasks that require the highest level of intelligence.
Anthropic offers 0 free models via their API. New accounts receive a small amount of free credits to get started. After that, usage is billed per token. Claude Haiku is the most affordable option for high-volume production workloads, while Claude Sonnet offers the best balance of cost and capability.
The cheapest paid Anthropic model is Claude 3 Haiku at $0.250/1M input and $1.25/1M output tokens. It offers 200K context window and supports vision, function calling, and streaming.
Anthropic and OpenAI have similar pricing structures. Anthropic's average output price is $24.25/1M tokens across 13 paid models. OpenAI offers 62 models with varying price points. Claude Sonnet 4 competes directly with GPT-4o on both price and capability. For the most accurate comparison, see our side-by-side pricing table above.
Anthropic charges per token for API usage. Tokens are the basic units of text -- roughly 3/4 of a word. Pricing is split into input tokens (your prompts and context) and output tokens (the model's response). Output tokens are typically more expensive because they require more computation. Prices are quoted per million tokens.
Yes. Anthropic's prompt caching feature reduces input costs by up to 90% for repeated prefixes. When you send the same system prompt or context across multiple requests, only the first request pays full price. Subsequent requests use the cached version at a significant discount, making it ideal for chatbots, RAG systems, and multi-turn conversations.