Compare context windows, output capacity, and cost efficiency across 328+ models. Data sourced live from OpenRouter.
Note: This page shows capacity specs and pricing from OpenRouter. Real-world latency and tokens-per-second vary by load, prompt length, and provider infrastructure. For speed benchmarks, see Artificial Analysis or the model provider's own documentation.
Largest Context: 2M (Grok 4.20 Multi-Agent Beta)
Largest Output: 1.0M (MiniMax-01)
Cheapest (non-free): $0.01 per 1M input tokens (LFM2-8B-A1B)
Best Output Ratio: 200% (Llama 3 8B Instruct)
| Model | Provider | Context Window | Max Output | Output Ratio | Input $/1M | Output $/1M | Efficiency (derived) | Capabilities |
|---|---|---|---|---|---|---|---|---|
| Grok 4.20 Multi-Agent Beta | xAI | 2M | — | — | $2.00 | $6.00 | 44 | |
| Grok 4.20 Beta | xAI | 2M | — | — | $2.00 | $6.00 | 44 | |
| Grok 4.1 Fast | xAI | 2M | 30K | 2% | $0.20 | $0.50 | 91 | |
| Grok 4 Fast | xAI | 2M | 30K | 2% | $0.20 | $0.50 | 91 | |
| GPT-5.4 Pro | OpenAI | 1.1M | 128K | 12% | $30.00 | $180.00 | 18 | |
| GPT-5.4 | OpenAI | 1.1M | 128K | 12% | $2.50 | $15.00 | 39 | |
| MiMo-V2-Pro | Xiaomi | 1.0M | 131K | 13% | $1.00 | $3.00 | 55 | |
| Gemini 3.1 Flash Lite Preview | Google | 1.0M | 66K | 6% | $0.25 | $1.50 | 83 | |
| Gemini 3.1 Pro Preview Custom Tools | Google | 1.0M | 66K | 6% | $2.00 | $12.00 | 42 | |
| Gemini 3.1 Pro Preview | Google | 1.0M | 66K | 6% | $2.00 | $12.00 | 42 | |
| Gemini 3 Flash Preview | Google | 1.0M | 66K | 6% | $0.50 | $3.00 | 69 | |
| Gemini 3 Pro Preview | Google | 1.0M | 66K | 6% | $2.00 | $12.00 | 42 | |
| Gemini 2.5 Flash Lite Preview 09-2025 | Google | 1.0M | 66K | 6% | $0.10 | $0.40 | 96 | |
| Gemini 2.5 Flash Lite | Google | 1.0M | 66K | 6% | $0.10 | $0.40 | 96 | |
| Gemini 2.5 Flash | Google | 1.0M | 66K | 6% | $0.30 | $2.50 | 79 | |
| Gemini 2.5 Pro | Google | 1.0M | 66K | 6% | $1.25 | $10.00 | 50 | |
| Gemini 2.5 Pro Preview 06-05 | Google | 1.0M | 66K | 6% | $1.25 | $10.00 | 50 | |
| Gemini 2.5 Pro Preview 05-06 | Google | 1.0M | 66K | 6% | $1.25 | $10.00 | 50 | |
| Llama 4 Maverick | Meta | 1.0M | 16K | 2% | $0.15 | $0.60 | 91 | |
| Gemini 2.0 Flash Lite | Google | 1.0M | 8K | 1% | $0.07 | $0.30 | 99 | |
| Gemini 2.0 Flash | Google | 1.0M | 8K | 1% | $0.10 | $0.40 | 96 | |
| GPT-4.1 | OpenAI | 1.0M | 33K | 3% | $2.00 | $8.00 | 42 | |
| GPT-4.1 Mini | OpenAI | 1.0M | 33K | 3% | $0.40 | $1.60 | 74 | |
| GPT-4.1 Nano | OpenAI | 1.0M | 33K | 3% | $0.10 | $0.40 | 96 | |
| Palmyra X5 | Writer | 1.0M | 8K | 1% | $0.60 | $6.00 | 65 | |
| MiniMax-01 | MiniMax | 1.0M | 1.0M | 100% | $0.20 | $1.10 | 86 | |
| Qwen3.5-Flash | Alibaba | 1M | 66K | 7% | $0.07 | $0.26 | 100 | |
| Claude Sonnet 4.6 | Anthropic | 1M | 128K | 13% | $3.00 | $15.00 | 36 | |
| Qwen3.5 Plus 2026-02-15 | Alibaba | 1M | 66K | 7% | $0.26 | $1.56 | 82 | |
| Claude Opus 4.6 | Anthropic | 1M | 128K | 13% | $5.00 | $25.00 | 30 | |
| Nova 2 Lite | Amazon | 1M | 66K | 7% | $0.30 | $2.50 | 79 | |
| Nova Premier 1.0 | Amazon | 1M | 32K | 3% | $2.50 | $12.50 | 39 | |
| Claude Sonnet 4.5 | Anthropic | 1M | 64K | 6% | $3.00 | $15.00 | 36 | |
| Qwen3 Coder Plus | Alibaba | 1M | 66K | 7% | $0.65 | $3.25 | 63 | |
| Qwen3 Coder Flash | Alibaba | 1M | 66K | 7% | $0.20 | $0.97 | 87 | |
| Qwen Plus 0728 (thinking) | Alibaba | 1M | 33K | 3% | $0.26 | $0.78 | 82 | |
| Qwen Plus 0728 | Alibaba | 1M | 33K | 3% | $0.26 | $0.78 | 82 | |
| MiniMax M1 | MiniMax | 1M | 40K | 4% | $0.40 | $2.20 | 73 | |
| Qwen-Plus | Alibaba | 1M | 33K | 3% | $0.26 | $0.78 | 82 | |
| GPT-5.4 Nano | OpenAI | 400K | 128K | 32% | $0.20 | $1.25 | 81 | |
| GPT-5.4 Mini | OpenAI | 400K | 128K | 32% | $0.75 | $4.50 | 56 | |
| GPT-5.3-Codex | OpenAI | 400K | 128K | 32% | $1.75 | $14.00 | 41 | |
| GPT-5.2-Codex | OpenAI | 400K | 128K | 32% | $1.75 | $14.00 | 41 | |
| GPT-5.2 Pro | OpenAI | 400K | 128K | 32% | $21.00 | $168.00 | 19 | |
| GPT-5.2 | OpenAI | 400K | 128K | 32% | $1.75 | $14.00 | 41 | |
| GPT-5.1-Codex-Max | OpenAI | 400K | 128K | 32% | $1.25 | $10.00 | 47 | |
| GPT-5.1 | OpenAI | 400K | 128K | 32% | $1.25 | $10.00 | 47 | |
| GPT-5.1-Codex | OpenAI | 400K | 128K | 32% | $1.25 | $10.00 | 47 | |
| GPT-5.1-Codex-Mini | OpenAI | 400K | 100K | 25% | $0.25 | $2.00 | 77 | |
| GPT-5 Pro | OpenAI | 400K | 128K | 32% | $15.00 | $120.00 | 20 | |
| GPT-5 Codex | OpenAI | 400K | 128K | 32% | $1.25 | $10.00 | 47 | |
| GPT-5 | OpenAI | 400K | 128K | 32% | $1.25 | $10.00 | 47 | |
| GPT-5 Mini | OpenAI | 400K | 128K | 32% | $0.25 | $2.00 | 77 | |
| GPT-5 Nano | OpenAI | 400K | 128K | 32% | $0.05 | $0.40 | 95 | |
| Llama 4 Scout | Meta | 328K | 16K | 5% | $0.08 | $0.30 | 90 | |
| Nova Lite 1.0 | Amazon | 300K | 5K | 2% | $0.06 | $0.24 | 92 | |
| Nova Pro 1.0 | Amazon | 300K | 5K | 2% | $0.80 | $3.20 | 54 | |
| MiMo-V2-Omni | Xiaomi | 262K | 66K | 25% | $0.40 | $2.00 | 66 | |
| Mistral Small 4 | Mistral AI | 262K | — | — | $0.15 | $0.60 | 82 | |
| Nemotron 3 Super (free) | NVIDIA | 262K | 262K | 100% | Free | Free | 99 | |
| Nemotron 3 Super | NVIDIA | 262K | — | — | $0.10 | $0.50 | 87 | |
| Seed-2.0-Lite | ByteDance | 262K | 131K | 50% | $0.25 | $2.00 | 75 | |
| Seed-2.0-Mini | ByteDance | 262K | 131K | 50% | $0.10 | $0.40 | 87 | |
| Qwen3.5-35B-A3B | Alibaba | 262K | 66K | 25% | $0.16 | $1.30 | 81 | |
| Qwen3.5-27B | Alibaba | 262K | 66K | 25% | $0.20 | $1.56 | 78 | |
| Qwen3.5-122B-A10B | Alibaba | 262K | 66K | 25% | $0.26 | $2.08 | 74 | |
| Qwen3.5 397B A17B | Alibaba | 262K | 66K | 25% | $0.39 | $2.34 | 67 | |
| Qwen3 Max Thinking | Alibaba | 262K | 33K | 13% | $0.78 | $3.90 | 54 | |
| Qwen3 Coder Next | Alibaba | 262K | 66K | 25% | $0.12 | $0.75 | 85 | |
| Kimi K2.5 | Moonshot AI | 262K | 66K | 25% | $0.45 | $2.20 | 64 | |
| Seed 1.6 Flash | ByteDance | 262K | 33K | 13% | $0.07 | $0.30 | 89 | |
| Seed 1.6 | ByteDance | 262K | 33K | 13% | $0.25 | $2.00 | 75 | |
| MiMo-V2-Flash | Xiaomi | 262K | 66K | 25% | $0.09 | $0.29 | 88 | |
| Nemotron 3 Nano 30B A3B | NVIDIA | 262K | — | — | $0.05 | $0.20 | 92 | |
| Devstral 2 2512 | Mistral AI | 262K | — | — | $0.40 | $2.00 | 66 | |
| Ministral 3 14B 2512 | Mistral AI | 262K | — | — | $0.20 | $0.20 | 78 | |
| Ministral 3 8B 2512 | Mistral AI | 262K | — | — | $0.15 | $0.15 | 82 | |
| Mistral Large 3 2512 | Mistral AI | 262K | — | — | $0.50 | $1.50 | 62 | |
| Qwen3 VL 235B A22B Instruct | Alibaba | 262K | — | — | $0.20 | $0.88 | 78 | |
| Qwen3 Max | Alibaba | 262K | 33K | 13% | $0.78 | $3.90 | 54 | |
| Qwen3 Next 80B A3B Instruct (free) | Alibaba | 262K | — | — | Free | Free | 99 | |
| Qwen3 Next 80B A3B Instruct | Alibaba | 262K | — | — | $0.09 | $1.10 | 88 | |
| Qwen3 30B A3B Instruct 2507 | Alibaba | 262K | 262K | 100% | $0.09 | $0.30 | 88 | |
| Qwen3 Coder 480B A35B | Alibaba | 262K | — | — | $0.22 | $1.00 | 77 | |
| Qwen3 235B A22B Instruct 2507 | Alibaba | 262K | — | — | $0.07 | $0.10 | 90 | |
| Qwen3 Coder 480B A35B (free) | Alibaba | 262K | 262K | 100% | Free | Free | 99 | |
| Qwen3.5-9B | Alibaba | 256K | — | — | $0.05 | $0.15 | 92 | |
| Step 3.5 Flash (free) | StepFun | 256K | 256K | 100% | Free | Free | 98 | |
| Step 3.5 Flash | StepFun | 256K | 256K | 100% | $0.10 | $0.30 | 86 | |
| Nemotron 3 Nano 30B A3B (free) | NVIDIA | 256K | — | — | Free | Free | 98 | |
| KAT-Coder-Pro V1 | Kuaishou | 256K | 128K | 50% | $0.21 | $0.83 | 77 | |
| Grok Code Fast 1 | xAI | 256K | 10K | 4% | $0.20 | $1.50 | 78 | |
| Jamba Large 1.7 | AI21 Labs | 256K | 4K | 2% | $2.00 | $8.00 | 38 | |
| Codestral 2508 | Mistral AI | 256K | — | — | $0.30 | $0.90 | 71 | |
| Grok 4 | xAI | 256K | — | — | $3.00 | $15.00 | 33 | |
| Command A | Cohere | 256K | 8K | 3% | $2.50 | $10.00 | 35 | |
| MiniMax M2.7 | MiniMax | 205K | 131K | 64% | $0.30 | $1.20 | 70 | |
| Claude Opus 4.5 | Anthropic | 200K | 64K | 32% | $5.00 | $25.00 | 27 | |
| Sonar Pro Search | Perplexity | 200K | 8K | 4% | $3.00 | $15.00 | 32 | |
| Claude Haiku 4.5 | Anthropic | 200K | 64K | 32% | $1.00 | $5.00 | 48 | |
| o3 Deep Research | OpenAI | 200K | 100K | 50% | $10.00 | $40.00 | 22 | |
| o4 Mini Deep Research | OpenAI | 200K | 100K | 50% | $2.00 | $8.00 | 37 | |
| Claude Opus 4.1 | Anthropic | 200K | 32K | 16% | $15.00 | $75.00 | 19 | |
| o3 Pro | OpenAI | 200K | 100K | 50% | $20.00 | $80.00 | 18 | |
| Claude Opus 4 | Anthropic | 200K | 32K | 16% | $15.00 | $75.00 | 19 | |
| Claude Sonnet 4 | Anthropic | 200K | 64K | 32% | $3.00 | $15.00 | 32 | |
| o4 Mini High | OpenAI | 200K | 100K | 50% | $1.10 | $4.40 | 47 | |
| o3 | OpenAI | 200K | 100K | 50% | $2.00 | $8.00 | 37 | |
| o4 Mini | OpenAI | 200K | 100K | 50% | $1.10 | $4.40 | 47 | |
| o1-pro | OpenAI | 200K | 100K | 50% | $150.00 | $600.00 | 12 | |
| Sonar Pro | Perplexity | 200K | 8K | 4% | $3.00 | $15.00 | 32 | |
| Claude 3.7 Sonnet | Anthropic | 200K | 64K | 32% | $3.00 | $15.00 | 32 | |
| Claude 3.7 Sonnet (thinking) | Anthropic | 200K | 64K | 32% | $3.00 | $15.00 | 32 | |
| o3 Mini High | OpenAI | 200K | 100K | 50% | $1.10 | $4.40 | 47 | |
| o3 Mini | OpenAI | 200K | 100K | 50% | $1.10 | $4.40 | 47 | |
| o1 | OpenAI | 200K | 100K | 50% | $15.00 | $60.00 | 19 | |
| Claude 3.5 Haiku | Anthropic | 200K | 8K | 4% | $0.80 | $4.00 | 52 | |
| Claude 3.5 Sonnet | Anthropic | 200K | 8K | 4% | $6.00 | $30.00 | 25 | |
| Claude 3 Haiku | Anthropic | 200K | 4K | 2% | $0.25 | $1.25 | 73 | |
| Composer 2 | Cursor | 200K | 66K | 33% | $0.50 | $2.50 | 61 | |
| Composer 2 Fast | Cursor | 200K | 66K | 33% | $1.50 | $7.50 | 42 | |
| MiniMax M2.5 (free) | MiniMax | 197K | 197K | 100% | Free | Free | 96 | |
| MiniMax M2.5 | MiniMax | 197K | 66K | 33% | $0.20 | $1.17 | 76 | |
| MiniMax M2.1 | MiniMax | 197K | — | — | $0.27 | $0.95 | 72 | |
| MiniMax M2 | MiniMax | 197K | 197K | 100% | $0.26 | $1.00 | 72 | |
| DeepSeek V3.2 Speciale | DeepSeek | 164K | 164K | 100% | $0.40 | $1.20 | 64 | |
| DeepSeek V3.2 | DeepSeek | 164K | — | — | $0.26 | $0.38 | 71 | |
| DeepSeek V3.2 Exp | DeepSeek | 164K | 66K | 40% | $0.27 | $0.41 | 70 | |
| DeepSeek V3.1 Terminus | DeepSeek | 164K | — | — | $0.21 | $0.79 | 74 | |
| R1 0528 | DeepSeek | 164K | 66K | 40% | $0.45 | $2.15 | 62 | |
| Llama Guard 4 12B | Meta | 164K | — | — | $0.18 | $0.18 | 77 | |
| DeepSeek V3 0324 | DeepSeek | 164K | — | — | $0.20 | $0.77 | 75 | |
| DeepSeek V3 | DeepSeek | 164K | 164K | 100% | $0.32 | $0.89 | 68 | |
| Qwen3 Coder 30B A3B Instruct | Alibaba | 160K | 33K | 20% | $0.07 | $0.27 | 86 | |
| Aion-2.0 | aion-labs | 131K | 33K | 25% | $0.80 | $1.60 | 50 | |
| Ministral 3 3B 2512 | Mistral AI | 131K | — | — | $0.10 | $0.10 | 82 | |
| Trinity Mini (free) | arcee-ai | 131K | — | — | Free | Free | 93 | |
| Trinity Mini | arcee-ai | 131K | 131K | 100% | $0.04 | $0.15 | 87 | |
| Kimi K2 Thinking | Moonshot AI | 131K | — | — | $0.47 | $2.00 | 60 | |
| gpt-oss-safeguard-20b | OpenAI | 131K | 66K | 50% | $0.07 | $0.30 | 84 | |
| Nemotron Nano 12B 2 VL | NVIDIA | 131K | — | — | $0.20 | $0.60 | 74 | |
| Qwen3 VL 32B Instruct | Alibaba | 131K | 33K | 25% | $0.10 | $0.42 | 81 | |
| Qwen3 VL 8B Thinking | Alibaba | 131K | 33K | 25% | $0.12 | $1.36 | 80 | |
| Qwen3 VL 8B Instruct | Alibaba | 131K | 33K | 25% | $0.08 | $0.50 | 84 | |
| Llama 3.3 Nemotron Super 49B V1.5 | NVIDIA | 131K | — | — | $0.10 | $0.40 | 82 | |
| ERNIE 4.5 21B A3B Thinking | Baidu | 131K | 66K | 50% | $0.07 | $0.28 | 85 | |
| Qwen3 VL 30B A3B Thinking | Alibaba | 131K | 33K | 25% | $0.13 | $1.56 | 79 | |
| Qwen3 VL 30B A3B Instruct | Alibaba | 131K | 33K | 25% | $0.13 | $0.52 | 79 | |
| Qwen3 VL 235B A22B Thinking | Alibaba | 131K | 33K | 25% | $0.26 | $2.60 | 70 | |
| Tongyi DeepResearch 30B A3B | Alibaba | 131K | 131K | 100% | $0.09 | $0.45 | 83 | |
| Qwen3 Next 80B A3B Thinking | Alibaba | 131K | 33K | 25% | $0.10 | $0.78 | 82 | |
| LongCat Flash Chat | Meituan | 131K | 131K | 100% | $0.20 | $0.80 | 74 | |
| Nemotron Nano 9B V2 | NVIDIA | 131K | — | — | $0.04 | $0.16 | 88 | |
| Kimi K2 0905 | Moonshot AI | 131K | — | — | $0.40 | $2.00 | 63 | |
| Qwen3 30B A3B Thinking 2507 | Alibaba | 131K | 131K | 100% | $0.08 | $0.40 | 84 | |
| Mistral Medium 3.1 | Mistral AI | 131K | — | — | $0.40 | $2.00 | 63 | |
| gpt-oss-120b (free) | OpenAI | 131K | 131K | 100% | Free | Free | 93 | |
| gpt-oss-120b | OpenAI | 131K | — | — | $0.04 | $0.19 | 88 | |
| gpt-oss-20b (free) | OpenAI | 131K | 131K | 100% | Free | Free | 93 | |
| gpt-oss-20b | OpenAI | 131K | 131K | 100% | $0.03 | $0.11 | 89 | |
| Qwen3 235B A22B Thinking 2507 | Alibaba | 131K | — | — | $0.15 | $1.50 | 77 | |
| Devstral Medium | Mistral AI | 131K | — | — | $0.40 | $2.00 | 63 | |
| Devstral Small 1.1 | Mistral AI | 131K | — | — | $0.10 | $0.30 | 82 | |
| Hunyuan A13B Instruct | Tencent | 131K | 131K | 100% | $0.14 | $0.57 | 78 | |
| Grok 3 Mini | xAI | 131K | — | — | $0.30 | $0.50 | 67 | |
| Grok 3 | xAI | 131K | — | — | $3.00 | $15.00 | 31 | |
| Mistral Medium 3 | Mistral AI | 131K | — | — | $0.40 | $2.00 | 63 | |
| Spotlight | arcee-ai | 131K | 66K | 50% | $0.18 | $0.18 | 75 | |
| Maestro Reasoning | arcee-ai | 131K | 32K | 24% | $0.90 | $3.30 | 48 | |
| Virtuoso Large | arcee-ai | 131K | 64K | 49% | $0.75 | $1.20 | 51 | |
| Qwen3 235B A22B | Alibaba | 131K | 8K | 6% | $0.45 | $1.82 | 60 | |
| Grok 3 Mini Beta | xAI | 131K | — | — | $0.30 | $0.50 | 67 | |
| Grok 3 Beta | xAI | 131K | — | — | $3.00 | $15.00 | 31 | |
| Llama 3.1 Nemotron Ultra 253B v1 | NVIDIA | 131K | — | — | $0.60 | $1.80 | 55 | |
| Mistral Small 3.1 24B | Mistral AI | 131K | 131K | 100% | $0.03 | $0.11 | 89 | |
| Gemma 3 4B | Google | 131K | — | — | $0.04 | $0.08 | 88 | |
| Gemma 3 12B | Google | 131K | — | — | $0.04 | $0.13 | 88 | |
| Gemma 3 27B (free) | Google | 131K | 8K | 6% | Free | Free | 93 | |
| Gemma 3 27B | Google | 131K | 16K | 13% | $0.08 | $0.16 | 84 | |
| QwQ 32B | Alibaba | 131K | 131K | 100% | $0.15 | $0.58 | 77 | |
| Llama Guard 3 8B | Meta | 131K | — | — | $0.02 | $0.06 | 90 | |
| Qwen VL Plus | Alibaba | 131K | 8K | 6% | $0.14 | $0.41 | 79 | |
| Aion-1.0 | aion-labs | 131K | 33K | 25% | $4.00 | $8.00 | 28 | |
| Aion-1.0-Mini | aion-labs | 131K | 33K | 25% | $0.70 | $1.40 | 53 | |
| Qwen VL Max | Alibaba | 131K | 33K | 25% | $0.52 | $2.08 | 58 | |
| Qwen-Turbo | Alibaba | 131K | 8K | 6% | $0.03 | $0.13 | 89 | |
| R1 Distill Llama 70B | DeepSeek | 131K | 16K | 13% | $0.70 | $0.80 | 53 | |
| Llama 3.3 70B Instruct | Meta | 131K | 16K | 13% | $0.10 | $0.32 | 82 | |
| Mistral Large 2411 | Mistral AI | 131K | — | — | $2.00 | $6.00 | 36 | |
| Mistral Large 2407 | Mistral AI | 131K | — | — | $2.00 | $6.00 | 36 | |
| Pixtral Large 2411 | Mistral AI | 131K | — | — | $2.00 | $6.00 | 36 | |
| Llama 3.1 Nemotron 70B Instruct | NVIDIA | 131K | 16K | 13% | $1.20 | $1.20 | 44 | |
| Llama 3.2 3B Instruct (free) | Meta | 131K | — | — | Free | Free | 93 | |
| Llama 3.2 11B Vision Instruct | Meta | 131K | 16K | 13% | $0.05 | $0.05 | 87 | |
| Llama 3.1 70B Instruct | Meta | 131K | — | — | $0.40 | $0.40 | 63 | |
| Mistral Nemo | Mistral AI | 131K | 16K | 13% | $0.02 | $0.04 | 90 | |
| Trinity Large Preview (free) | arcee-ai | 131K | — | — | Free | Free | 93 | |
| Granite 4.0 Micro | IBM | 131K | — | — | $0.02 | $0.11 | 91 | |
| Kimi K2 0711 | Moonshot AI | 131K | — | — | $0.55 | $2.20 | 57 | |
| Mercury 2 | Inception | 128K | 50K | 39% | $0.25 | $0.75 | 70 | |
| GPT-5.3 Chat | OpenAI | 128K | 16K | 13% | $1.75 | $14.00 | 38 | |
| Solar Pro 3 | Upstage | 128K | — | — | $0.15 | $0.60 | 77 | |
| GPT Audio | OpenAI | 128K | 16K | 13% | $2.50 | $10.00 | 33 | |
| GPT Audio Mini | OpenAI | 128K | 16K | 13% | $0.60 | $2.40 | 55 | |
| GPT-5.2 Chat | OpenAI | 128K | 16K | 13% | $1.75 | $14.00 | 38 | |
| Cogito v2.1 671B | deepcogito | 128K | — | — | $1.25 | $1.25 | 43 | |
| GPT-5.1 Chat | OpenAI | 128K | 16K | 13% | $1.25 | $10.00 | 43 | |
| Nemotron Nano 12B 2 VL (free) | NVIDIA | 128K | 128K | 100% | Free | Free | 93 | |
| Nemotron Nano 9B V2 (free) | NVIDIA | 128K | — | — | Free | Free | 93 | |
| GPT-4o Audio | OpenAI | 128K | 16K | 13% | $2.50 | $10.00 | 33 | |
| GPT-5 Chat | OpenAI | 128K | 16K | 13% | $1.25 | $10.00 | 43 | |
| UI-TARS 7B | ByteDance | 128K | 2K | 2% | $0.10 | $0.20 | 82 | |
| Mercury | Inception | 128K | 32K | 25% | $0.25 | $0.75 | 70 | |
| Mistral Small 3.2 24B | Mistral AI | 128K | — | — | $0.07 | $0.20 | 84 | |
| Mercury Coder | Inception | 128K | 32K | 25% | $0.25 | $0.75 | 70 | |
| Qwen2.5 VL 32B Instruct | Alibaba | 128K | — | — | $0.20 | $0.60 | 74 | |
| Mistral Small 3.1 24B (free) | Mistral AI | 128K | — | — | Free | Free | 93 | |
| Olmo 2 32B Instruct | Allen AI | 128K | — | — | $0.05 | $0.20 | 87 | |
| GPT-4o-mini Search Preview | OpenAI | 128K | 16K | 13% | $0.15 | $0.60 | 77 | |
| GPT-4o Search Preview | OpenAI | 128K | 16K | 13% | $2.50 | $10.00 | 33 | |
| Sonar Reasoning Pro | Perplexity | 128K | — | — | $2.00 | $8.00 | 36 | |
| Sonar Deep Research | Perplexity | 128K | — | — | $2.00 | $8.00 | 36 | |
| Command R7B (12-2024) | Cohere | 128K | 4K | 3% | $0.04 | $0.15 | 88 | |
| Nova Micro 1.0 | Amazon | 128K | 5K | 4% | $0.04 | $0.14 | 88 | |
| GPT-4o (2024-11-20) | OpenAI | 128K | 16K | 13% | $2.50 | $10.00 | 33 | |
| Command R (08-2024) | Cohere | 128K | 4K | 3% | $0.15 | $0.60 | 77 | |
| Command R+ (08-2024) | Cohere | 128K | 4K | 3% | $2.50 | $10.00 | 33 | |
| GPT-4o (2024-08-06) | OpenAI | 128K | 16K | 13% | $2.50 | $10.00 | 33 | |
| GPT-4o-mini (2024-07-18) | OpenAI | 128K | 16K | 13% | $0.15 | $0.60 | 77 | |
| GPT-4o-mini | OpenAI | 128K | 16K | 13% | $0.15 | $0.60 | 77 | |
| GPT-4o (2024-05-13) | OpenAI | 128K | 4K | 3% | $5.00 | $15.00 | 26 | |
| GPT-4o | OpenAI | 128K | 16K | 13% | $2.50 | $10.00 | 33 | |
| GPT-4o (extended) | OpenAI | 128K | 64K | 50% | $6.00 | $18.00 | 24 | |
| GPT-4 Turbo | OpenAI | 128K | 4K | 3% | $10.00 | $30.00 | 21 | |
| Mistral Large | Mistral AI | 128K | — | — | $2.00 | $6.00 | 36 | |
| GPT-4 Turbo Preview | OpenAI | 128K | 4K | 3% | $10.00 | $30.00 | 21 | |
| GPT-4 Turbo (older v1106) | OpenAI | 128K | 4K | 3% | $10.00 | $30.00 | 21 | |
| Sonar | Perplexity | 127K | — | — | $1.00 | $1.00 | 46 | |
| ERNIE 4.5 VL 424B A47B | Baidu | 123K | 16K | 13% | $0.42 | $1.25 | 61 | |
| ERNIE 4.5 300B A47B | Baidu | 123K | 12K | 10% | $0.28 | $1.10 | 68 | |
| ERNIE 4.5 21B A3B | Baidu | 120K | 8K | 7% | $0.07 | $0.28 | 84 | |
| Llama 3.2 3B Instruct | Meta | 80K | — | — | $0.05 | $0.34 | 83 | |
| MiniMax M2-her | MiniMax | 66K | 2K | 3% | $0.30 | $1.20 | 64 | |
| Olmo 3.1 32B Instruct | Allen AI | 66K | — | — | $0.20 | $0.60 | 69 | |
| Olmo 3.1 32B Think | Allen AI | 66K | 66K | 100% | $0.15 | $0.50 | 73 | |
| Olmo 3 32B Think | Allen AI | 66K | 66K | 100% | $0.15 | $0.50 | 73 | |
| Llama 3.3 70B Instruct (free) | Meta | 66K | — | — | Free | Free | 88 | |
| Mixtral 8x22B Instruct | Mistral AI | 66K | — | — | $2.00 | $6.00 | 34 | |
| WizardLM-2 8x22B | Microsoft | 66K | 8K | 12% | $0.62 | $0.62 | 52 | |
| R1 | DeepSeek | 64K | 16K | 25% | $0.70 | $2.50 | 49 | |
| Llama 3.2 1B Instruct | Meta | 60K | — | — | $0.03 | $0.20 | 84 | |
| Qwen3 4B (free) | Alibaba | 41K | — | — | Free | Free | 84 | |
| Qwen3 30B A3B | Alibaba | 41K | 41K | 100% | $0.08 | $0.28 | 75 | |
| Qwen3 8B | Alibaba | 41K | 8K | 20% | $0.05 | $0.40 | 78 | |
| Qwen3 14B | Alibaba | 41K | 41K | 100% | $0.06 | $0.24 | 77 | |
| Qwen3 32B | Alibaba | 41K | 41K | 100% | $0.08 | $0.24 | 75 | |
| LFM2-24B-A2B | Liquid AI | 33K | — | — | $0.03 | $0.12 | 79 | |
| LFM2.5-1.2B-Thinking (free) | Liquid AI | 33K | — | — | Free | Free | 82 | |
| LFM2.5-1.2B-Instruct (free) | Liquid AI | 33K | — | — | Free | Free | 82 | |
| Mistral Small Creative | Mistral AI | 33K | — | — | $0.10 | $0.30 | 72 | |
| Rnj 1 Instruct | essentialai | 33K | — | — | $0.15 | $0.15 | 68 | |
| LFM2-8B-A1B | Liquid AI | 33K | — | — | $0.01 | $0.02 | 81 | |
| LFM2-2.6B | Liquid AI | 33K | — | — | $0.01 | $0.02 | 81 | |
| DeepSeek V3.1 | DeepSeek | 33K | 7K | 22% | $0.15 | $0.75 | 68 | |
| Gemma 3n 4B | Google | 33K | — | — | $0.02 | $0.04 | 80 | |
| Coder Large | arcee-ai | 33K | — | — | $0.50 | $0.80 | 52 | |
| Qwen2.5 Coder 7B Instruct | Alibaba | 33K | — | — | $0.03 | $0.09 | 79 | |
| Gemma 3 4B (free) | Google | 33K | 8K | 25% | Free | Free | 82 | |
| Gemma 3 12B (free) | Google | 33K | 8K | 25% | Free | Free | 82 | |
| Saba | Mistral AI | 33K | — | — | $0.20 | $0.60 | 65 | |
| Qwen2.5 VL 72B Instruct | Alibaba | 33K | 33K | 100% | $0.80 | $0.80 | 44 | |
| Qwen-Max | Alibaba | 33K | 8K | 25% | $1.04 | $4.16 | 40 | |
| Mistral Small 3 | Mistral AI | 33K | 16K | 50% | $0.05 | $0.08 | 77 | |
| R1 Distill Qwen 32B | DeepSeek | 33K | 33K | 100% | $0.29 | $0.29 | 60 | |
| Qwen2.5 Coder 32B Instruct | Alibaba | 33K | — | — | $0.66 | $1.00 | 47 | |
| Qwen2.5 7B Instruct | Alibaba | 33K | — | — | $0.04 | $0.10 | 78 | |
| Qwen2.5 72B Instruct | Alibaba | 33K | 16K | 50% | $0.12 | $0.39 | 71 | |
| Pixtral 12B | Mistral AI | 33K | — | — | $0.10 | $0.10 | 72 | |
| Qwen2.5-VL 7B Instruct | Alibaba | 33K | — | — | $0.20 | $0.20 | 65 | |
| Llama 3.1 405B (base) | Meta | 33K | 33K | 100% | $4.00 | $4.00 | 25 | |
| Mixtral 8x7B Instruct | Mistral AI | 33K | 16K | 50% | $0.54 | $0.54 | 51 | |
| ERNIE 4.5 VL 28B A3B | Baidu | 30K | 8K | 27% | $0.14 | $0.56 | 68 | |
| GPT-3.5 Turbo 16k | OpenAI | 16K | 4K | 25% | $3.00 | $4.00 | 26 | |
| GPT-3.5 Turbo | OpenAI | 16K | 4K | 25% | $0.50 | $1.50 | 48 | |
| Phi 4 | Microsoft | 16K | 16K | 100% | $0.07 | $0.14 | 70 | |
| Llama 3.1 8B Instruct | Meta | 16K | 16K | 100% | $0.02 | $0.05 | 74 | |
| Gemma 3n 2B (free) | Google | 8K | 2K | 25% | Free | Free | 71 | |
| Gemma 3n 4B (free) | Google | 8K | 2K | 25% | Free | Free | 71 | |
| Gemma 2 27B | Google | 8K | 2K | 25% | $0.65 | $0.65 | 41 | |
| Gemma 2 9B | Google | 8K | — | — | $0.03 | $0.09 | 68 | |
| Llama 3 70B Instruct | Meta | 8K | 8K | 98% | $0.51 | $0.74 | 45 | |
| Llama 3 8B Instruct | Meta | 8K | 16K | 200% | $0.03 | $0.04 | 68 | |
| GPT-4 (older v0314) | OpenAI | 8K | 4K | 50% | $30.00 | $60.00 | 12 | |
| GPT-4 | OpenAI | 8K | 4K | 50% | $30.00 | $60.00 | 12 | |
| Inflection 3 Pi | Inflection | 8K | 1K | 13% | $2.50 | $10.00 | 25 | |
| Inflection 3 Productivity | Inflection | 8K | 1K | 13% | $2.50 | $10.00 | 25 | |
| Llemma 7b | eleutherai | 4K | 4K | 100% | $0.80 | $1.20 | 36 | |
| GPT-3.5 Turbo (older v0613) | OpenAI | 4K | 4K | 100% | $1.00 | $2.00 | 33 | |
| GPT-3.5 Turbo Instruct | OpenAI | 4K | 4K | 100% | $1.50 | $2.00 | 28 | |
| Mistral 7B Instruct v0.1 | Mistral AI | 3K | — | — | $0.11 | $0.19 | 55 | |
| SWE-1.5 | Windsurf | — | — | — | Free | Free | 0 | |
| autofixer-01 | Vercel | — | — | — | Free | Free | 0 | |
| Mellum | JetBrains | — | — | — | Free | Free | 0 | |

| Model | Provider | Input $/1M | Output $/1M | Capabilities |
|---|---|---|---|---|
| GPT-5 Image | OpenAI | $10.00 | $10.00 | |
| GPT-5 Image Mini | OpenAI | $2.50 | $2.00 | |
| Nano Banana Pro (Gemini 3 Pro Image Preview) | Google | $2.00 | $12.00 | |
| Nano Banana 2 (Gemini 3.1 Flash Image Preview) | Google | $0.50 | $3.00 | |
| Nano Banana (Gemini 2.5 Flash Image) | Google | $0.30 | $2.50 | |
| Midjourney v6.1 | Midjourney | Free | Free | |
| DALL-E 3 | OpenAI | Free | $40000.00 | |
| Stable Diffusion 3.5 | Stability AI | Free | $35000.00 | |
| FLUX.1 Pro | Black Forest Labs | Free | $50000.00 | |
| Ideogram 2.0 | Ideogram | Free | $80000.00 | |
| Recraft V3 | Recraft | Free | $40000.00 | |
| Imagen 3 | Google | Free | $40000.00 | |
| Adobe Firefly 3 | Adobe | Free | Free | |
| Leonardo Phoenix | Leonardo AI | Free | Free |
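
The Output Ratio and $/1M columns in the tables above lend themselves to a small worked example. The sketch below is illustrative only: the helper names (`parse_tokens`, `output_ratio_percent`, `request_cost_usd`) are hypothetical, and it assumes that "K" and "M" mean 1,000 and 1,000,000 and that Output Ratio is max output divided by context window. Because the table cells are rounded, ratios recomputed this way can differ by a point from the displayed values. The formula behind the Efficiency column is not stated on the page, so it is not reproduced here.

```python
from typing import Optional


def parse_tokens(cell: str) -> Optional[int]:
    """Convert a table cell such as '400K', '1.1M', or '—' to a token count.

    Assumption: K = 1,000 and M = 1,000,000 (the page shows rounded values).
    """
    cell = cell.strip()
    if cell in ("", "—"):
        return None  # the table uses '—' when a limit is not listed
    multipliers = {"K": 1_000, "M": 1_000_000}
    suffix = cell[-1].upper()
    if suffix in multipliers:
        return round(float(cell[:-1]) * multipliers[suffix])
    return int(cell)


def output_ratio_percent(context: Optional[int], max_output: Optional[int]) -> Optional[int]:
    """Max output as a whole-number percentage of the context window."""
    if not context or max_output is None:
        return None
    return round(100 * max_output / context)


def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_per_million: float, output_per_million: float) -> float:
    """Cost of one request given prices quoted in USD per 1M tokens."""
    return (input_tokens * input_per_million
            + output_tokens * output_per_million) / 1_000_000


if __name__ == "__main__":
    # GPT-5 row: 400K context, 128K max output, $1.25 in, $10.00 out.
    ratio = output_ratio_percent(parse_tokens("400K"), parse_tokens("128K"))
    cost = request_cost_usd(10_000, 2_000, 1.25, 10.00)
    print(f"output ratio: {ratio}%")          # output ratio: 32%
    print(f"cost per request: ${cost:.4f}")   # cost per request: $0.0325
```

For a request with 10,000 input tokens and 2,000 output tokens, the GPT-5 prices listed above work out to about three cents, which is an easy way to sanity-check a budget before comparing rows.
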
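AI speed is usually measured with two metrics: time-to-first-token (TTFT) and tokens-per-second (TPS). TTFT is the delay between sending a request and receiving the first output token. TPS is the rate at which output tokens are generated after that. TTFT matters most for interactive uses such as chat, while TPS dominates total response time for long outputs.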
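
As a rough illustration of these two metrics, the sketch below times any iterable of streamed chunks. It is a minimal example under stated assumptions, not a benchmark harness: `measure_stream` is a hypothetical helper, each chunk is counted as one token (real streaming APIs may pack several tokens into a chunk), and TPS is computed over the interval after the first chunk arrives.

```python
import time
from typing import Callable, Iterable, Optional, Tuple


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[float], Optional[float], int]:
    """Return (ttft_seconds, tokens_per_second, chunk_count) for a stream.

    Assumption: one chunk is treated as one token.
    """
    start = clock()          # the moment the request is considered sent
    first = None             # arrival time of the first chunk
    last = start             # arrival time of the most recent chunk
    count = 0
    for _ in chunks:
        now = clock()
        if first is None:
            first = now
        last = now
        count += 1
    if first is None:
        return None, None, 0  # the stream produced nothing
    ttft = first - start
    generation_time = last - first
    # TPS counts the tokens generated after the first one arrived.
    tps = (count - 1) / generation_time if count > 1 and generation_time > 0 else None
    return ttft, tps, count


if __name__ == "__main__":
    def fake_stream():
        """Stand-in for a streaming response: a pause, then steady output."""
        time.sleep(0.2)
        for word in ["a", "b", "c", "d", "e"]:
            yield word
            time.sleep(0.05)

    ttft, tps, n = measure_stream(fake_stream())
    print(f"TTFT: {ttft:.2f}s, TPS: {tps:.1f}, chunks: {n}")
```

Passing the clock as a parameter keeps the function easy to test with fixed timestamps. In practice the iterable would be the streaming response of whichever client library is in use.
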
Speed varies by provider. Groq-hosted Llama models are among the fastest for inference. Among major providers, Gemini Flash and GPT-4o Mini are consistently fast. Reasoning models such as o3 and R1 are slower by design: they generate intermediate reasoning tokens before answering, trading speed for accuracy.
Smaller, faster models may sacrifice some quality. However, provider optimizations such as quantization and speculative decoding can speed up a model with little or no quality loss. This is one reason the same model runs at different speeds on different providers.