A comprehensive comparison of AI API pricing across 298+ models from 35+ providers. Understand token costs, compare price tiers, and estimate real-world expenses for LLM API usage in 2026.
Figure: how the 298 models break down across pricing tiers, based on output cost per 1M tokens.
| Provider | Models | Avg Input $/1M | Avg Output $/1M |
|---|---|---|---|
| Liquid AI | 5 | $0.010 | $0.032 |
| StepFun | 2 | $0.050 | $0.150 |
| NVIDIA | 8 | $0.199 | $0.320 |
| Allen AI | 7 | $0.139 | $0.343 |
| Microsoft | 2 | $0.340 | $0.380 |
| Meta | 17 | $0.577 | $0.675 |
| Baidu | 5 | $0.196 | $0.694 |
| ByteDance | 4 | $0.131 | $0.725 |
| Inception | 3 | $0.250 | $0.750 |
| arcee-ai | 7 | $0.339 | $0.804 |
| DeepSeek | 12 | $0.346 | $0.978 |
| Alibaba | 51 | $0.223 | $0.996 |
| MiniMax | 6 | $0.287 | $1.28 |
| Mistral AI | 25 | $0.579 | $1.76 |
| Moonshot AI | 5 | $0.494 | $2.18 |
| Google | 27 | $0.548 | $3.45 |
| aion-labs | 3 | $1.83 | $3.67 |
| Amazon | 5 | $0.739 | $3.72 |
| Cohere | 4 | $1.30 | $5.19 |
| xAI | 8 | $1.27 | $6.06 |
| Perplexity | 5 | $2.20 | $9.40 |
| Inflection | 2 | $2.50 | $10.00 |
| Anthropic | 13 | $4.85 | $24.25 |
| OpenAI | 61 | $7.02 | $29.19 |
128 scored models fall under $1/1M output tokens with an average composite score of 44. The top-scoring budget model is Gemini 2.5 Flash Lite Preview 09-2025 at score 65 for just $0.400/1M output.
30 scored models cost $15+/1M output tokens with an average score of 62. Premium models score on average about 40% higher than budget models, but cost at least 15x more per output token ($15+ versus under $1 per 1M). The trade-off is meaningful for high-stakes tasks like code generation and complex reasoning.
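The 40% figure follows directly from the two average scores quoted above. A minimal check, using only those two numbers (the function name is illustrative):

```python
# Average composite scores quoted in the text above.
budget_avg_score = 44    # scored models under $1 per 1M output tokens
premium_avg_score = 62   # scored models at $15+ per 1M output tokens


def relative_uplift(baseline: float, other: float) -> float:
    """Return how much higher `other` is than `baseline`, as a fraction."""
    return (other - baseline) / baseline


uplift = relative_uplift(budget_avg_score, premium_avg_score)
print(f"Premium models score {uplift:.0%} higher on average")
```

The exact ratio is closer to 41%, which the text rounds to 40%.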
Key insight: Price and quality correlate, but diminishing returns are significant. Many models in the $1–$5 range deliver 80–90% of the performance of premium models at a fraction of the cost. For most production workloads, mid-range models offer the best balance.
| Model | Provider | Input $/1M | Output $/1M |
|---|---|---|---|
| Step 3.5 Flash (free) | StepFun | Free | Free |
| Trinity Large Preview (free) | arcee-ai | Free | Free |
| LFM2.5-1.2B-Thinking (free) | Liquid AI | Free | Free |
| LFM2.5-1.2B-Instruct (free) | Liquid AI | Free | Free |
| Nemotron 3 Nano 30B A3B (free) | NVIDIA | Free | Free |
| Trinity Mini (free) | arcee-ai | Free | Free |
| Nemotron Nano 12B 2 VL (free) | NVIDIA | Free | Free |
| Qwen3 VL 30B A3B Thinking | Alibaba | Free | Free |
| Qwen3 VL 235B A22B Thinking | Alibaba | Free | Free |
| Qwen3 Next 80B A3B Instruct (free) | Alibaba | Free | Free |
| Nemotron Nano 9B V2 (free) | NVIDIA | Free | Free |
| gpt-oss-120b (free) | OpenAI | Free | Free |
| gpt-oss-20b (free) | OpenAI | Free | Free |
| Qwen3 235B A22B Thinking 2507 | Alibaba | Free | Free |
| Qwen3 Coder 480B A35B (free) | Alibaba | Free | Free |
| Gemma 3n 2B (free) | Google | Free | Free |
| Gemma 3n 4B (free) | Google | Free | Free |
| Qwen3 4B (free) | Alibaba | Free | Free |
| Mistral Small 3.1 24B (free) | Mistral AI | Free | Free |
| Gemma 3 4B (free) | Google | Free | Free |
| Gemma 3 12B (free) | Google | Free | Free |
| Gemma 3 27B (free) | Google | Free | Free |
| Llama 3.3 70B Instruct (free) | Meta | Free | Free |
| Llama 3.2 3B Instruct (free) | Meta | Free | Free |
| LFM2-8B-A1B | Liquid AI | $0.010 | $0.020 |
| LFM2-2.6B | Liquid AI | $0.010 | $0.020 |
| Gemma 3n 4B | Google | $0.020 | $0.040 |
| Mistral Nemo | Mistral AI | $0.020 | $0.040 |
| Llama 3 8B Instruct | Meta | $0.030 | $0.040 |
| Llama 3.2 11B Vision Instruct | Meta | $0.049 | $0.049 |
| Llama 3.1 8B Instruct | Meta | $0.020 | $0.050 |
| Llama Guard 3 8B | Meta | $0.020 | $0.060 |
| Gemma 3 4B | Google | $0.040 | $0.080 |
| Mistral Small 3 | Mistral AI | $0.050 | $0.080 |
| Qwen2.5 Coder 7B Instruct | Alibaba | $0.030 | $0.090 |
| Gemma 2 9B | Google | $0.030 | $0.090 |
| Ministral 3 3B 2512 | Mistral AI | $0.100 | $0.100 |
| Qwen3 235B A22B Instruct 2507 | Alibaba | $0.071 | $0.100 |
| Qwen2.5 7B Instruct | Alibaba | $0.040 | $0.100 |
| Granite 4.0 Micro | IBM | $0.017 | $0.110 |
| LFM2-24B-A2B | Liquid AI | $0.030 | $0.120 |
| Gemma 3 12B | Google | $0.040 | $0.130 |
| Qwen-Turbo | Alibaba | $0.033 | $0.130 |
| gpt-oss-20b | OpenAI | $0.030 | $0.140 |
| Phi 4 | Microsoft | $0.060 | $0.140 |
| Nova Micro 1.0 | Amazon | $0.035 | $0.140 |
| Rnj 1 Instruct | essentialai | $0.150 | $0.150 |
| Ministral 3 8B 2512 | Mistral AI | $0.150 | $0.150 |
| Trinity Mini | arcee-ai | $0.045 | $0.150 |
| Gemma 3 27B | Google | $0.040 | $0.150 |
| Command R7B (12-2024) | Cohere | $0.037 | $0.150 |
| Nemotron Nano 9B V2 | NVIDIA | $0.040 | $0.160 |
| Mistral Small 3.2 24B | Mistral AI | $0.060 | $0.180 |
| Spotlight | arcee-ai | $0.180 | $0.180 |
| Llama Guard 4 12B | Meta | $0.180 | $0.180 |
| gpt-oss-120b | OpenAI | $0.039 | $0.190 |
| gpt-oss-120b (exacto) | OpenAI | $0.039 | $0.190 |
| Mistral 7B Instruct v0.1 | Mistral AI | $0.110 | $0.190 |
| Molmo2 8B | Allen AI | $0.200 | $0.200 |
| Nemotron 3 Nano 30B A3B | NVIDIA | $0.050 | $0.200 |
| Ministral 3 14B 2512 | Mistral AI | $0.200 | $0.200 |
| Olmo 3 7B Instruct | Allen AI | $0.100 | $0.200 |
| Olmo 3 7B Think | Allen AI | $0.120 | $0.200 |
| UI-TARS 7B | ByteDance | $0.100 | $0.200 |
| Olmo 2 32B Instruct | Allen AI | $0.050 | $0.200 |
| Llama 3.2 1B Instruct | Meta | $0.027 | $0.200 |
| LlamaGuard 2 8B | Meta | $0.200 | $0.200 |
| Qwen2.5 Coder 32B Instruct | Alibaba | $0.200 | $0.200 |
| Qwen2.5-VL 7B Instruct | Alibaba | $0.200 | $0.200 |
| Qwen3 14B | Alibaba | $0.060 | $0.240 |
| Qwen3 32B | Alibaba | $0.080 | $0.240 |
| Nova Lite 1.0 | Amazon | $0.060 | $0.240 |
| Qwen3 Coder 30B A3B Instruct | Alibaba | $0.070 | $0.270 |
| ERNIE 4.5 21B A3B Thinking | Baidu | $0.070 | $0.280 |
| ERNIE 4.5 21B A3B | Baidu | $0.070 | $0.280 |
| Qwen3 30B A3B | Alibaba | $0.080 | $0.280 |
| MiMo-V2-Flash | Xiaomi | $0.090 | $0.290 |
| R1 Distill Qwen 32B | DeepSeek | $0.290 | $0.290 |
| Step 3.5 Flash | StepFun | $0.100 | $0.300 |
| Seed 1.6 Flash | ByteDance | $0.075 | $0.300 |
| Mistral Small Creative | Mistral AI | $0.100 | $0.300 |
| Voxtral Small 24B 2507 | Mistral AI | $0.100 | $0.300 |
| gpt-oss-safeguard-20b | OpenAI | $0.075 | $0.300 |
| Qwen3 30B A3B Instruct 2507 | Alibaba | $0.090 | $0.300 |
| Devstral Small 1.1 | Mistral AI | $0.100 | $0.300 |
| Llama 4 Scout | Meta | $0.080 | $0.300 |
| Gemini 2.0 Flash Lite | Google | $0.075 | $0.300 |
| Llama 3.3 70B Instruct | Meta | $0.100 | $0.320 |
| Qwen3 30B A3B Thinking 2507 | Alibaba | $0.051 | $0.340 |
| Llama 3.2 3B Instruct | Meta | $0.051 | $0.340 |
| Qwen2.5 72B Instruct | Alibaba | $0.120 | $0.390 |
| Seed-2.0-Mini | ByteDance | $0.100 | $0.400 |
| Qwen3.5-Flash | Alibaba | $0.100 | $0.400 |
| DeepSeek V3.2 | DeepSeek | $0.250 | $0.400 |
| Llama 3.3 Nemotron Super 49B V1.5 | NVIDIA | $0.100 | $0.400 |
| Gemini 2.5 Flash Lite Preview 09-2025 | Google | $0.100 | $0.400 |
| GPT-5 Nano | OpenAI | $0.050 | $0.400 |
| Gemini 2.5 Flash Lite | Google | $0.100 | $0.400 |
| Qwen3 8B | Alibaba | $0.050 | $0.400 |
| GPT-4.1 Nano | OpenAI | $0.100 | $0.400 |
| QwQ 32B | Alibaba | $0.150 | $0.400 |
| Gemini 2.0 Flash | Google | $0.100 | $0.400 |
| Llama 3.1 70B Instruct | Meta | $0.400 | $0.400 |
| Qwen VL Plus | Alibaba | $0.137 | $0.410 |
| DeepSeek V3.2 Exp | DeepSeek | $0.270 | $0.410 |
| Qwen3 VL 32B Instruct | Alibaba | $0.104 | $0.416 |
| Tongyi DeepResearch 30B A3B | Alibaba | $0.090 | $0.450 |
| Olmo 3.1 32B Think | Allen AI | $0.150 | $0.500 |
| Olmo 3 32B Think | Allen AI | $0.150 | $0.500 |
| Grok 4.1 Fast | xAI | $0.200 | $0.500 |
| Qwen3 VL 8B Instruct | Alibaba | $0.080 | $0.500 |
| Grok 4 Fast | xAI | $0.200 | $0.500 |
| Grok 3 Mini | xAI | $0.300 | $0.500 |
| Grok 3 Mini Beta | xAI | $0.300 | $0.500 |
| Qwen3 VL 30B A3B Instruct | Alibaba | $0.130 | $0.520 |
| Mixtral 8x7B Instruct | Mistral AI | $0.540 | $0.540 |
| ERNIE 4.5 VL 28B A3B | Baidu | $0.140 | $0.560 |
| Mistral Small 3.1 24B | Mistral AI | $0.350 | $0.560 |
| Hunyuan A13B Instruct | Tencent | $0.140 | $0.570 |
| Solar Pro 3 | Upstage | $0.150 | $0.600 |
| Olmo 3.1 32B Instruct | Allen AI | $0.200 | $0.600 |
| Nemotron Nano 12B 2 VL | NVIDIA | $0.200 | $0.600 |
| Llama 4 Maverick | Meta | $0.150 | $0.600 |
| Qwen2.5 VL 32B Instruct | Alibaba | $0.200 | $0.600 |
| GPT-4o-mini Search Preview | OpenAI | $0.150 | $0.600 |
| Saba | Mistral AI | $0.200 | $0.600 |
| Command R (08-2024) | Cohere | $0.150 | $0.600 |
| GPT-4o-mini (2024-07-18) | OpenAI | $0.150 | $0.600 |
| GPT-4o-mini | OpenAI | $0.150 | $0.600 |
| WizardLM-2 8x22B | Microsoft | $0.620 | $0.620 |
| Gemma 2 27B | Google | $0.650 | $0.650 |
| Llama 3 70B Instruct | Meta | $0.510 | $0.740 |
| Mercury 2 | Inception | $0.250 | $0.750 |
| Qwen3 Coder Next | Alibaba | $0.120 | $0.750 |
| DeepSeek V3.1 | DeepSeek | $0.150 | $0.750 |
| Mercury | Inception | $0.250 | $0.750 |
| Mercury Coder | Inception | $0.250 | $0.750 |
| DeepSeek V3 0324 | DeepSeek | $0.200 | $0.770 |
| Qwen Plus 0728 (thinking) | Alibaba | $0.260 | $0.780 |
| Qwen Plus 0728 | Alibaba | $0.260 | $0.780 |
| DeepSeek V3.1 Terminus (exacto) | DeepSeek | $0.210 | $0.790 |
| DeepSeek V3.1 Terminus | DeepSeek | $0.210 | $0.790 |
| LongCat Flash Chat | Meituan | $0.200 | $0.800 |
| Coder Large | arcee-ai | $0.500 | $0.800 |
| Qwen2.5 VL 72B Instruct | Alibaba | $0.800 | $0.800 |
| R1 Distill Llama 70B | DeepSeek | $0.700 | $0.800 |
| KAT-Coder-Pro V1 | Kuaishou | $0.207 | $0.828 |
| Qwen3 VL 235B A22B Instruct | Alibaba | $0.200 | $0.880 |
| DeepSeek V3 | DeepSeek | $0.320 | $0.890 |
| Codestral 2508 | Mistral AI | $0.300 | $0.900 |
| MiniMax M2.1 | MiniMax | $0.270 | $0.950 |
| Qwen3 Coder Flash | Alibaba | $0.195 | $0.975 |
| MiniMax M2 | MiniMax | $0.255 | $1.00 |
| Qwen3 Coder 480B A35B | Alibaba | $0.220 | $1.00 |
| Sonar | Perplexity | $1.00 | $1.00 |
| Qwen3 Next 80B A3B Instruct | Alibaba | $0.090 | $1.10 |
| ERNIE 4.5 300B A47B | Baidu | $0.280 | $1.10 |
| MiniMax-01 | MiniMax | $0.200 | $1.10 |
| MiniMax M2.5 | MiniMax | $0.295 | $1.20 |
| MiniMax M2-her | MiniMax | $0.300 | $1.20 |
| DeepSeek V3.2 Speciale | DeepSeek | $0.400 | $1.20 |
| Qwen3 Next 80B A3B Thinking | Alibaba | $0.150 | $1.20 |
| Virtuoso Large | arcee-ai | $0.750 | $1.20 |
| Llemma 7b | eleutherai | $0.800 | $1.20 |
| Qwen-Plus | Alibaba | $0.400 | $1.20 |
| Llama 3.1 Nemotron 70B Instruct | NVIDIA | $1.20 | $1.20 |
| Cogito v2.1 671B | deepcogito | $1.25 | $1.25 |
| ERNIE 4.5 VL 424B A47B | Baidu | $0.420 | $1.25 |
| Claude 3 Haiku | Anthropic | $0.250 | $1.25 |
| Qwen3.5-35B-A3B | Alibaba | $0.163 | $1.30 |
| Qwen3 VL 8B Thinking | Alibaba | $0.117 | $1.36 |
| Aion-1.0-Mini | aion-labs | $0.700 | $1.40 |
| Gemini 3.1 Flash Lite Preview | Google | $0.250 | $1.50 |
| Mistral Large 3 2512 | Mistral AI | $0.500 | $1.50 |
| Grok Code Fast 1 | xAI | $0.200 | $1.50 |
| GPT-3.5 Turbo | OpenAI | $0.500 | $1.50 |
| Qwen3.5-27B | Alibaba | $0.195 | $1.56 |
| Qwen3.5 Plus 2026-02-15 | Alibaba | $0.260 | $1.56 |
| Aion-2.0 | aion-labs | $0.800 | $1.60 |
| GPT-4.1 Mini | OpenAI | $0.400 | $1.60 |
| Qwen3 Coder 480B A35B (exacto) | Alibaba | $0.220 | $1.80 |
| Qwen3 235B A22B | Alibaba | $0.455 | $1.82 |
| Seed 1.6 | ByteDance | $0.250 | $2.00 |
| Devstral 2 2512 | Mistral AI | $0.400 | $2.00 |
| GPT-5.1-Codex-Mini | OpenAI | $0.250 | $2.00 |
| Kimi K2 Thinking | Moonshot AI | $0.470 | $2.00 |
| GPT-5 Image Mini | OpenAI | $2.50 | $2.00 |
| Kimi K2 0905 | Moonshot AI | $0.400 | $2.00 |
| Mistral Medium 3.1 | Mistral AI | $0.400 | $2.00 |
| GPT-5 Mini | OpenAI | $0.250 | $2.00 |
| Devstral Medium | Mistral AI | $0.400 | $2.00 |
| Mistral Medium 3 | Mistral AI | $0.400 | $2.00 |
| GPT-3.5 Turbo (older v0613) | OpenAI | $1.00 | $2.00 |
| GPT-3.5 Turbo Instruct | OpenAI | $1.50 | $2.00 |
| Qwen3.5-122B-A10B | Alibaba | $0.260 | $2.08 |
| R1 0528 | DeepSeek | $0.450 | $2.15 |
| Kimi K2.5 | Moonshot AI | $0.450 | $2.20 |
| Kimi K2 0711 | Moonshot AI | $0.550 | $2.20 |
| MiniMax M1 | MiniMax | $0.400 | $2.20 |
| Qwen3.5 397B A17B | Alibaba | $0.390 | $2.34 |
| GPT Audio Mini | OpenAI | $0.600 | $2.40 |
| Nova 2 Lite | Amazon | $0.300 | $2.50 |
| Nano Banana (Gemini 2.5 Flash Image) | Google | $0.300 | $2.50 |
| Kimi K2 0905 (exacto) | Moonshot AI | $0.600 | $2.50 |
| Gemini 2.5 Flash | Google | $0.300 | $2.50 |
| R1 | DeepSeek | $0.700 | $2.50 |
| Nano Banana 2 (Gemini 3.1 Flash Image Preview) | Google | $0.500 | $3.00 |
| Gemini 3 Flash Preview | Google | $0.500 | $3.00 |
| Qwen VL Max | Alibaba | $0.800 | $3.20 |
| Nova Pro 1.0 | Amazon | $0.800 | $3.20 |
| Qwen3 Coder Plus | Alibaba | $0.650 | $3.25 |
| Maestro Reasoning | arcee-ai | $0.900 | $3.30 |
| Qwen3 Max Thinking | Alibaba | $0.780 | $3.90 |
| Claude 3.5 Haiku | Anthropic | $0.800 | $4.00 |
| Llama 3.1 405B (base) | Meta | $4.00 | $4.00 |
| Llama 3.1 405B Instruct | Meta | $4.00 | $4.00 |
| GPT-3.5 Turbo 16k | OpenAI | $3.00 | $4.00 |
| Qwen-Max | Alibaba | $1.04 | $4.16 |
| o4 Mini High | OpenAI | $1.10 | $4.40 |
| o4 Mini | OpenAI | $1.10 | $4.40 |
| o3 Mini High | OpenAI | $1.10 | $4.40 |
| o3 Mini | OpenAI | $1.10 | $4.40 |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 |
| Palmyra X5 | Writer | $0.600 | $6.00 |
| Qwen3 Max | Alibaba | $1.20 | $6.00 |
| Mistral Large 2411 | Mistral AI | $2.00 | $6.00 |
| Mistral Large 2407 | Mistral AI | $2.00 | $6.00 |
| Pixtral Large 2411 | Mistral AI | $2.00 | $6.00 |
| Mixtral 8x22B Instruct | Mistral AI | $2.00 | $6.00 |
| Mistral Large | Mistral AI | $2.00 | $6.00 |
| o4 Mini Deep Research | OpenAI | $2.00 | $8.00 |
| Jamba Large 1.7 | AI21 Labs | $2.00 | $8.00 |
| o3 | OpenAI | $2.00 | $8.00 |
| GPT-4.1 | OpenAI | $2.00 | $8.00 |
| Sonar Reasoning Pro | Perplexity | $2.00 | $8.00 |
| Sonar Deep Research | Perplexity | $2.00 | $8.00 |
| Aion-1.0 | aion-labs | $4.00 | $8.00 |
| GPT Audio | OpenAI | $2.50 | $10.00 |
| GPT-5.1-Codex-Max | OpenAI | $1.25 | $10.00 |
| GPT-5.1 | OpenAI | $1.25 | $10.00 |
| GPT-5.1 Chat | OpenAI | $1.25 | $10.00 |
| GPT-5.1-Codex | OpenAI | $1.25 | $10.00 |
| GPT-5 Image | OpenAI | $10.00 | $10.00 |
| GPT-5 Codex | OpenAI | $1.25 | $10.00 |
| GPT-4o Audio | OpenAI | $2.50 | $10.00 |
| GPT-5 Chat | OpenAI | $1.25 | $10.00 |
| GPT-5 | OpenAI | $1.25 | $10.00 |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 |
| Gemini 2.5 Pro Preview 06-05 | Google | $1.25 | $10.00 |
| Gemini 2.5 Pro Preview 05-06 | Google | $1.25 | $10.00 |
| Command A | Cohere | $2.50 | $10.00 |
| GPT-4o Search Preview | OpenAI | $2.50 | $10.00 |
| GPT-4o (2024-11-20) | OpenAI | $2.50 | $10.00 |
| Inflection 3 Pi | Inflection | $2.50 | $10.00 |
| Inflection 3 Productivity | Inflection | $2.50 | $10.00 |
| Command R+ (08-2024) | Cohere | $2.50 | $10.00 |
| GPT-4o (2024-08-06) | OpenAI | $2.50 | $10.00 |
| GPT-4o | OpenAI | $2.50 | $10.00 |
| Gemini 3.1 Pro Preview Custom Tools | Google | $2.00 | $12.00 |
| Gemini 3.1 Pro Preview | Google | $2.00 | $12.00 |
| Nano Banana Pro (Gemini 3 Pro Image Preview) | Google | $2.00 | $12.00 |
| Gemini 3 Pro Preview | Google | $2.00 | $12.00 |
| Nova Premier 1.0 | Amazon | $2.50 | $12.50 |
| GPT-5.3 Chat | OpenAI | $1.75 | $14.00 |
| GPT-5.3-Codex | OpenAI | $1.75 | $14.00 |
| GPT-5.2-Codex | OpenAI | $1.75 | $14.00 |
| GPT-5.2 Chat | OpenAI | $1.75 | $14.00 |
| GPT-5.2 | OpenAI | $1.75 | $14.00 |
| GPT-5.4 | OpenAI | $2.50 | $15.00 |
| Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 |
| Sonar Pro Search | Perplexity | $3.00 | $15.00 |
| Claude Sonnet 4.5 | Anthropic | $3.00 | $15.00 |
| Grok 4 | xAI | $3.00 | $15.00 |
| Grok 3 | xAI | $3.00 | $15.00 |
| Claude Sonnet 4 | Anthropic | $3.00 | $15.00 |
| Grok 3 Beta | xAI | $3.00 | $15.00 |
| Sonar Pro | Perplexity | $3.00 | $15.00 |
| Claude 3.7 Sonnet | Anthropic | $3.00 | $15.00 |
| Claude 3.7 Sonnet (thinking) | Anthropic | $3.00 | $15.00 |
| GPT-4o (2024-05-13) | OpenAI | $5.00 | $15.00 |
| GPT-4o (extended) | OpenAI | $6.00 | $18.00 |
| Claude Opus 4.6 | Anthropic | $5.00 | $25.00 |
| Claude Opus 4.5 | Anthropic | $5.00 | $25.00 |
| Claude 3.5 Sonnet | Anthropic | $6.00 | $30.00 |
| GPT-4 Turbo | OpenAI | $10.00 | $30.00 |
| GPT-4 Turbo Preview | OpenAI | $10.00 | $30.00 |
| GPT-4 Turbo (older v1106) | OpenAI | $10.00 | $30.00 |
| o3 Deep Research | OpenAI | $10.00 | $40.00 |
| o1 | OpenAI | $15.00 | $60.00 |
| GPT-4 (older v0314) | OpenAI | $30.00 | $60.00 |
| GPT-4 | OpenAI | $30.00 | $60.00 |
| Claude Opus 4.1 | Anthropic | $15.00 | $75.00 |
| Claude Opus 4 | Anthropic | $15.00 | $75.00 |
| o3 Pro | OpenAI | $20.00 | $80.00 |
| GPT-5 Pro | OpenAI | $15.00 | $120.00 |
| GPT-5.2 Pro | OpenAI | $21.00 | $168.00 |
| GPT-5.4 Pro | OpenAI | $30.00 | $180.00 |
| o1-pro | OpenAI | $150.00 | $600.00 |
AI APIs charge per token, not per request. A token is roughly 3/4 of a word in English. For example, the sentence "Hello, how are you?" is about 6 tokens. Prices are quoted per million tokens (1M tokens is approximately 750,000 words).
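The 3/4-word rule of thumb can be turned into a rough estimator. This is only a sketch: it counts whitespace-separated words and assumes English text, whereas real tokenizers differ by model and will give different counts. The function names are illustrative.

```python
import math

# Inverse of the "a token is roughly 3/4 of a word" rule of thumb.
TOKENS_PER_WORD = 4 / 3


def estimate_tokens(text: str) -> int:
    """Rough token estimate from a whitespace word count (English only)."""
    words = len(text.split())
    return math.ceil(words * TOKENS_PER_WORD)


def words_per_million_tokens() -> int:
    """Approximate number of English words that fit in 1M tokens."""
    return round(1_000_000 / TOKENS_PER_WORD)


print(estimate_tokens("Hello, how are you?"))  # 4 words -> 6 tokens
print(words_per_million_tokens())              # 750000
```

For billing-grade numbers, use the token counts reported in the provider's API response rather than an estimate like this.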
Input tokens are what you send to the model (prompts, context, system instructions). Output tokens are what the model generates. Output tokens are typically 2–5x more expensive than input tokens because the model generates them one at a time, whereas it processes the whole input in a single pass.
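Because input and output are priced separately, the cost of a single request is the sum of two terms. A minimal sketch, using the GPT-4o-mini prices from the table above ($0.150 input, $0.600 output per 1M tokens); the function name and the 800/400 token split are illustrative:

```python
def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Cost in dollars of one request, given prices per 1M tokens."""
    input_cost = input_tokens * input_price_per_m
    output_cost = output_tokens * output_price_per_m
    return (input_cost + output_cost) / 1_000_000


# 800 input + 400 output tokens at $0.150 / $0.600 per 1M tokens.
cost = request_cost(800, 400, 0.150, 0.600)
output_share = (400 * 0.600) / (800 * 0.150 + 400 * 0.600)

print(f"${cost:.5f} per request")                     # $0.00036 per request
print(f"{output_share:.0%} of the cost is output")    # 67% of the cost is output
```

In this example the output is only a third of the tokens but about two-thirds of the bill, which is why trimming response length often saves more than trimming prompts.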
Some providers offer batch pricing at a 50% discount for non-time-sensitive workloads. Batch requests are queued and processed within a 24-hour window, making them ideal for data processing, content generation pipelines, and evaluation runs.
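To see what the discount means for a whole job, the sketch below prices a hypothetical nightly run of 10,000 documents at $1.25 input and $10.00 output per 1M tokens. The job size, token counts, and function names are assumptions for illustration, and the 50% figure is the one quoted above; check each provider's terms before relying on it.

```python
BATCH_DISCOUNT = 0.50  # the 50% figure quoted above; varies by provider


def job_cost(
    n_requests: int,
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Standard (non-batch) cost in dollars of a job of identical requests."""
    per_request = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return n_requests * per_request / 1_000_000


def batch_cost(standard_cost: float, discount: float = BATCH_DISCOUNT) -> float:
    """Apply a fractional batch discount to a standard cost."""
    if not 0 <= discount <= 1:
        raise ValueError("discount must be between 0 and 1")
    return standard_cost * (1 - discount)


# Hypothetical job: 10,000 documents, 2,000 input + 1,500 output tokens each.
standard = job_cost(10_000, 2_000, 1_500, 1.25, 10.00)
print(f"standard: ${standard:.2f}")             # standard: $175.00
print(f"batch:    ${batch_cost(standard):.2f}")  # batch:    $87.50
```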
Providers like Anthropic and OpenAI offer prompt caching, which reduces input costs by up to 90% for repeated prefixes. If you send the same system prompt across requests, cached tokens are charged at a fraction of the standard rate.
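The effect of caching on input cost can be sketched as follows. The example assumes the full 90% discount on cached tokens, a 2,000-token shared system prompt, 300 fresh tokens per request, and a $3.00 per 1M input price (the Claude Sonnet 4.6 rate in the table above). It deliberately ignores any cache-write surcharge or expiry rules a provider may apply, so treat it as an upper bound on savings.

```python
def input_cost_with_cache(
    cached_tokens: int,
    fresh_tokens: int,
    input_price_per_m: float,
    cache_discount: float = 0.90,  # "up to 90%" per the text; an assumption here
) -> float:
    """Input cost in dollars when part of the prompt is served from cache."""
    cached_price = input_price_per_m * (1 - cache_discount)
    total = cached_tokens * cached_price + fresh_tokens * input_price_per_m
    return total / 1_000_000


# Illustrative request: 2,000-token cached system prompt + 300 fresh tokens.
uncached = input_cost_with_cache(0, 2_300, 3.00)
cached = input_cost_with_cache(2_000, 300, 3.00)
savings = 1 - cached / uncached

print(f"without cache: ${uncached:.4f}")  # without cache: $0.0069
print(f"with cache:    ${cached:.4f}")    # with cache:    $0.0015
print(f"input savings: {savings:.0%}")    # input savings: 78%
```

The overall saving is below 90% because only the repeated prefix is discounted; the fresh part of each prompt is still billed at the standard rate.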
Estimated daily costs for common use cases across different price points.
| Workload | Input tokens per request | Output tokens per request |
|---|---|---|
| 500 conversations/day | ~800 | ~400 |
| 100 articles/day | ~2,000 | ~1,500 |
| 1,000 completions/day | ~500 | ~200 |
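For the three workloads listed above, daily cost can be estimated from the per-request token counts and a model's price pair. The sketch below uses three price points taken from the model table (GPT-4o-mini, Gemini 2.5 Flash, Claude Sonnet 4.6); the choice of models and the helper names are illustrative, and the token counts are the approximate figures given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    requests_per_day: int
    input_tokens: int   # per request
    output_tokens: int  # per request


# Workloads as described in the text above.
WORKLOADS = {
    "500 conversations/day": Workload(500, 800, 400),
    "100 articles/day": Workload(100, 2_000, 1_500),
    "1,000 completions/day": Workload(1_000, 500, 200),
}

# (input $/1M, output $/1M), taken from the model table above.
PRICES = {
    "GPT-4o-mini": (0.15, 0.60),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Claude Sonnet 4.6": (3.00, 15.00),
}


def daily_cost(w: Workload, input_price: float, output_price: float) -> float:
    """Estimated dollars per day for a workload at the given prices per 1M tokens."""
    daily_input = w.requests_per_day * w.input_tokens
    daily_output = w.requests_per_day * w.output_tokens
    return (daily_input * input_price + daily_output * output_price) / 1_000_000


for workload_name, workload in WORKLOADS.items():
    for model_name, (in_price, out_price) in PRICES.items():
        cost = daily_cost(workload, in_price, out_price)
        print(f"{workload_name:<24} {model_name:<20} ${cost:>6.2f}/day")
```

For example, the 500-conversation workload comes to about $0.18/day on GPT-4o-mini and $4.20/day on Claude Sonnet 4.6. These estimates exclude batch and caching discounts.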