Warp is an AI-powered terminal with built-in command suggestions and natural-language-to-shell translation. Fast, cheap models with strong instruction following work best.
Scored by: coding benchmarks (50%), capability match (25%), price (15%), context (10%).
| # | Model | Benchmark | Score | Output $/M |
|---|---|---|---|---|
| 1 | Grok 4.1 Fast | Arena Elo: 1473 | 90 | $0.500 |
| 2 | Nemotron 3 Super (free) | | 87 | Free |
| 3 | MiniMax M2.5 (free) | | 87 | Free |
| 4 | Llama 4 Maverick | HumanEval: 89.5% | 87 | $0.600 |
| 5 | Gemini 2.0 Flash | HumanEval: 89.4% | 87 | $0.400 |
| 6 | GPT-5.4 Mini | | 86 | $4.50 |
| 7 | Grok 4.20 Beta | Arena Elo: 1496 | 86 | $6.00 |
| 8 | DeepSeek V3.2 | Arena Elo: 1423 | 86 | $0.380 |
| 9 | Nemotron Nano 12B 2 VL (free) | | 86 | Free |
| 10 | DeepSeek V3.2 Exp | Arena Elo: 1424 | 86 | $0.410 |
| 11 | Grok 4 Fast | Arena Elo: 1422 | 86 | $0.500 |
| 12 | Llama 3.3 70B Instruct | HumanEval: 88.4% | 86 | $0.320 |
| 13 | GPT-4o-mini | HumanEval: 87.2% | 86 | $0.600 |
| 14 | Seed-2.0-Mini | | 85 | $0.400 |
| 15 | Gemini 3.1 Pro Preview | Arena Elo: 1492 | 85 | $12.00 |
| 16 | Qwen3.5 397B A17B | Arena Elo: 1450 | 85 | $2.34 |
| 17 | Seed 1.6 Flash | | 85 | $0.300 |
| 18 | Gemini 3 Flash Preview | HumanEval: 92% | 85 | $3.00 |
| 19 | Qwen3 VL 235B A22B Instruct | Arena Elo: 1416 | 85 | $0.880 |
| 20 | DeepSeek V3.1 Terminus | Arena Elo: 1417 | 85 | $0.790 |
| 21 | DeepSeek V3.1 | Arena Elo: 1419 | 85 | $0.750 |
| 22 | Gemini 3.1 Flash Lite Preview | Arena Elo: 1437 | 84 | $1.50 |
| 23 | Qwen3.5-Flash | Arena Elo: 1400 | 84 | $0.260 |
| 24 | Step 3.5 Flash (free) | | 84 | Free |
| 25 | GPT-5.2 Chat | Arena Elo: 1481 | 84 | $14.00 |
| 26 | Gemini 2.5 Flash Lite Preview 09-2025 | | 84 | $0.400 |
| 27 | LongCat Flash Chat | Arena Elo: 1401 | 84 | $0.800 |
| 28 | DeepSeek V3 0324 | HumanEval: 84.5% | 84 | $0.770 |
| 29 | Mercury 2 | | 83 | $0.750 |
| 30 | Step 3.5 Flash | Arena Elo: 1389 | 83 | $0.300 |
| 31 | MiMo-V2-Flash | | 83 | $0.290 |
| 32 | Trinity Mini | | 83 | $0.150 |
| 33 | gpt-oss-safeguard-20b | | 83 | $0.300 |
| 34 | Tongyi DeepResearch 30B A3B | | 83 | $0.450 |
| 35 | Qwen Plus 0728 (thinking) | | 83 | $0.780 |
| 36 | Gemini 2.5 Flash Lite | | 83 | $0.400 |
| 37 | DeepSeek V3 | HumanEval: 82.6% | 83 | $0.890 |
| 38 | Claude 3.5 Haiku | HumanEval: 88.1% | 83 | $4.00 |
| 39 | MiMo-V2-Omni | | 82 | $2.00 |
| 40 | MiMo-V2-Pro | | 82 | $3.00 |
| 41 | GPT-5.4 Nano | | 82 | $1.25 |
| 42 | Mistral Small 4 | | 82 | $0.600 |
| 43 | Seed-2.0-Lite | | 82 | $2.00 |
| 44 | Qwen3.5-9B | | 82 | $0.150 |
| 45 | Qwen3.5-27B | Arena Elo: 1410 | 82 | $1.56 |
| 46 | Qwen3.5-122B-A10B | Arena Elo: 1419 | 82 | $2.08 |
| 47 | Qwen3.5 Plus 2026-02-15 | | 82 | $1.56 |
| 48 | Kimi K2.5 | | 82 | $2.20 |
| 49 | Seed 1.6 | | 82 | $2.00 |
| 50 | GPT-5.1 | Arena Elo: 1456 | 82 | $10.00 |
Based on our analysis of coding benchmarks, capability matching, and pricing, Grok 4.1 Fast currently ranks #1 for Warp. Rankings are updated hourly using real benchmark data.
We score models using a weighted formula: coding benchmarks such as SWE-bench and HumanEval (50%), capability match for Warp's requirements (25%), price (15%), and context window size (10%). Only models with the capabilities Warp needs are included.
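The weighted formula can be sketched as a simple weighted sum. The component scores below are illustrative assumptions (each normalized to 0–100), not real data from our pipeline:

```python
# Sketch of the weighted scoring formula described above.
# Weights come from the article; component values are made up.

WEIGHTS = {
    "coding": 0.50,      # coding benchmarks (SWE-bench, HumanEval)
    "capability": 0.25,  # capability match for Warp's requirements
    "price": 0.15,       # affordability (higher = cheaper)
    "context": 0.10,     # context window size
}

def overall_score(components: dict) -> int:
    """Weighted sum of 0-100 component scores, rounded to an integer."""
    return round(sum(WEIGHTS[k] * components[k] for k in WEIGHTS))

# Hypothetical component scores for a fast, cheap model:
example = {"coding": 90, "capability": 92, "price": 95, "context": 75}
print(overall_score(example))  # 0.5*90 + 0.25*92 + 0.15*95 + 0.1*75 = 89.75 → 90
```

Because the coding weight dominates, a model with strong benchmark results can rank highly even at a mid-range price.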
We currently track 303 AI models compatible with Warp. This includes models from OpenAI, Anthropic, Google, DeepSeek, and other providers accessible via API.
Many open-source models are compatible with Warp through API providers like OpenRouter, Together AI, and Groq. Check our rankings to see which open-source models perform best.
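Providers like OpenRouter expose these models through an OpenAI-compatible chat completions API. A minimal sketch of building such a request, assuming the model slug and API key shown are placeholders you would replace with your own:

```python
# Hedged sketch: constructing a request for OpenRouter's OpenAI-compatible
# chat completions endpoint. Model slug and key below are placeholders;
# check the provider's catalog for current model IDs.
import json

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(model: str, prompt: str, api_key: str):
    """Return (headers, payload) for a chat completion request."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return headers, payload

headers, payload = build_request(
    "deepseek/deepseek-chat",  # example open-weight model slug (assumption)
    "Translate to a shell command: list files modified today",
    "sk-or-placeholder",       # placeholder API key
)
print(json.dumps(payload, indent=2))
```

The same payload shape works across OpenAI-compatible providers, so switching models is usually just a matter of changing the `model` string and base URL.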
Rankings refresh hourly. We monitor benchmark scores, pricing changes, and new model releases to keep recommendations current.