176 AI models with 8K+ output tokens per response. 157 models support 16K+ tokens and 122 support 32K+ — enough to generate full articles, complete code files, or detailed reports in a single response.
A 16K output limit yields roughly 12,000 words (at the common heuristic of ~0.75 words per token), enough for a full blog post or a report chapter. Models with 32K+ can write entire research papers or documentation sets in one shot.
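The token-to-word arithmetic above can be sketched as a tiny helper. The 0.75 words-per-token ratio is a rough heuristic for English prose, not an exact figure; real ratios vary by tokenizer and content.

```python
# Rough token-to-word conversion for English prose.
# ~0.75 words per token is a common rule of thumb, not a guarantee.
WORDS_PER_TOKEN = 0.75

def estimated_words(max_output_tokens: int) -> int:
    """Approximate prose length a given output limit allows."""
    return int(max_output_tokens * WORDS_PER_TOKEN)

print(estimated_words(16_000))  # ~12,000 words
print(estimated_words(32_000))  # ~24,000 words
```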
Generating complete files or modules, or refactoring large codebases, requires high output limits. 8K tokens covers roughly 250 lines of code; 32K covers full application files.
The context window is the model's total input-plus-output capacity; max output is how much it can generate in a single response. A model with a 128K context window may still cap each response at 4K output tokens.
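The distinction can be made concrete with a minimal budget check. The function and numbers are illustrative, assuming only that input and output tokens share one context budget while the output cap is enforced separately per response.

```python
def request_fits(prompt_tokens: int, requested_output: int,
                 context_window: int, max_output: int) -> bool:
    """Check a request against both limits: the per-response output
    cap, and the shared input+output context budget."""
    if requested_output > max_output:
        return False  # blocked by the output cap alone
    return prompt_tokens + requested_output <= context_window

# Hypothetical 128K-context model with a 4K output cap:
print(request_fits(120_000, 4_000, 128_000, 4_000))   # fits
print(request_fits(120_000, 16_000, 128_000, 4_000))  # output cap exceeded
```

A large context window lets the model read a long document, but the output cap still limits how much it can write back in one turn.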
You pay per output token, so longer outputs cost more, but a single long response can be cheaper than stitching together multiple short requests. Budget models priced under $1 per million output tokens make long outputs affordable at scale.
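As a quick sanity check on the economics, here is the per-response cost at a hypothetical budget rate; the $0.60/1M price is an example, not a quote for any specific model.

```python
def output_cost_usd(output_tokens: int, price_per_million: float) -> float:
    """Cost in USD of generated tokens at a per-million-token rate."""
    return output_tokens / 1_000_000 * price_per_million

# One 32K-token report at an assumed $0.60 per 1M output tokens:
print(round(output_cost_usd(32_000, 0.60), 4))  # 0.0192, i.e. under 2 cents
```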