Compare open-source and proprietary models across performance, pricing, capabilities, and stability. Tracking 293 models to help you decide which approach best fits your needs.
| Metric | Open Source | Proprietary |
|---|---|---|
| Model Count | 178 | 115 |
| Avg Score | 44.6 | 56.1 |
| Median Score | 46.0 | 57.5 |
| Best Score | 68.6 (Qwen3 VL 30B A3B Thinking) | 90.9 (GPT-5.4 Pro) |
| Avg Cost ($/1M) | $2.09 | $10.64 |
| Free Models | 24 | 0 |
| Avg Context Window | 141K | 449K |
| Stable Models % | 100.0% | 100.0% |
| Fragile Models % | 0.0% | 0.0% |
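The per-group metrics above (average, median, best score, free-model count) follow directly from per-model records. A minimal sketch of that aggregation, using hypothetical field names and illustrative sample values rather than the real tracker data:

```python
from statistics import mean, median

# Illustrative records only; field names and prices are assumptions,
# not the tracker's actual schema or data.
models = [
    {"name": "Qwen3 VL 30B A3B Thinking", "open_source": True,  "score": 68.6, "price_per_1m": 0.0},
    {"name": "GPT-5.4 Pro",               "open_source": False, "score": 90.9, "price_per_1m": 21.0},
    {"name": "Gemma 3 27B",               "open_source": True,  "score": 55.9, "price_per_1m": 0.0},
]

def summarize(models, open_source):
    """Compute one group's row of the comparison table."""
    group = [m for m in models if m["open_source"] == open_source]
    scores = [m["score"] for m in group]
    return {
        "model_count": len(group),
        "avg_score": round(mean(scores), 1),
        "median_score": round(median(scores), 1),
        "best_score": max(scores),
        "avg_cost": round(mean(m["price_per_1m"] for m in group), 2),
        "free_models": sum(m["price_per_1m"] == 0 for m in group),
    }
```

With the full 293-model dataset, `summarize(models, True)` and `summarize(models, False)` would reproduce the two columns above.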
Top 20 open-source models by score:

| # | Model | Provider | Score |
|---|---|---|---|
| 1 | Qwen3 VL 30B A3B Thinking | Alibaba | 69 |
| 2 | Qwen3 VL 235B A22B Thinking | Alibaba | 69 |
| 3 | Gemini 3.1 Pro Preview Custom Tools | Google | 68 |
| 4 | Nemotron Nano 12B 2 VL (free) | NVIDIA | 64 |
| 5 | Qwen3.5-Flash | Alibaba | 62 |
| 6 | Qwen3.5-122B-A10B | Alibaba | 61 |
| 7 | Qwen3.5 397B A17B | Alibaba | 61 |
| 8 | Qwen3.5-35B-A3B | Alibaba | 61 |
| 9 | Qwen3.5-27B | Alibaba | 61 |
| 10 | Kimi K2.5 | Moonshot AI | 59 |
| 11 | Step 3.5 Flash (free) | StepFun | 58 |
| 12 | Qwen3 VL 8B Thinking | Alibaba | 58 |
| 13 | Qwen3 235B A22B Thinking 2507 | Alibaba | 57 |
| 14 | gpt-oss-120b (free) | OpenAI | 56 |
| 15 | gpt-oss-20b (free) | OpenAI | 56 |
| 16 | Gemma 3 27B (free) | Google | 56 |
| 17 | MiniMax M2.5 | MiniMax | 54 |
| 18 | Trinity Large Preview (free) | arcee-ai | 54 |
| 19 | Trinity Mini (free) | arcee-ai | 54 |
| 20 | MiniMax M2 | MiniMax | 54 |
Top 20 proprietary models by score:

| # | Model | Provider | Score |
|---|---|---|---|
| 1 | GPT-5.4 Pro | OpenAI | 91 |
| 2 | GPT-5.2 Pro | OpenAI | 90 |
| 3 | GPT-5 Pro | OpenAI | 90 |
| 4 | o3 Pro | OpenAI | 82 |
| 5 | Claude Opus 4.1 | Anthropic | 81 |
| 6 | o1-pro | OpenAI | 77 |
| 7 | Claude Opus 4 | Anthropic | 76 |
| 8 | o3 Deep Research | OpenAI | 74 |
| 9 | Claude Opus 4.6 | Anthropic | 71 |
| 10 | Claude Opus 4.5 | Anthropic | 70 |
| 11 | GPT-5.4 | OpenAI | 70 |
| 12 | Claude Sonnet 4.5 | Anthropic | 69 |
| 13 | GPT-5.2 | OpenAI | 68 |
| 14 | Gemini 3.1 Pro Preview | Google | 68 |
| 15 | Gemini 3 Pro Preview | Google | 68 |
| 16 | Claude Sonnet 4.6 | Anthropic | 68 |
| 17 | GPT-5.1 | OpenAI | 67 |
| 18 | GPT-5.3-Codex | OpenAI | 67 |
| 19 | GPT-5.2-Codex | OpenAI | 67 |
| 20 | GPT-5 | OpenAI | 67 |
| Capability | Open Source (of 178) | Proprietary (of 115) |
|---|---|---|
| Vision | 50 (28.1%) | 75 (65.2%) |
| Function Calling | 122 (68.5%) | 95 (82.6%) |
| Streaming | 178 (100.0%) | 115 (100.0%) |
| JSON Mode | 132 (74.2%) | 94 (81.7%) |
| Reasoning | 67 (37.6%) | 61 (53.0%) |
| Web Search | 5 (2.8%) | 50 (43.5%) |
| Image Output | 0 (0.0%) | 0 (0.0%) |
Open source leads on model count, free-model availability, and average price ($2.09 vs. $10.64 per 1M tokens). With 24 free models, open source offers the most accessible entry point for experimentation and prototyping.
Proprietary models lead on average score, median score, context window size, top-model performance, and capability coverage. The top proprietary model, GPT-5.4 Pro, achieves a score of 91, setting the current performance ceiling.
Across 293 tracked models (178 open-source, 115 proprietary), the landscape continues to evolve rapidly. Open-source models excel for self-hosting, fine-tuning, and cost control, while proprietary models often lead in raw performance and managed API convenience.