This page analyzes which AI capabilities are most and least common across 300 tracked models, and how each capability correlates with composite scores. It covers adoption rates, score premiums, and the most common capability combinations.
| Capability | Models | Adoption % | Score Premium |
|---|---|---|---|
| Streaming | 300 | 100.0% | +68.2 |
| JSON Mode | 231 | 77.0% | +13.8 |
| Function Calling | 223 | 74.3% | +16.1 |
| Reasoning | 145 | 48.3% | +18.4 |
| Vision | 133 | 44.3% | +13.2 |
| Web Search | 56 | 18.7% | +15.4 |
| Image Output | 0 | 0.0% | -68.2 |
| # | Combination | Models | Avg Score |
|---|---|---|---|
| 1 | JSON Mode + Streaming | 231 | 71.4 |
| 2 | Function Calling + Streaming | 223 | 72.4 |
| 3 | Function Calling + JSON Mode | 193 | 73.2 |
| 4 | Function Calling + JSON Mode + Streaming | 193 | 73.2 |
| 5 | Reasoning + Streaming | 145 | 77.7 |
| 6 | Streaming + Vision | 133 | 75.5 |
| 7 | Function Calling + Reasoning | 125 | 79.5 |
| 8 | Function Calling + Reasoning + Streaming | 125 | 79.5 |
| 9 | JSON Mode + Reasoning | 119 | 79.7 |
| 10 | JSON Mode + Reasoning + Streaming | 119 | 79.7 |
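A combination table like the one above can be produced by counting, for each set of capabilities, the models that support every capability in the set and averaging their composite scores. The following is a minimal sketch under assumptions: the record layout (`score`, `capabilities`) and the function name `combination_stats` are hypothetical, and the sample data is made up for illustration, not taken from the tracked models.

```python
from itertools import combinations
from statistics import mean


def combination_stats(models, sizes=(2, 3)):
    """Return (combination, model count, average score) rows, most common first.

    `models` is a list of dicts with a numeric "score" and a set of
    capability names under "capabilities" (a hypothetical layout).
    """
    # Sort capability names so labels come out alphabetical, e.g. "JSON Mode + Streaming".
    all_caps = sorted({cap for m in models for cap in m["capabilities"]})
    rows = []
    for size in sizes:
        for combo in combinations(all_caps, size):
            # A model counts only if it supports every capability in the combination.
            scores = [m["score"] for m in models if set(combo) <= m["capabilities"]]
            if scores:
                rows.append((" + ".join(combo), len(scores), round(mean(scores), 1)))
    # Stable sort: ties keep the order in which they were generated.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Illustrative data only.
models = [
    {"score": 80.0, "capabilities": {"Function Calling", "JSON Mode", "Streaming"}},
    {"score": 70.0, "capabilities": {"JSON Mode", "Streaming"}},
    {"score": 60.0, "capabilities": {"Streaming"}},
]
rows = combination_stats(models)
for label, count, avg in rows:
    print(f"{label}: {count} models, avg {avg}")
```

Because every tracked model supports streaming, adding Streaming to a combination never changes its count or average. That is why pairs such as Function Calling + JSON Mode and their "+ Streaming" triples report identical figures in the table.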
No tracked model currently has all seven capabilities (vision, function calling, streaming, JSON mode, reasoning, web search, and image output), because none of the 300 models supports image output. A “Full Stack” model would be one that supports every tracked feature.
| Provider | Vision | Function Calling | Streaming | JSON Mode | Reasoning | Web Search | Image Output |
|---|---|---|---|---|---|---|---|
| OpenAI (60) | 40/60 (67%) | 53/60 (88%) | 60/60 (100%) | 58/60 (97%) | 32/60 (53%) | 29/60 (48%) | 0/60 (0%) |
| Alibaba (51) | 19/51 (37%) | 45/51 (88%) | 51/51 (100%) | 47/51 (92%) | 24/51 (47%) | 0/51 (0%) | 0/51 (0%) |
| Mistral AI (25) | 12/25 (48%) | 23/25 (92%) | 25/25 (100%) | 23/25 (92%) | 1/25 (4%) | 0/25 (0%) | 0/25 (0%) |
| Google (23) | 19/23 (83%) | 13/23 (57%) | 23/23 (100%) | 21/23 (91%) | 11/23 (48%) | 0/23 (0%) | 0/23 (0%) |
| Meta (14) | 4/14 (29%) | 6/14 (43%) | 14/14 (100%) | 8/14 (57%) | 0/14 (0%) | 0/14 (0%) | 0/14 (0%) |
| Anthropic (13) | 13/13 (100%) | 13/13 (100%) | 13/13 (100%) | 6/13 (46%) | 10/13 (77%) | 11/13 (85%) | 0/13 (0%) |
| NVIDIA (11) | 2/11 (18%) | 9/11 (82%) | 11/11 (100%) | 9/11 (82%) | 10/11 (91%) | 0/11 (0%) | 0/11 (0%) |
| DeepSeek (11) | 0/11 (0%) | 8/11 (73%) | 11/11 (100%) | 10/11 (91%) | 10/11 (91%) | 0/11 (0%) | 0/11 (0%) |
| xAI (10) | 5/10 (50%) | 9/10 (90%) | 10/10 (100%) | 10/10 (100%) | 8/10 (80%) | 10/10 (100%) | 0/10 (0%) |
| MiniMax (8) | 1/8 (13%) | 6/8 (75%) | 8/8 (100%) | 5/8 (63%) | 6/8 (75%) | 0/8 (0%) | 0/8 (0%) |
| arcee-ai (7) | 1/7 (14%) | 4/7 (57%) | 7/7 (100%) | 3/7 (43%) | 2/7 (29%) | 1/7 (14%) | 0/7 (0%) |
| ByteDance (5) | 5/5 (100%) | 4/5 (80%) | 5/5 (100%) | 4/5 (80%) | 4/5 (80%) | 0/5 (0%) | 0/5 (0%) |
| Liquid AI (5) | 0/5 (0%) | 0/5 (0%) | 5/5 (100%) | 0/5 (0%) | 1/5 (20%) | 0/5 (0%) | 0/5 (0%) |
| Amazon (5) | 4/5 (80%) | 5/5 (100%) | 5/5 (100%) | 0/5 (0%) | 1/5 (20%) | 0/5 (0%) | 0/5 (0%) |
| Perplexity (5) | 4/5 (80%) | 0/5 (0%) | 5/5 (100%) | 1/5 (20%) | 3/5 (60%) | 5/5 (100%) | 0/5 (0%) |
| Baidu (5) | 2/5 (40%) | 2/5 (40%) | 5/5 (100%) | 1/5 (20%) | 3/5 (60%) | 0/5 (0%) | 0/5 (0%) |
| Moonshot AI (4) | 1/4 (25%) | 4/4 (100%) | 4/4 (100%) | 4/4 (100%) | 2/4 (50%) | 0/4 (0%) | 0/4 (0%) |
| Allen AI (4) | 0/4 (0%) | 1/4 (25%) | 4/4 (100%) | 3/4 (75%) | 2/4 (50%) | 0/4 (0%) | 0/4 (0%) |
| Cohere (4) | 0/4 (0%) | 2/4 (50%) | 4/4 (100%) | 4/4 (100%) | 0/4 (0%) | 0/4 (0%) | 0/4 (0%) |
| Xiaomi (3) | 1/3 (33%) | 3/3 (100%) | 3/3 (100%) | 3/3 (100%) | 3/3 (100%) | 0/3 (0%) | 0/3 (0%) |
| Inception (3) | 0/3 (0%) | 3/3 (100%) | 3/3 (100%) | 3/3 (100%) | 1/3 (33%) | 0/3 (0%) | 0/3 (0%) |
| aion-labs (3) | 0/3 (0%) | 0/3 (0%) | 3/3 (100%) | 0/3 (0%) | 3/3 (100%) | 0/3 (0%) | 0/3 (0%) |
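Each cell in the provider table pairs a count out of the provider's total models with a whole-number percentage. The sketch below shows one way such cells could be derived. The record layout and the names `pct` and `provider_matrix` are hypothetical, and the sample provider is invented. The percentages in the table are consistent with rounding halves up (for example, 1 of 8 is shown as 13%), so the sketch uses integer half-up rounding; that is an inference from the figures, not a documented rule.

```python
from collections import defaultdict


def pct(count, total):
    """Whole-number percentage, rounding halves up (1 of 8 -> 13)."""
    return (200 * count + total) // (2 * total)


def provider_matrix(models, capabilities):
    """Map provider -> {capability: "count/total (percent%)"}."""
    totals = defaultdict(int)
    counts = defaultdict(lambda: defaultdict(int))
    for m in models:
        totals[m["provider"]] += 1
        for cap in m["capabilities"]:
            counts[m["provider"]][cap] += 1
    matrix = {}
    for provider, total in totals.items():
        matrix[provider] = {
            cap: f"{counts[provider][cap]}/{total} ({pct(counts[provider][cap], total)}%)"
            for cap in capabilities
        }
    return matrix


# Illustrative data only: one made-up provider with eight models.
models = [{"provider": "ExampleCo", "capabilities": {"Streaming", "Vision"}}]
models += [{"provider": "ExampleCo", "capabilities": {"Streaming"}} for _ in range(7)]

matrix = provider_matrix(models, ["Vision", "Streaming", "Web Search"])
print(matrix["ExampleCo"])
```

Python's built-in `round` and the `.0%` format specifier round exact halves to the nearest even digit, which would turn 12.5% into 12%; the integer formula avoids that.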
We track seven key capabilities: Vision (image understanding), Function Calling (tool use), Streaming (real-time output), JSON Mode (structured output), Reasoning (chain-of-thought), Web Search (live information retrieval), and Image Output (image generation).
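Score premium measures how much higher (or lower) the average composite score is for models that have a specific capability compared to those that lack it. A positive premium means models with that capability tend to score higher overall. The premium is only meaningful when both groups are non-empty: every tracked model supports streaming and none supports image output, so the +68.2 and -68.2 figures for those two capabilities appear to reflect the overall average score set against an empty group rather than a real comparison.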
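As a worked illustration of that definition, the premium is the mean score of models with the capability minus the mean score of models without it. This is a minimal sketch, not the site's actual implementation: the record layout and the function name `score_premium` are hypothetical, the sample scores are invented, and returning `None` when either group is empty is a design choice made for this example.

```python
from statistics import mean


def score_premium(models, capability):
    """Mean score with the capability minus mean score without it.

    Returns None when either group is empty, since there is then
    nothing to compare against (an assumption of this sketch).
    """
    with_cap = [m["score"] for m in models if capability in m["capabilities"]]
    without_cap = [m["score"] for m in models if capability not in m["capabilities"]]
    if not with_cap or not without_cap:
        return None
    return round(mean(with_cap) - mean(without_cap), 1)


# Illustrative data only.
models = [
    {"name": "model-a", "score": 80.0, "capabilities": {"streaming", "reasoning"}},
    {"name": "model-b", "score": 70.0, "capabilities": {"streaming"}},
    {"name": "model-c", "score": 60.0, "capabilities": {"streaming"}},
]

print(score_premium(models, "reasoning"))  # 80.0 - 65.0 = 15.0
print(score_premium(models, "streaming"))  # None: no model lacks streaming
```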
A full stack model supports all seven tracked capabilities: vision, function calling, streaming, JSON mode, reasoning, web search, and image output. Such a model would be the most versatile available, but none of the 300 tracked models currently qualifies.
The most and least common capabilities are shown in the Overview section above. Adoption rates vary widely: streaming is universal (300 of 300 models), web search reaches only 18.7%, and image output is not supported by any tracked model.