AI models ranked by data analysis capability using MMLU, MATH-500, and GPQA benchmark scores. Find the top LLM for data science, analytics, and insights.
Top-ranked model: o1, with a score of 93.8. We track 54 models overall; 27 of them have the benchmark data needed for this ranking.
| # | Model | Provider | Score |
|---|---|---|---|
| 1 | o1 | OpenAI | 93.8 |
| 2 | R1 | DeepSeek | 93.6 |
| 3 | Gemini 2.5 Pro | Google | 92.7 |
| 4 | o3 Mini | OpenAI | 91.6 |
| 5 | Claude Opus 4.5 | Anthropic | 90.0 |
| 6 | DeepSeek V3 | DeepSeek | 89.2 |
| 7 | Claude 3.7 Sonnet | Anthropic | 86.8 |
| 8 | Claude 3.5 Sonnet | Anthropic | 84.2 |
| 9 | GPT-4o | OpenAI | 83.5 |
| 10 | Llama 3.3 70B Instruct | Meta | 82.3 |
| 11 | Gemini 2.0 Flash | Google | 82.1 |
| 12 | Mistral Large | Mistral AI | 80.6 |
| 13 | GPT-4 Turbo | OpenAI | 80.5 |
| 14 | Llama 3.1 70B Instruct | Meta | 78.3 |
| 15 | GPT-4o-mini | OpenAI | 76.9 |
| 16 | Claude 3.5 Haiku | Anthropic | 75.9 |
| 17 | Phi 4 | Microsoft | 20.8 |
| 18 | Qwen2.5 72B Instruct | Alibaba | 16.7 |
| 19 | Qwen2.5 Coder 32B Instruct | Alibaba | 13.2 |
| 20 | Gemma 2 9B | Google | 9.7 |
| 21 | Command R7B (12-2024) | Cohere | 7.8 |
| 22 | Llama 3.1 8B Instruct | Meta | 7.4 |
| 23 | Llama 3.2 3B Instruct | Meta | 6.2 |
| 24 | Qwen2.5 Coder 7B Instruct | Alibaba | 5.8 |
| 25 | Qwen2.5 7B Instruct | Alibaba | 5.5 |
| 26 | Llama 3 8B Instruct | Meta | 2.1 |
| 27 | QwQ 32B | Alibaba | 1.3 |
Each model's score is a weighted average of its available benchmark results. When a model is missing some benchmarks, the weights are re-normalized across the benchmarks that are available. All scores are on a 0-100 scale. Data sourced from official model cards, published papers, and third-party evaluation platforms.
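The scoring method described above can be sketched in a few lines of Python. Note that the benchmark weights below are hypothetical placeholders, since the actual weights are not published here; the sketch only illustrates the re-normalization step for models with missing benchmarks.

```python
# Sketch of the scoring method: a weighted average over available
# benchmarks, with weights re-normalized when a model is missing some.
# The weights below are hypothetical -- the actual weights are not
# published in this ranking.

def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average over the benchmarks present in `scores`,
    re-normalizing weights to sum to 1 across available benchmarks."""
    available = {b: w for b, w in weights.items() if b in scores}
    total = sum(available.values())
    if total == 0:
        raise ValueError("no overlapping benchmarks")
    return sum(scores[b] * (w / total) for b, w in available.items())

# Hypothetical equal weighting across the three benchmarks.
WEIGHTS = {"MMLU": 1 / 3, "MATH-500": 1 / 3, "GPQA": 1 / 3}

# Model with all three benchmarks reported (illustrative numbers):
full = weighted_score({"MMLU": 92.0, "MATH-500": 96.0, "GPQA": 78.0}, WEIGHTS)

# Model missing GPQA: weights re-normalize over MMLU and MATH-500,
# so the result is the plain average of the two available scores.
partial = weighted_score({"MMLU": 92.0, "MATH-500": 96.0}, WEIGHTS)
```

With equal weights, a model missing GPQA is simply averaged over MMLU and MATH-500; with unequal weights, each surviving weight is scaled up proportionally.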
Based on our benchmark analysis, o1 by OpenAI is currently the #1 ranked model for data analysis, with a weighted score of 93.8/100.
Models are ranked using a weighted average of MMLU, MATH-500, and GPQA benchmark scores. All scores are normalized to a 0-100 scale.
We currently rank 27 models that have relevant benchmark data for data analysis tasks.