| Rank | Overall | Model | Type | 💻 Coding | 🧠 Reasoning | 🤖 Agents & Tools | 💬 Conversation | 🔢 Math | 👁️ Multimodal | 🧠 Knowledge | Price | Speed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 79.0% | Claude Opus 4.6 Thinking | Proprietary | 78.4% * | 78.1% * | — | 79.9% * | — | — | 87.4% * | $15.00 | 67.8 t/s |
| 3 | 77.7% | GPT 5.4 High | Proprietary | 78.1% * | 81.6% * | 80.2% * | 64.7% * | 87.6% * | 64.6% * | 77.9% * | $8.75 | 75.3 t/s |
| 8 | 73.2% | Claude Opus 4.5 Thinking | Proprietary | 78.0% * | 66.7% | 76.6% | 72.5% | 81.9% | 58.1% | 77.0% | $15.00 | 35 t/s |
| 6 | 75.6% | GPT 5.2 Pro | Proprietary | 77.8% * | 74.8% * | 77.6% | 64.8% * | 87.9% * | 66.3% * | 77.1% * | $94.50 | 28 t/s |
| 2 | 77.8% | Gemini 3.1 Pro Preview | Proprietary | 77.5% * | 81.9% * | — | — | — | 59.8% * | 89.0% * | $7.00 | 130 t/s |
| — | — | GLM-5 | Open Source | 76.7% * | — | — | — | — | — | — * | $2.10 | 77.2 t/s |
| 12 | 69.7% | Claude Opus 4.5 | Proprietary | 76.6% * | 63.8% | 68.9% | 69.4% | 78.8% | 54.6% | 72.7% * | $15.00 | 65 t/s |
| 18 | 66.8% | Claude Sonnet 4.5 Thinking | Proprietary | 76.5% * | 57.5% | 64.8% | 68.7% | 76.5% | 51.7% * | 70.7% | $9.00 | 45 t/s |
| 9 | 73.0% | Gemini 3 Flash Thinking | Proprietary | 76.5% * | 67.2% | 77.0% * | 72.4% | 77.0% * | 61.9% * | 83.4% * | $1.75 | 180 t/s |
| 7 | 73.4% | GPT 5.2 High | Proprietary | 76.1% * | 71.0% | 76.0% | 62.5% | 87.2% | 64.1% | 75.0% | $7.88 | 45 t/s |
| 14 | 68.9% | Grok 4.1 Thinking | Proprietary | 75.6% * | 62.0% * | 58.0% * | 68.0% * | 79.8% * | 80.4% * | 77.1% * | $0.35 | 45 t/s |
| 5 | 75.7% | Claude Opus 4.6 | Proprietary | 75.2% * | 73.6% * | — * | 78.2% * | — * | — | 87.7% * | $15.00 | 67.8 t/s |
| 10 | 72.8% | Gemini 3 Pro | Proprietary | 75.2% * | 67.4% | 70.2% | 75.4% | 83.8% | 64.5% | 85.0% | $7.00 | 128 t/s |
| 16 | 68.2% | Gemini 3 Flash | Proprietary | 75.1% * | 54.0% * | 76.6% * | 71.5% | 67.6% * | 61.7% * | 83.3% | $1.75 | 218 t/s |
| 4 | 75.9% | Gemini 3.1 Pro Preview Base | Proprietary | 75.0% * | 80.2% * | — | — | — * | 58.6% * | 87.3% * | $7.00 | 130 t/s |
| 11 | 71.1% | GPT 5.2 | Proprietary | 75.0% * | 68.6% * | 68.0% * | 68.8% * | 82.2% * | 60.2% * | 75.5% * | $7.88 | 187 t/s |
| 22 | 64.5% | Claude Sonnet 4.5 | Proprietary | 74.1% * | 53.4% * | 63.4% | 64.3% | 73.3% | 57.6% | 70.3% | $9.00 | 77 t/s |
| 15 | 68.6% | GPT 5.1 High | Proprietary | 73.2% * | 60.8% | 67.6% * | 68.3% | 82.8% | 59.8% | 75.3% | $67.50 | 40 t/s |
| 13 | 69.4% | Kimi K2.5 Thinking | Open Source | 73.2% | 58.5% | — | — | 83.9% * | — | 80.3% * | $1.55 | 45 t/s |
| — | — | MiniMax M2.5 | Open Source | 72.6% * | 48.3% * | 76.7% * | — | — | — | — * | $0.75 | 39.3 t/s |
| 17 | 68.1% | Kimi K2.5 Instant | Open Source | 72.2% * | 57.5% * | — * | — * | 81.5% * | — * | 78.6% * | $1.55 | 85 t/s |
| 27 | 61.2% | Claude Opus 4.1 | Proprietary | 71.9% | 48.2% * | 63.4% | 64.0% | 63.1% | 54.6% * | 65.0% * | $45.00 | 52 t/s |
| 19 | 66.6% | Grok 4.1 | Proprietary | 71.7% * | 57.6% * | 60.0% * | 66.4% * | 78.8% * | 76.4% | 75.7% * | $0.35 | 95 t/s |
| — | — | GLM-4.7 | Open Source | 71.5% * | — | — | — | — | — | — | $1.07 | 92 t/s |
| 23 | 63.4% | GPT 5.1 | Proprietary | 71.5% * | 48.0% * | 64.5% * | 69.8% * | 78.2% * | 49.4% * | 75.5% * | $3.75 | 120 t/s |
| 20 | 66.4% | Kimi K2 Thinking | Open Source | 70.7% * | 50.7% * | 77.4% * | 61.9% * | 78.7% * | — | 72.8% * | $1.55 | 45 t/s |
| 25 | 61.6% | OpenAI o3 | Proprietary | 69.3% * | 51.7% * | 57.2% * | 58.4% * | 78.6% * | 56.1% | 75.7% * | $25.00 | 35 t/s |
| — | — | MiniMax M2.1 | Open Source | 68.6% * | 47.2% * | 75.1% * | — | — | — | — | $0.75 | 148 t/s |
| 21 | 64.6% | o4-mini | Proprietary | 68.0% * | 49.3% * | — | — | 83.2% * | 82.9% * | 59.1% * | $10.00 | 100 t/s |
| — | — | GLM-4.6 | Open Source | 66.8% * | — | — | — | — * | — | — | $1.39 | 104.6 t/s |
| 31 | 59.3% | Qwen3 Max Preview | Proprietary | 65.9% * | 36.1% | 66.1% * | 62.3% * | 75.6% * | 67.0% * | 73.8% * | $3.60 | 85 t/s |
| 26 | 61.3% | Gemini 2.5 Pro | Proprietary | 65.3% | 54.5% | 52.7% | 65.1% | 76.4% | 57.4% | 78.2% | $3.13 | 165 t/s |
| 28 | 61.2% | DeepSeek V3.2 Thinking | Open Source | 65.1% | 53.1% | 55.0% | 60.5% | 73.0% | 72.1% | 69.6% | $0.35 | 60 t/s |
| 36 | 57.5% | MiniMax M2 | Open Source | 64.1% * | 58.6% * | — | 42.2% * | — | — | 53.7% * | $0.75 | 100 t/s |
| 34 | 58.8% | DeepSeek V3.2 | Open Source | 63.5% | 48.0% | 55.1% | 54.9% | 73.1% | 72.5% | 68.2% | $0.35 | 120 t/s |
| 32 | 59.0% | Kimi K2 | Open Source | 63.4% * | 45.7% * | 66.7% * | 56.5% * | 75.3% * | — | 43.4% * | $1.55 | 85 t/s |
| 33 | 59.0% | Qwen3 235B | Open Source | 62.9% * | 54.5% * | 54.4% * | 59.5% * | 72.5% * | 46.9% * | 72.3% * | Free | 75 t/s |
| 30 | 59.6% | OpenAI o3-mini | Proprietary | 59.4% * | 53.5% | 58.6% * | — | 78.8% | — | 52.2% * | $2.75 | 115 t/s |
| 35 | 58.3% | DeepSeek R1 | Open Source | 57.9% | 48.4% | 54.5% * | 60.8% | 76.8% | 71.0% | 67.5% | $1.37 | 85 t/s |
| 37 | 55.2% | Longcat Flash Chat | Open Source | 57.8% * | 35.8% | 65.6% * | 57.8% * | 80.3% * | — | 39.8% * | $0.45 | 100 t/s |
| 41 | 47.5% | Mistral Large 3 | Open Source | 53.8% * | 24.2% | — | 57.1% * | 74.5% * | — | 61.9% * | $1.00 | 90 t/s |
| 24 | 62.9% | Claude Sonnet 4.6 Thinking | Proprietary | 53.6% * | 72.5% * | — | — | 59.3% | 58.0% * | 84.1% * | $9.00 | 45 t/s |
| 29 | 59.7% | Claude Sonnet 4.6 | Proprietary | 52.5% * | 65.4% * | — * | — | 58.1% * | 56.8% * | 83.2% * | $9.00 | 77 t/s |
| 40 | 50.6% | Qwen3 32B | Open Source | 52.0% * | 45.8% * | 47.3% * | 44.9% * | 67.6% * | 56.3% | 52.2% * | Free | 145 t/s |
| 38 | 52.2% | Gemini 2.5 Flash | Proprietary | 51.8% * | 43.5% * | 50.1% * | 60.0% * | 70.3% * | 45.7% * | 63.6% * | $0.38 | 372 t/s |
| 42 | 46.6% | Llama 4 Maverick | Open Source | 51.4% * | 41.4% * | 49.0% * | 41.8% * | 45.8% * | 45.4% | 60.0% * | Free | 155 t/s |
| 39 | 51.6% | GPT-4.5 | Proprietary | 45.8% * | 43.5% * | 49.4% * | 67.4% * | 65.7% * | 50.4% * | 72.7% * | $7.50 | 85 t/s |
| 44 | 39.7% | Llama 4 Scout | Open Source | 40.9% * | 36.2% * | 41.5% * | 38.7% * | 38.0% * | 40.7% | 55.0% * | Free | 2.6k t/s |
| 43 | 41.1% | GPT-4o | Proprietary | 39.1% * | 33.5% * | 47.8% * | 45.8% * | 44.4% * | 39.0% * | 56.0% * | $6.25 | 110 t/s |
| — | — | Grok 4.20 Thinking | Proprietary | 36.2% | — | — | — | — | — | — | $4.00 | 100 t/s |
| — | — | MiMo v2 Pro | Proprietary | — | — | — | — | — | — | — | $2.00 | 94.5 t/s |
| — | — | MiniMax M2.7 | Open Source | — | — | — | — | — | — | — | $0.75 | 53.6 t/s |
| — | — | Qwen 3.5 Plus | Proprietary | — | — | — | — | — | — | — | $1.36 | 85.3 t/s |
| — | — | Grok 4.20 | Proprietary | — | — | — | — | — | — | — | $4.00 | 100 t/s |