Agentic coding

SWE-Bench Pro

Can the AI fix real bugs in real software? It's handed actual problems from open-source projects and has to write code that genuinely solves them. Higher is better.

Rankings

Anthropic · Jun 9 2026

Claude Opus 4.8

Anthropic · May 28 2026

Claude Opus 4.7

Anthropic · Apr 16 2026

OpenAI · Apr 23 2026

Gemini 3.5 Flash

Google · May 19 2026

Google · Feb 19 2026

Gemini 3.0 Flash

Google · Dec 17 2025
