Long context

MRCR v2 (8-needle), 128k average

Tests whether the model can retrieve specific details buried inside a very long document (around 128k tokens, roughly the length of a long book). Higher is better.

Rankings

OpenAI · Apr 23 2026

Claude Sonnet 4.6

Anthropic · Feb 17 2026

Google · Feb 19 2026

Gemini 3.5 Flash

Google · May 19 2026

Gemini 3.0 Flash

Google · Dec 17 2025

Claude Opus 4.7

Anthropic · Apr 16 2026
