Long context
MRCR v2 (8-needle), 1M pointwise
Tests whether the AI can find specific details buried inside an enormous document (around 1 million tokens, the length of many books). Higher is better.