SWE-Bench Multilingual
Like SWE-Bench, but the coding problems span many programming languages, not just one. Tests how broadly the AI can code. Higher is better.
79.8% (+6.1% vs Composer 2)
Agentic coding
CursorBench v3.1
Cursor's own test of harder, real-world coding tasks inside a code editor. Higher is better.
63.2% (+11% vs Composer 2)
Agentic terminal coding
Terminal-Bench 2.0
Can the AI work in a command-line terminal, running commands and finishing technical setup tasks the way a developer would? (Version 2.0 of the test.) Higher is better.
69.3% (+7.6% vs Composer 2)
About Composer 2.5
Composer 2.5 is an AI model released by Cursor on May 18, 2026, 59 days after Composer 2. The benchmark results shown above cover SWE-Bench Multilingual, CursorBench v3.1, and Terminal-Bench 2.0.