Benchmarks won: 1 vs 1 (Claude Haiku 4.5 vs Kimi K2 (0905))
| | Anthropic Claude Haiku 4.5 | Moonshot AI Kimi K2 (0905) |
|---|---|---|
| Overview | | |
| Company | Anthropic | Moonshot AI |
| Release date | Oct 15 2025 | Sep 5 2025 |
| Model type | — | — |
| Open source | No | Yes |
| Specifications | | |
| Parameters | — | 1T |
| Context window | — | 256k |
| Benchmarks | | |
| Science reasoning (GPQA Diamond) | 73% | 75.1% |
| Software engineering (SWE-Bench Verified) | 73.3% | 65.8% |
| Multimodal understanding (MMMU) | — | — |
| Timeline | | |
| Release gap | Kimi K2 (0905) shipped 40 days before Claude Haiku 4.5 | |
Claude Haiku 4.5 and Kimi K2 (0905) split the two benchmarks they both publish, winning one each. Kimi K2 (0905) shipped 40 days before Claude Haiku 4.5, so benchmark comparisons should account for the progress made in that interval.
Kimi K2 (0905) is an open-source / open-weight model; Claude Haiku 4.5 is proprietary.
On GPQA Diamond, Kimi K2 (0905) scores 75.1%, 2.1 points above Claude Haiku 4.5 at 73%. On SWE-Bench Verified, Claude Haiku 4.5 scores 73.3%, 7.5 points above Kimi K2 (0905) at 65.8%.
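The release gap and point margins quoted above can be reproduced with a few lines of arithmetic. The following is a minimal sketch in Python. The dictionary layout and the function names (`release_gap_days`, `benchmark_margins`) are illustrative assumptions rather than part of any published tooling, and the figures are copied directly from the table.

```python
from datetime import date

# Figures copied from the comparison table; the layout is an assumption.
RELEASES = {
    "Claude Haiku 4.5": date(2025, 10, 15),
    "Kimi K2 (0905)": date(2025, 9, 5),
}
SCORES = {
    "GPQA Diamond": {"Claude Haiku 4.5": 73.0, "Kimi K2 (0905)": 75.1},
    "SWE-Bench Verified": {"Claude Haiku 4.5": 73.3, "Kimi K2 (0905)": 65.8},
}


def release_gap_days(releases):
    """Days between the earlier and the later release date."""
    first, second = sorted(releases.values())
    return (second - first).days


def benchmark_margins(scores):
    """Map each benchmark to (leading model, margin in percentage points)."""
    margins = {}
    for bench, by_model in scores.items():
        # Rank the two models by score, highest first.
        ranked = sorted(by_model.items(), key=lambda kv: kv[1], reverse=True)
        (leader, top), (_, runner_up) = ranked
        # Round to one decimal to avoid floating-point noise.
        margins[bench] = (leader, round(top - runner_up, 1))
    return margins


if __name__ == "__main__":
    print(f"Release gap: {release_gap_days(RELEASES)} days")
    for bench, (leader, margin) in benchmark_margins(SCORES).items():
        print(f"{bench}: {leader} leads by {margin} points")
```

Only benchmarks with a score for both models are included, which is why MMMU is left out.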