1 vs 1 benchmarks won
Anthropic Claude Opus 4.5 | Google Gemini 3.0 Flash | |
|---|---|---|
| Overview | ||
| Company | Anthropic | |
| Release date | Nov 24 2025 | Dec 17 2025 |
| Model type | — | — |
| Open source | No | No |
| Specifications | ||
Parameters | — | — |
Context window | — | — |
| Benchmarks | ||
Science reasoning GPQA Diamond | 87% | 90.4% |
Software engineering SWE-Bench Verified | 80.9% | 78% |
Multimodal understanding MMMU | — | — |
| Timeline | ||
| Release gap | Claude Opus 4.5 shipped 23 days before Gemini 3.0 Flash | |
Claude Opus 4.5 and Gemini 3.0 Flash are evenly matched across the benchmarks they both publish. Claude Opus 4.5 shipped 23 days before Gemini 3.0 Flash, so benchmark comparisons should account for the intervening progress.
Published specifications for these two models are limited — see each model page for the latest details.
On GPQA Diamond, Gemini 3.0 Flash scores 90.4%, 3.4 points above Claude Opus 4.5 at 87%. On SWE-Bench Verified, Claude Opus 4.5 scores 80.9%, 2.9 points above Gemini 3.0 Flash at 78%.