| | Anthropic Claude Opus 4.6 | Google Gemini 3.1 Pro |
|---|---|---|
| Overview | | |
| Company | Anthropic | Google |
| Release date | Feb 5 2026 | Feb 19 2026 |
| Model type | — | — |
| Open source | No | No |
| Specifications | | |
| Parameters | — | — |
| Context window | — | — |
| Benchmarks | | |
| Science reasoning (GPQA Diamond) | 91.3% | 94.3% |
| Software engineering (SWE-Bench Verified) | 80.8% | 80.6% |
| Multimodal understanding (MMMU) | — | — |
| Timeline | | |
| Release gap | Claude Opus 4.6 shipped 14 days before Gemini 3.1 Pro | |
Claude Opus 4.6 and Gemini 3.1 Pro are evenly matched on the benchmarks both vendors publish. Claude Opus 4.6 shipped 14 days before Gemini 3.1 Pro, so head-to-head comparisons should account for the progress made in that intervening window.
Published specifications for these two models are limited — see each model page for the latest details.
On GPQA Diamond, Gemini 3.1 Pro scores 94.3%, 3 points above Claude Opus 4.6 at 91.3%. On SWE-Bench Verified, Claude Opus 4.6 scores 80.8%, 0.2 points above Gemini 3.1 Pro at 80.6%.
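The per-benchmark margins above can be computed directly from the published scores. The sketch below (score values taken from the table; variable names are illustrative) derives each leader and margin:

```python
# Benchmark scores from the comparison table, in percent.
scores = {
    "GPQA Diamond": {"Claude Opus 4.6": 91.3, "Gemini 3.1 Pro": 94.3},
    "SWE-Bench Verified": {"Claude Opus 4.6": 80.8, "Gemini 3.1 Pro": 80.6},
}

for benchmark, by_model in scores.items():
    # Positive delta means Gemini 3.1 Pro leads; negative means Claude Opus 4.6 leads.
    delta = by_model["Gemini 3.1 Pro"] - by_model["Claude Opus 4.6"]
    leader = "Gemini 3.1 Pro" if delta > 0 else "Claude Opus 4.6"
    print(f"{benchmark}: {leader} leads by {abs(delta):.1f} points")
```

Running this reproduces the margins stated above: a 3.0-point lead for Gemini 3.1 Pro on GPQA Diamond and a 0.2-point lead for Claude Opus 4.6 on SWE-Bench Verified.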