AI Model Release Tracker - Timeline of Major AI Models from 2022-2026


Claude Opus 4.8 vs Gemini 3.1 Pro

Benchmarks won: Claude Opus 4.8 7, Gemini 3.1 Pro 0

Anthropic: Claude Opus 4.8
Google: Gemini 3.1 Pro
Overview
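Company: Anthropic (Claude Opus 4.8), Google (Gemini 3.1 Pro)
Release date: May 28 2026 (Claude Opus 4.8), Feb 19 2026 (Gemini 3.1 Pro)
Model type
Open source: No (Claude Opus 4.8), No (Gemini 3.1 Pro)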
Specifications
Parameters
Context window
Benchmarks
SWE-Bench Pro (Agentic coding): Claude Opus 4.8 69.2% (best) vs Gemini 3.1 Pro 54.2%
SWE-Bench Verified (Coding): 80.6%
Terminal-Bench 2.1 (Agentic terminal coding): Claude Opus 4.8 74.6% (best) vs Gemini 3.1 Pro 70.3%
Terminal-Bench 2.0 (Agentic terminal coding): 68.5%
MCP Atlas (Multi-step tool use): 78.2%
Toolathlon (General tool use): 48.8%
BrowseComp (Web browsing): 85.9%
Humanity's Last Exam · no tools (Multidisciplinary reasoning): Claude Opus 4.8 49.8% (best) vs Gemini 3.1 Pro 44.4%
Humanity's Last Exam · with tools (Multidisciplinary reasoning): Claude Opus 4.8 57.9% (best) vs Gemini 3.1 Pro 51.4%
ARC-AGI-2 (Abstract reasoning): 77.1%
FrontierMath · Tier 1–3 (Advanced math): 36.9%
FrontierMath · Tier 4 (Advanced math): 16.7%
GPQA Diamond (Science): 94.3%
OSWorld-Verified (Agentic computer use): Claude Opus 4.8 83.4% (best) vs Gemini 3.1 Pro 76.2%
Finance Agent v2 (Agentic financial analysis): Claude Opus 4.8 53.9% (best) vs Gemini 3.1 Pro 43%
GDPval-AA (Knowledge work): Claude Opus 4.8 1890 (best) vs Gemini 3.1 Pro 1314
GDPval (win/tie rate) (Knowledge work): 67.3%
CharXiv Reasoning (Chart reasoning): 83.3%
MMMU-Pro (Multimodal reasoning): 80.5%
Blueprint-Bench 2 (Spatial reasoning): 26.5%
MRCR v2 (8-needle) · 128k average (Long context): 84.9%
MRCR v2 (8-needle) · 1M pointwise (Long context): 26.3%
Timeline
Release gap: Gemini 3.1 Pro shipped 98 days before Claude Opus 4.8
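
The 98-day gap can be checked from the two release dates in the overview. The following is a minimal sketch using Python's standard datetime module; the variable names are illustrative and not part of the tracker.

```python
from datetime import date

# Release dates as listed in the overview.
claude_opus_4_8_release = date(2026, 5, 28)
gemini_3_1_pro_release = date(2026, 2, 19)

# Subtracting two dates gives a timedelta; .days is the whole-day difference.
release_gap_days = (claude_opus_4_8_release - gemini_3_1_pro_release).days
print(release_gap_days)
```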

Which is better: Claude Opus 4.8 or Gemini 3.1 Pro?

Claude Opus 4.8 leads Gemini 3.1 Pro on all 7 benchmarks that both models report. Gemini 3.1 Pro shipped 98 days before Claude Opus 4.8, so benchmark comparisons should account for the intervening progress.

Published specifications for these two models are limited; see each model page for the latest details.

On SWE-Bench Pro, Claude Opus 4.8 leads at 69.2% vs Gemini 3.1 Pro at 54.2%. On Terminal-Bench 2.1, Claude Opus 4.8 leads at 74.6% vs Gemini 3.1 Pro at 70.3%. On Humanity's Last Exam · no tools, Claude Opus 4.8 leads at 49.8% vs Gemini 3.1 Pro at 44.4%. On Humanity's Last Exam · with tools, Claude Opus 4.8 leads at 57.9% vs Gemini 3.1 Pro at 51.4%. On OSWorld-Verified, Claude Opus 4.8 leads at 83.4% vs Gemini 3.1 Pro at 76.2%. On Finance Agent v2, Claude Opus 4.8 leads at 53.9% vs Gemini 3.1 Pro at 43%. On GDPval-AA, Claude Opus 4.8 leads at 1890 vs Gemini 3.1 Pro at 1314.
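
The head-to-head tally and the size of each lead can be recomputed from the seven shared scores quoted above. This is a rough sketch only: the dictionary and the helper names `tally_wins` and `margins` are hypothetical, and the margins are plain differences (percentage points for most rows, rating points for GDPval-AA).

```python
# Scores for the seven benchmarks both models report, copied from the text above.
# Each value is (Claude Opus 4.8, Gemini 3.1 Pro). GDPval-AA is a rating, not a percentage.
shared_scores = {
    "SWE-Bench Pro": (69.2, 54.2),
    "Terminal-Bench 2.1": (74.6, 70.3),
    "Humanity's Last Exam (no tools)": (49.8, 44.4),
    "Humanity's Last Exam (with tools)": (57.9, 51.4),
    "OSWorld-Verified": (83.4, 76.2),
    "Finance Agent v2": (53.9, 43.0),
    "GDPval-AA": (1890, 1314),
}


def tally_wins(scores):
    """Count how many benchmarks each model leads (ties count for neither)."""
    claude_wins = sum(1 for claude, gemini in scores.values() if claude > gemini)
    gemini_wins = sum(1 for claude, gemini in scores.values() if gemini > claude)
    return claude_wins, gemini_wins


def margins(scores):
    """Return Claude's lead on each benchmark, rounded to one decimal place."""
    return {name: round(claude - gemini, 1) for name, (claude, gemini) in scores.items()}


claude_wins, gemini_wins = tally_wins(shared_scores)
print(f"Claude Opus 4.8: {claude_wins}, Gemini 3.1 Pro: {gemini_wins}")
for name, lead in margins(shared_scores).items():
    print(f"{name}: +{lead}")
```

Because the benchmarks use different scales, the margins are not comparable across rows; they only show the size of the lead within each benchmark.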

Frequently asked questions

When was Claude Opus 4.8 released?
Claude Opus 4.8 was released by Anthropic on May 28 2026.
When was Gemini 3.1 Pro released?
Gemini 3.1 Pro was released by Google on Feb 19 2026.
Which is better at coding, Claude Opus 4.8 or Gemini 3.1 Pro?
Claude Opus 4.8 leads on SWE-Bench Pro — Claude Opus 4.8 69.2% vs Gemini 3.1 Pro 54.2%.
Which scores higher on Humanity's Last Exam, Claude Opus 4.8 or Gemini 3.1 Pro?
Claude Opus 4.8 leads on Humanity's Last Exam · no tools — Claude Opus 4.8 49.8% vs Gemini 3.1 Pro 44.4%.
