AI Model Release Tracker - Timeline of Major AI Models from 2022-2026

Claude Opus 4.7 vs Gemini 3.5 Flash

Anthropic
Claude Opus 4.7
Google
Gemini 3.5 Flash
Overview
CompanyAnthropicGoogle
Release dateApr 16 2026May 19 2026
AccessProprietaryProprietary
Specifications
Context window
1M
Benchmarks
Nonsense detection
BullshitBench v2
83%Best
20%
Agentic coding
SWE-Bench Pro
64.3%Best
55.1%
Coding
SWE-Bench Verified
87.6%
Multilingual coding
SWE-Bench Multilingual
80.5%
Agentic coding
CursorBench v3.1
64.8%Best
49.8%
Next.js coding
Next.js Evals
75%
Agentic terminal coding
Terminal-Bench 2.1
66.1%
76.2%Best
Agentic terminal coding
Terminal-Bench 2.0
69.4%
Multi-step tool use
MCP Atlas
79.1%
83.6%Best
General tool use
Toolathlon
56.5%
Web browsing
BrowseComp
79.3%
Cybersecurity
CyberGym
73.1%
Multidisciplinary reasoning
Humanity's Last Exam · no tools
46.9%Best
40.2%
Multidisciplinary reasoning
Humanity's Last Exam · with tools
54.7%
Abstract reasoning
ARC-AGI-2
75.8%Best
72.1%
Advanced math
FrontierMath · Tier 1–3
43.8%
Advanced math
FrontierMath · Tier 4
22.9%
Science
GPQA Diamond
94.2%
Agentic computer use
OSWorld-Verified
78%
78.4%Best
Agentic financial analysis
Finance Agent v2
51.5%
57.9%Best
Knowledge work
GDPval-AA
1753Best
1656
Knowledge work
GDPval (win/tie rate)
80.3%
Chart reasoning
CharXiv Reasoning
82.1%
84.2%Best
Multimodal reasoning
MMMU-Pro
75.2%
83.6%Best
Spatial reasoning
Blueprint-Bench 2
24.5%
33.6%Best
Long context
MRCR v2 (8-needle) · 128k average
59.3%
77.3%Best
Long context
MRCR v2 (8-needle) · 1M pointwise
26.6%
Timeline
Release gapClaude Opus 4.7 shipped 33 days before Gemini 3.5 Flash

Which is better: Claude Opus 4.7 or Gemini 3.5 Flash?

Gemini 3.5 Flash leads Claude Opus 4.7 on 8 of the 14 benchmarks they both report. Claude Opus 4.7 shipped 33 days before Gemini 3.5 Flash, so benchmark comparisons should account for the intervening progress.

Published specifications for these two models are limited — see each model page for the latest details.

On BullshitBench v2, Claude Opus 4.7 leads at 83% vs Gemini 3.5 Flash at 20%. On SWE-Bench Pro, Claude Opus 4.7 leads at 64.3% vs Gemini 3.5 Flash at 55.1%. On CursorBench v3.1, Claude Opus 4.7 leads at 64.8% vs Gemini 3.5 Flash at 49.8%. On Terminal-Bench 2.1, Gemini 3.5 Flash leads at 76.2% vs Claude Opus 4.7 at 66.1%. On MCP Atlas, Gemini 3.5 Flash leads at 83.6% vs Claude Opus 4.7 at 79.1%. On Humanity's Last Exam · no tools, Claude Opus 4.7 leads at 46.9% vs Gemini 3.5 Flash at 40.2%. On ARC-AGI-2, Claude Opus 4.7 leads at 75.8% vs Gemini 3.5 Flash at 72.1%. On OSWorld-Verified, Gemini 3.5 Flash leads at 78.4% vs Claude Opus 4.7 at 78%. On Finance Agent v2, Gemini 3.5 Flash leads at 57.9% vs Claude Opus 4.7 at 51.5%. On GDPval-AA, Claude Opus 4.7 leads at 1753 vs Gemini 3.5 Flash at 1656. On CharXiv Reasoning, Gemini 3.5 Flash leads at 84.2% vs Claude Opus 4.7 at 82.1%. On MMMU-Pro, Gemini 3.5 Flash leads at 83.6% vs Claude Opus 4.7 at 75.2%. On Blueprint-Bench 2, Gemini 3.5 Flash leads at 33.6% vs Claude Opus 4.7 at 24.5%. On MRCR v2 (8-needle) · 128k average, Gemini 3.5 Flash leads at 77.3% vs Claude Opus 4.7 at 59.3%.

Frequently asked questions

Claude Opus 4.7 was released by Anthropic on Apr 16 2026.

Gemini 3.5 Flash was released by Google on May 19 2026.

Claude Opus 4.7 leads on SWE-Bench Pro — Claude Opus 4.7 64.3% vs Gemini 3.5 Flash 55.1%.

Claude Opus 4.7 leads on Humanity's Last Exam · no tools — Claude Opus 4.7 46.9% vs Gemini 3.5 Flash 40.2%.

Other comparisons