AI Model Release Tracker - Timeline of Major AI Models from 2022-2026

Claude Sonnet 4.6 vs Gemini 3.5 Flash

Overview
Model           Claude Sonnet 4.6   Gemini 3.5 Flash
Company         Anthropic           Google
Release date    Feb 17 2026         May 19 2026
Access          Proprietary         Proprietary
Benchmarks
Category                     Benchmark                           Claude Sonnet 4.6   Gemini 3.5 Flash
Nonsense detection           BullshitBench v2                    91% (best)          20%
Agentic coding               CursorBench v3.1                    49%                 49.8% (best)
Multi-step tool use          MCP Atlas                           69.5%               83.6% (best)
Multidisciplinary reasoning  Humanity's Last Exam · no tools     33.2%               40.2% (best)
Abstract reasoning           ARC-AGI-2                           58.3%               72.1% (best)
Agentic computer use         OSWorld-Verified                    72.5%               78.4% (best)
Agentic financial analysis   Finance Agent v2                    51%                 57.9% (best)
Knowledge work               GDPval-AA                           1676 (best)         1656
Chart reasoning              CharXiv Reasoning                   72.4%               84.2% (best)
Multimodal reasoning         MMMU-Pro                            74.5%               83.6% (best)
Spatial reasoning            Blueprint-Bench 2                   6.7%                33.6% (best)
Long context                 MRCR v2 (8-needle) · 128k average   84.9% (best)        77.3%

Scores reported for only one of the two models (not part of the head-to-head comparison):
Category                     Benchmark                           Score
Agentic coding               SWE-Bench Pro                       55.1%
Coding                       SWE-Bench Verified                  79.6%
Next.js coding               Next.js Evals                       58%
Agentic terminal coding      Terminal-Bench 2.1                  76.2%
General tool use             Toolathlon                          56.5%
Science                      GPQA Diamond                        89.9%
Long context                 MRCR v2 (8-needle) · 1M pointwise   26.6%
Timeline
Release gap: Claude Sonnet 4.6 shipped 91 days before Gemini 3.5 Flash.
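
The 91-day figure follows directly from the two release dates listed in the overview. A minimal sketch of that arithmetic, where the variable names are illustrative and not taken from the tracker:

```python
from datetime import date

# Release dates as listed on this page.
claude_release = date(2026, 2, 17)   # Claude Sonnet 4.6
gemini_release = date(2026, 5, 19)   # Gemini 3.5 Flash

# Subtracting two dates yields a timedelta; .days is the whole-day gap.
gap_days = (gemini_release - claude_release).days
print(gap_days)
```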

Which is better: Claude Sonnet 4.6 or Gemini 3.5 Flash?

Gemini 3.5 Flash leads Claude Sonnet 4.6 on 9 of the 12 benchmarks they both report. Claude Sonnet 4.6 shipped 91 days before Gemini 3.5 Flash, so benchmark comparisons should account for the intervening progress.

Published specifications for these two models are limited — see each model page for the latest details.

Claude Sonnet 4.6 leads on three of the shared benchmarks: BullshitBench v2 (91% vs 20%), GDPval-AA (1676 vs 1656), and MRCR v2 (8-needle) · 128k average (84.9% vs 77.3%).

Gemini 3.5 Flash leads on the other nine: CursorBench v3.1 (49.8% vs 49%), MCP Atlas (83.6% vs 69.5%), Humanity's Last Exam · no tools (40.2% vs 33.2%), ARC-AGI-2 (72.1% vs 58.3%), OSWorld-Verified (78.4% vs 72.5%), Finance Agent v2 (57.9% vs 51%), CharXiv Reasoning (84.2% vs 72.4%), MMMU-Pro (83.6% vs 74.5%), and Blueprint-Bench 2 (33.6% vs 6.7%).
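
The 9-of-12 tally can be rechecked from the scores quoted on this page. The sketch below assumes that a higher score is better on every benchmark, which matches the "Best" labels in the comparison. The `tally_leads` helper and the tuple layout are illustrative and not part of the tracker.

```python
# (benchmark, Claude Sonnet 4.6 score, Gemini 3.5 Flash score), copied from this page.
scores = [
    ("BullshitBench v2", 91.0, 20.0),
    ("CursorBench v3.1", 49.0, 49.8),
    ("MCP Atlas", 69.5, 83.6),
    ("Humanity's Last Exam (no tools)", 33.2, 40.2),
    ("ARC-AGI-2", 58.3, 72.1),
    ("OSWorld-Verified", 72.5, 78.4),
    ("Finance Agent v2", 51.0, 57.9),
    ("GDPval-AA", 1676, 1656),
    ("CharXiv Reasoning", 72.4, 84.2),
    ("MMMU-Pro", 74.5, 83.6),
    ("Blueprint-Bench 2", 6.7, 33.6),
    ("MRCR v2 (8-needle), 128k average", 84.9, 77.3),
]


def tally_leads(rows):
    """Count the benchmarks each model leads, assuming higher is better."""
    claude = sum(1 for _, c, g in rows if c > g)
    gemini = sum(1 for _, c, g in rows if g > c)
    return claude, gemini


claude_leads, gemini_leads = tally_leads(scores)
print(f"Claude Sonnet 4.6 leads on {claude_leads}, Gemini 3.5 Flash on {gemini_leads}")
```

A lead count treats a 0.8-point margin (CursorBench v3.1) the same as a 71-point margin (BullshitBench v2), so it is best read alongside the individual scores.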

Frequently asked questions

Claude Sonnet 4.6 was released by Anthropic on Feb 17 2026.

Gemini 3.5 Flash was released by Google on May 19 2026.

Gemini 3.5 Flash leads on CursorBench v3.1 — Claude Sonnet 4.6 49% vs Gemini 3.5 Flash 49.8%.

Gemini 3.5 Flash leads on Humanity's Last Exam · no tools — Claude Sonnet 4.6 33.2% vs Gemini 3.5 Flash 40.2%.
