Z.ai
GLM-4.7
Open Weight
GLM-4.7 is an AI model released by Z.ai on Dec 22, 2025, 83 days after GLM-4.6. It is an open-weight model: the trained weights are available to download and run. Benchmark results (shown below) cover SWE-Bench Verified, Terminal-Bench 2.0, BrowseComp, Humanity's Last Exam, and GPQA Diamond.
Context window
128k tokens
Benchmarks
Coding
SWE-Bench Verified: Real coding tasks pulled from open-source projects, where the AI has to find and fix actual bugs. A human-checked version of the original SWE-Bench. Higher is better.
73.8%
Agentic terminal coding
Terminal-Bench 2.0: Can the AI work in a command-line terminal, running commands and finishing technical setup tasks the way a developer would? (Version 2.0 of the test.) Higher is better.
41%
Web browsing
BrowseComp: Can the AI browse the web and track down hard-to-find answers? Higher is better.
52%
Multidisciplinary reasoning
Humanity's Last Exam: Extremely hard expert questions across many subjects, written so you can't just look up the answer. “No tools” means the AI answers on its own. Higher is better.
24.8% (no tools)
42.8% (with tools)
Science
GPQA Diamond: Graduate-level science questions in biology, physics, and chemistry, hard enough that subject-matter PhDs score around 65%. Higher is better.
85.7%