Z.ai
GLM-5
Open Weight
GLM-5 is an AI model released by Z.ai on Feb 12, 2026, 52 days after GLM-4.7. It is an open-weight model: the trained weights are available to download and run. Benchmark results (shown below) cover SWE-Bench Verified, SWE-Bench Multilingual, Terminal-Bench 2.0, BrowseComp, Humanity's Last Exam, and GPQA Diamond.
Parameters
744B
Benchmarks
Coding
SWE-Bench Verified
Real coding tasks pulled from open-source projects: the AI has to find and fix actual bugs. A human-checked version of the original SWE-Bench. Higher is better.
77.8%
Multilingual coding
SWE-Bench Multilingual
Like SWE-Bench, but the coding problems span many programming languages, not just one. Tests how broadly the AI can code. Higher is better.
73.3%
Agentic terminal coding
Terminal-Bench 2.0
Can the AI work in a command-line terminal, running commands and finishing technical setup tasks the way a developer would? (Version 2.0 of the test.) Higher is better.
56.2%
Web browsing
BrowseComp
Can the AI browse the web and track down hard-to-find answers? Higher is better.
75.9%
Multidisciplinary reasoning
Humanity's Last Exam
Extremely hard expert questions across many subjects. “With tools” means the AI is allowed to search the web or run code while answering. Higher is better.
50.4%
with tools
Science
GPQA Diamond
Graduate-level science questions in biology, physics, and chemistry, hard enough that subject-matter PhDs score around 65%. Higher is better.
86%