Z.ai: GLM 5

Survived 10 out of 15 breakers

Resilience: 67%

GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows. Built for expert developers, it delivers production-grade performance on large-scale programming tasks, rivaling leading closed-source models. With advanced agentic planning, deep backend reasoning, and iterative self-correction, GLM-5 moves beyond code generation to full-system construction and autonomous execution.

Context: 202,752 tokens

Cost (Input): $0.80 / 1M tokens

Cost (Output): $2.56 / 1M tokens

Max completion tokens: –
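
The listed prices can be turned into a rough per-request estimate. The sketch below assumes simple linear per-token billing with no caching discounts or tiered rates, which the listing does not confirm; `estimate_cost_usd` is a hypothetical helper name, not part of any Z.ai API.

```python
from decimal import Decimal

# Prices copied from the listing above, in USD per 1M tokens.
INPUT_USD_PER_MILLION = Decimal("0.80")
OUTPUT_USD_PER_MILLION = Decimal("2.56")


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request in USD.

    Assumption: billing is strictly linear in token count
    (no caching discount, no tiering, no minimum charge).
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / Decimal(1_000_000)


if __name__ == "__main__":
    # Example: a 50k-token prompt with a 4k-token completion.
    print(estimate_cost_usd(50_000, 4_000))
```

Using `Decimal` rather than floats keeps the arithmetic exact for currency values.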

Toughest Breakers

Self-Reference Count (Self Reference): pass rate 0%

Bullshit Detector (Epistemic Humility): pass rate 0%

Silence Protocol (Instruction Following): pass rate 13%

Breaker Results

Test	Category	Success Rate
Self-Reference Count	Self Reference	0%
Bullshit Detector	Epistemic Humility	0%
Silence Protocol	Instruction Following	13%
Contradictory Premises	Logic Reasoning	13%
10-Step Instructions	Instruction Following	25%
The Missing A	Pattern Matching	50%
Car Wash Dilemma	Logic Reasoning	75%
Strawberry Problem	Character Counting	100%
Reverse Word Test	Character Manipulation	100%
Alice's Brother Problem	Logic Reasoning	100%
Broken Mug	Lateral Thinking	100%
Horse Race Logic	Logic Reasoning	100%
The Compartment Trick	Logic Reasoning	100%
Sycophancy Trap	Logic Reasoning	100%
Coin Flip Paradox	Logic Reasoning	100%
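
The headline figures (10 of 15 breakers survived, 67% resilience) can be reproduced from the table under one assumption: that a breaker counts as "survived" when its success rate is at least 50%. The page does not state this threshold; it is inferred here only because it yields the published numbers. The names `RESULTS`, `SURVIVAL_THRESHOLD`, and `resilience` are illustrative.

```python
# Success rates (percent) copied from the Breaker Results table.
RESULTS = {
    "Self-Reference Count": 0,
    "Bullshit Detector": 0,
    "Silence Protocol": 13,
    "Contradictory Premises": 13,
    "10-Step Instructions": 25,
    "The Missing A": 50,
    "Car Wash Dilemma": 75,
    "Strawberry Problem": 100,
    "Reverse Word Test": 100,
    "Alice's Brother Problem": 100,
    "Broken Mug": 100,
    "Horse Race Logic": 100,
    "The Compartment Trick": 100,
    "Sycophancy Trap": 100,
    "Coin Flip Paradox": 100,
}

# Assumption (not stated on the page): "survived" means success rate >= 50%.
SURVIVAL_THRESHOLD = 50


def resilience(results, threshold=SURVIVAL_THRESHOLD):
    """Return (survived, total, resilience rounded to a whole percent)."""
    survived = sum(1 for rate in results.values() if rate >= threshold)
    total = len(results)
    return survived, total, round(100 * survived / total)


if __name__ == "__main__":
    survived, total, percent = resilience(RESULTS)
    print(f"Survived {survived} out of {total} breakers ({percent}%)")
```

A plain average of the success rates would give a different figure (about 65%), so the threshold interpretation is the one consistent with the published 67%.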