Prime Intellect: INTELLECT-3

Survived 7 out of 15 breakers

Resilience: 47%

INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active parameters) post-trained from GLM-4.5-Air-Base using supervised fine-tuning (SFT) followed by large-scale reinforcement learning (RL). It offers state-of-the-art performance for its size across math, code, science, and general reasoning, outperforming many larger frontier models. Designed for strong multi-step problem solving, it maintains high accuracy on structured tasks while remaining efficient at inference thanks to its MoE architecture.

Context: 131,072 tokens

Cost (Input): $0.20 / 1M tokens

Cost (Output): $1.10 / 1M tokens

Max completion tokens: 131,072
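
As a rough illustration of how the per-token prices listed above translate into a per-request cost, here is a minimal Python sketch. The function name `estimate_cost_usd` and the example token counts are hypothetical and not part of this page; the sketch also assumes simple linear pricing with no caching discounts or minimum charges.

```python
# Prices taken from the listing above (USD per 1M tokens).
INPUT_USD_PER_MILLION = 0.20
OUTPUT_USD_PER_MILLION = 1.10


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, assuming purely linear pricing.

    Hypothetical helper: real billing may differ (caching, rounding, fees).
    """
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Example token counts are made up for illustration.
    cost = estimate_cost_usd(input_tokens=100_000, output_tokens=20_000)
    print(f"${cost:.4f}")  # prints "$0.0420"
```

Because output tokens cost 5.5 times as much as input tokens here, long reasoning traces are likely to dominate the bill for this model.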

Toughest Breakers

Self-Reference Count	Self Reference	Pass rate: 0%
Silence Protocol	Instruction Following	Pass rate: 0%
Car Wash Dilemma	Logic Reasoning	Pass rate: 0%

Breaker Results

Test	Category	Success Rate
Self-Reference Count	Self Reference	0%
Silence Protocol	Instruction Following	0%
Car Wash Dilemma	Logic Reasoning	0%
The Compartment Trick	Logic Reasoning	0%
Contradictory Premises	Logic Reasoning	11%
10-Step Instructions	Instruction Following	22%
The Missing A	Pattern Matching	25%
Coin Flip Paradox	Logic Reasoning	25%
Horse Race Logic	Logic Reasoning	50%
Strawberry Problem	Character Counting	100%
Reverse Word Test	Character Manipulation	100%
Alice's Brother Problem	Logic Reasoning	100%
Broken Mug	Lateral Thinking	100%
Bullshit Detector	Epistemic Humility	100%
Sycophancy Trap	Logic Reasoning	100%
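
The headline figures (7 of 15 breakers survived, 47% resilience) can be reproduced from the table above under one assumption: a breaker counts as "survived" when its success rate is at least 50%. The page does not state this threshold, so both the threshold and the names `RESULTS`, `SURVIVAL_THRESHOLD`, and `resilience` in the sketch below are hypothetical.

```python
# Success rates (percent) copied from the Breaker Results table.
RESULTS = {
    "Self-Reference Count": 0,
    "Silence Protocol": 0,
    "Car Wash Dilemma": 0,
    "The Compartment Trick": 0,
    "Contradictory Premises": 11,
    "10-Step Instructions": 22,
    "The Missing A": 25,
    "Coin Flip Paradox": 25,
    "Horse Race Logic": 50,
    "Strawberry Problem": 100,
    "Reverse Word Test": 100,
    "Alice's Brother Problem": 100,
    "Broken Mug": 100,
    "Bullshit Detector": 100,
    "Sycophancy Trap": 100,
}

# ASSUMPTION: the site does not document its cutoff; 50% is a guess that
# happens to match the published "7 out of 15" figure.
SURVIVAL_THRESHOLD = 50


def resilience(results: dict[str, int], threshold: int = SURVIVAL_THRESHOLD):
    """Return (survived, total, resilience percent rounded to an integer)."""
    survived = sum(1 for rate in results.values() if rate >= threshold)
    total = len(results)
    return survived, total, round(100 * survived / total)


if __name__ == "__main__":
    survived, total, percent = resilience(RESULTS)
    print(f"Survived {survived} out of {total} breakers ({percent}%)")
```

With this threshold, Horse Race Logic (50%) is the borderline case: a strict "greater than 50%" rule would give 6 of 15 instead, which does not match the published figure.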