Self-Reference Count (Self Reference)
Pass rate: 0%
Survived 7 out of 15 breakers
INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-trained from GLM-4.5-Air-Base using supervised fine-tuning (SFT) followed by large-scale reinforcement learning (RL). It offers state-of-the-art performance for its size across math, code, science, and general reasoning, consistently outperforming many larger frontier models. Designed for strong multi-step problem solving, it maintains high accuracy on structured tasks while remaining efficient at inference thanks to its MoE architecture.
131,072 tokens
$0.20 /1M tokens
$1.10 /1M tokens
131,072
| Test | Category | Success Rate |
|---|---|---|
| Self-Reference Count | Self Reference | 0% |
| Silence Protocol | Instruction Following | 0% |
| Car Wash Dilemma | Logic Reasoning | 0% |
| The Compartment Trick | Logic Reasoning | 0% |
| Contradictory Premises | Logic Reasoning | 11% |
| 10-Step Instructions | Instruction Following | 22% |
| The Missing A | Pattern Matching | 25% |
| Coin Flip Paradox | Logic Reasoning | 25% |
| Horse Race Logic | Logic Reasoning | 50% |
| Strawberry Problem | Character Counting | 100% |
| Reverse Word Test | Character Manipulation | 100% |
| Alice's Brother Problem | Logic Reasoning | 100% |
| Broken Mug | Lateral Thinking | 100% |
| Bullshit Detector | Epistemic Humility | 100% |
| Sycophancy Trap | Logic Reasoning | 100% |
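
The per-test percentages in the table above can be rolled up by category to see where the model is weakest. The following is a minimal sketch: the numbers are copied from the table, while the names `RESULTS` and `mean_by_category` are hypothetical helpers introduced here for illustration, and the unweighted mean is an assumption rather than a metric the page itself reports.

```python
from collections import defaultdict
from statistics import mean

# (test, category, success rate in percent), copied from the table above.
RESULTS = [
    ("Self-Reference Count", "Self Reference", 0),
    ("Silence Protocol", "Instruction Following", 0),
    ("Car Wash Dilemma", "Logic Reasoning", 0),
    ("The Compartment Trick", "Logic Reasoning", 0),
    ("Contradictory Premises", "Logic Reasoning", 11),
    ("10-Step Instructions", "Instruction Following", 22),
    ("The Missing A", "Pattern Matching", 25),
    ("Coin Flip Paradox", "Logic Reasoning", 25),
    ("Horse Race Logic", "Logic Reasoning", 50),
    ("Strawberry Problem", "Character Counting", 100),
    ("Reverse Word Test", "Character Manipulation", 100),
    ("Alice's Brother Problem", "Logic Reasoning", 100),
    ("Broken Mug", "Lateral Thinking", 100),
    ("Bullshit Detector", "Epistemic Humility", 100),
    ("Sycophancy Trap", "Logic Reasoning", 100),
]


def mean_by_category(results):
    """Return the unweighted mean success rate (percent) for each category."""
    grouped = defaultdict(list)
    for _test, category, rate in results:
        grouped[category].append(rate)
    return {category: mean(rates) for category, rates in grouped.items()}


if __name__ == "__main__":
    # Print categories from weakest to strongest.
    by_category = mean_by_category(RESULTS)
    for category, avg in sorted(by_category.items(), key=lambda kv: kv[1]):
        print(f"{category:<24} {avg:6.1f}%")
    overall = mean([rate for _test, _category, rate in RESULTS])
    print(f"{'Overall':<24} {overall:6.1f}%")
```

Averaging per category treats every test equally, so a category with a single test (such as Pattern Matching) carries as much visual weight as Logic Reasoning with seven; the per-test rows remain the more reliable view.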