Google: Gemma 3 27B (free)

Survived 5 out of 15 breakers

Resilience
33%

Gemma 3 introduces multimodality, supporting vision-language input and text output. It handles context windows of up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. Gemma 3 27B is Google's latest open-source model and the successor to [Gemma 2](google/gemma-2-27b-it).

Context

131,072 tokens

Cost (Input)

$0.00 /1M tokens

Cost (Output)

$0.00 /1M tokens

Max completion tokens

8,192

Breaker Results

| Test | Category | Latest Result | Success Rate |
| --- | --- | --- | --- |
| Self-Reference Count | Self Reference | | 0% |
| Reverse Word Test | Character Manipulation | | 0% |
| Alice's Brother Problem | Logic Reasoning | | 0% |
| Contradictory Premises | Logic Reasoning | | 0% |
| Broken Mug | Lateral Thinking | | 0% |
| Car Wash Dilemma | Logic Reasoning | | 0% |
| The Missing A | Pattern Matching | | 0% |
| Horse Race Logic | Logic Reasoning | | 0% |
| The Compartment Trick | Logic Reasoning | | 0% |
| Coin Flip Paradox | Logic Reasoning | | 0% |
| 10-Step Instructions | Instruction Following | | 80% |
| Strawberry Problem | Character Counting | | 100% |
| Silence Protocol | Instruction Following | | 100% |
| Bullshit Detector | Epistemic Humility | | 100% |
| Sycophancy Trap | Logic Reasoning | | 100% |
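
The headline figures (5 out of 15 breakers survived, 33% resilience) can be reproduced from the per-test success rates. The sketch below is illustrative only: the page does not say how "survived" is defined, so the 50% cutoff (`SURVIVAL_THRESHOLD`) and the function name `resilience` are assumptions chosen because they match the published numbers.

```python
# Success rates (percent) copied from the Breaker Results listing.
success_rates = {
    "Self-Reference Count": 0,
    "Reverse Word Test": 0,
    "Alice's Brother Problem": 0,
    "Contradictory Premises": 0,
    "Broken Mug": 0,
    "Car Wash Dilemma": 0,
    "The Missing A": 0,
    "Horse Race Logic": 0,
    "The Compartment Trick": 0,
    "Coin Flip Paradox": 0,
    "10-Step Instructions": 80,
    "Strawberry Problem": 100,
    "Silence Protocol": 100,
    "Bullshit Detector": 100,
    "Sycophancy Trap": 100,
}

# Assumption (not stated on the page): a breaker counts as "survived"
# when its success rate is above 50%.
SURVIVAL_THRESHOLD = 50


def resilience(rates, threshold=SURVIVAL_THRESHOLD):
    """Return (survived, total, resilience percent rounded to an integer)."""
    survived = sum(1 for rate in rates.values() if rate > threshold)
    total = len(rates)
    return survived, total, round(100 * survived / total)


survived, total, pct = resilience(success_rates)
print(f"Survived {survived} out of {total} breakers ({pct}%)")
```

Any cutoff above 0% and below 80% would give the same result for this data, so the listing alone cannot pin down the exact rule.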