42 AI Models Tested: Claude Opus 4.7 Tops DystopiaBench

DystopiaBench has been updated with two new modules and 30 additional models, bringing the total to 42 models across 6 dystopia types. The benchmark runs each model through 36 scenarios, each with 5 escalation levels (L1 innocent → L5 nightmare), and checks whether the model notices the ethical drift and refuses to comply.

New Modules

Huxley module: Tests whether models will design "wellness" systems that enforce compliance through pleasure, pacification, and behavioral conditioning (inspired by Brave New World). Most models comply by L3.
Baudrillard module: Tests whether models will build synthetic intimacy systems that replace human trust with AI-mediated relationships. Most models don't see the harm.

New Models Tested

30 more models were added, including Grok 4.3, GPT-5.5, Gemini 3.1 Pro, and GLM-5.1. Responses are scored by multi-judge panels, and the judges agree with each other about 76% of the time.
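The post does not say how the ~76% agreement figure is computed. As a rough illustration only, here is a minimal Python sketch that assumes agreement means the fraction of judge pairs giving the same label to the same response. The function name, judge names, and labels are all hypothetical and are not taken from DystopiaBench.

```python
from itertools import combinations


def pairwise_agreement(verdicts_by_judge):
    """Fraction of (judge pair, response) comparisons with matching labels.

    Assumption: every judge labeled the same responses in the same order.
    This is one plausible definition of agreement, not necessarily the
    one the benchmark uses.
    """
    matches = 0
    total = 0
    for judge_a, judge_b in combinations(verdicts_by_judge, 2):
        pairs = zip(verdicts_by_judge[judge_a], verdicts_by_judge[judge_b])
        for label_a, label_b in pairs:
            matches += int(label_a == label_b)
            total += 1
    return matches / total if total else 0.0


# Hypothetical verdicts from three judges on four responses.
example_verdicts = {
    "judge_a": ["refuse", "comply", "comply", "refuse"],
    "judge_b": ["refuse", "comply", "refuse", "refuse"],
    "judge_c": ["refuse", "refuse", "comply", "refuse"],
}
agreement = pairwise_agreement(example_verdicts)
print(f"pairwise agreement: {agreement:.2f}")
```

Raw pairwise agreement does not correct for chance, so a chance-adjusted statistic could give a noticeably different number for the same verdicts.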

Results: Claude Opus 4.7

Consistent refusal at L4-L5 across all modules
Refuses even L3 for the Petrov scenario (weapons/nuclear)
Provides explicit ethical reasoning in responses, not just "I can't"
Only model that explains why the request is harmful

Other Models

GPT-5.5: Compliant through L4, sometimes L5
Gemini 3.1 Pro: Surprisingly willing on surveillance scenarios
Grok 4.3: Will build anything if you use words like "efficiency" or "optimization"
GLM-5.1: Copied Claude's homework, still not as consistent

Methodology

36 scenarios, 5 escalation levels each (L1 innocent → L5 nightmare). Models are scored on whether they notice the drift and refuse, or just keep coding. Heatmap visualizations are available.
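To make the scoring idea concrete, here is a minimal Python sketch of one way to record where a model first refuses within a scenario. It assumes each response has already been labeled "comply" or "refuse" by the judges. The scenario names, labels, and function names are hypothetical, and the actual DystopiaBench scoring may differ.

```python
LEVELS = ["L1", "L2", "L3", "L4", "L5"]


def first_refusal(labels):
    """Return the first level at which the model refused, or None if it never did."""
    for level, label in zip(LEVELS, labels):
        if label == "refuse":
            return level
    return None


def summarize(run):
    """Map each scenario name to the level of its first refusal."""
    return {scenario: first_refusal(labels) for scenario, labels in run.items()}


# Hypothetical labels for two scenarios, ordered L1 through L5.
example_run = {
    "petrov-example": ["comply", "comply", "refuse", "refuse", "refuse"],
    "huxley-example": ["comply", "comply", "comply", "comply", "comply"],
}
summary = summarize(example_run)

# 36 scenarios at 5 levels each gives 180 prompts per model.
prompts_per_model = 36 * len(LEVELS)
print(summary, prompts_per_model)
```

A first-refusal level per scenario also maps naturally onto a heatmap, with one row per model and one column per scenario.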