AISI Evaluation Shows Claude Mythos Preview's Cyber Capabilities in CTF and Multi-Step Attacks

The AI Security Institute (AISI) conducted cyber evaluations of Anthropic's Claude Mythos Preview, assessing its performance on capture-the-flag challenges and multi-step attack simulations. The model showed significant improvement over previous frontier models in cybersecurity capabilities.
Capture-the-Flag Results
In CTF challenges, where models must identify and exploit weaknesses to retrieve hidden flags, Mythos Preview achieved a 73% success rate on expert-level tasks. No model could complete any of these expert-level tasks before April 2025. The evaluation compared performance across difficulty levels from technical non-expert to expert, with models tested using token budgets of up to 50M tokens.
Cyber Range Results
AISI built "The Last Ones" (TLO), a 32-step corporate network attack simulation spanning initial reconnaissance through full network takeover, estimated to take humans 20 hours to complete. Claude Mythos Preview was the first model to solve TLO from start to finish, succeeding in 3 out of 10 attempts. Across all attempts, the model completed an average of 22 out of 32 steps.
Claude Opus 4.6 was the next-best-performing model, completing an average of 16 steps. The evaluation used token budgets of up to 100M tokens, with performance continuing to scale up to this limit.
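To put the step counts on a common scale, the reported averages can be converted into completion rates. The snippet below is a minimal sketch using only the figures quoted above; the function and variable names are illustrative and are not taken from AISI's evaluation code.

```python
TOTAL_STEPS = 32  # number of steps in the TLO range, as reported above


def completion_rate(avg_steps: float, total_steps: int = TOTAL_STEPS) -> float:
    """Return the average fraction of the range completed."""
    return avg_steps / total_steps


# Average steps completed per model, as quoted in the article.
reported_avg_steps = {
    "Claude Mythos Preview": 22,
    "Claude Opus 4.6": 16,
}

rates = {
    model: completion_rate(steps)
    for model, steps in reported_avg_steps.items()
}

# Mythos Preview solved the full range in 3 of 10 attempts.
full_solve_rate = 3 / 10

for model, rate in rates.items():
    print(f"{model}: {rate:.1%} of steps completed on average")
print(f"Mythos Preview end-to-end solves: {full_solve_rate:.0%} of attempts")
```

On these figures, Mythos Preview averaged roughly two thirds of the range per attempt, while Opus 4.6 averaged half.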
Limitations and Context
The model could not complete 'Cooling Tower', a cyber range focused on operational technology (OT), though it got stuck on the IT sections rather than the OT-specific parts. AISI notes that two years ago the best available models could barely complete beginner-level cyber tasks. Now, in controlled evaluations where Mythos Preview was explicitly directed and given network access, it could execute multi-stage attacks on vulnerable networks and discover and exploit vulnerabilities autonomously.
📖 Read the full source: HN AI Agents