May 3, 2026 · Updated 04:07 AM UTC
Cybersecurity

OpenAI's GPT-5.5 matches Anthropic's Mythos in cybersecurity tests

New research from the UK's AI Security Institute shows GPT-5.5 achieved a 71.4 percent success rate on expert-level cybersecurity tasks, comparable to Anthropic's specialized Mythos Preview model.

Ryan Torres

2 min read

A high-tech server room representing cybersecurity

OpenAI's GPT-5.5 has demonstrated cybersecurity capabilities nearly identical to Anthropic's specialized Mythos Preview model, according to new evaluations from the UK’s AI Security Institute (AISI).

The findings, reported by Ars Technica, suggest that the high-level hacking prowess previously attributed to Anthropic's restricted-release model may be a feature of general model improvements rather than a unique breakthrough.

Since 2023, the AISI has tested frontier AI models against a battery of 'Capture the Flag' (CTF) challenges. These tests evaluate specific skills including cryptography, web exploitation, and reverse engineering.

On the highest-level 'Expert' tasks, GPT-5.5 achieved an average success rate of 71.4 percent. This figure is slightly higher than the 68.6 percent recorded by Mythos Preview, though the difference falls within the margin of error.
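
To see why a 71.4 versus 68.6 percent gap can sit inside the margin of error, one can compute approximate 95 percent confidence intervals for each success rate. The task count used below is a hypothetical assumption for illustration; the report's exact number of Expert-level tasks is not given here.

```python
import math

def wald_interval(successes: int, trials: int, z: float = 1.96):
    """Approximate 95% confidence interval for a success rate
    (normal/Wald approximation)."""
    p = successes / trials
    half_width = z * math.sqrt(p * (1 - p) / trials)
    return p - half_width, p + half_width

# Hypothetical task count of 35 (assumed, not from the AISI report):
# 25/35 ~ 71.4%, 24/35 ~ 68.6%.
gpt_low, gpt_high = wald_interval(25, 35)
mythos_low, mythos_high = wald_interval(24, 35)

print(f"GPT-5.5: {gpt_low:.3f}-{gpt_high:.3f}")
print(f"Mythos:  {mythos_low:.3f}-{mythos_high:.3f}")
```

With sample sizes in the tens of tasks, the two intervals overlap almost entirely, which is what "within the margin of error" means in practice.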

In one high-difficulty challenge requiring the creation of a disassembler to decode a Rust binary, GPT-5.5 completed the task in 10 minutes and 22 seconds. The process required no human assistance and cost approximately $1.73 in API calls.
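
At its core, writing a disassembler means mapping raw opcode bytes back to human-readable mnemonics. The sketch below is a deliberately tiny illustration covering a handful of one-byte x86-64 opcodes; it is not the model's actual approach, and a real Rust binary would need a full instruction-set decoder.

```python
# Toy disassembler: decode a few one-byte x86-64 opcodes into
# mnemonics, emitting raw 'db' directives for unknown bytes.

ONE_BYTE_OPCODES = {
    0x90: "nop",
    0xC3: "ret",
    0xCC: "int3",
    0xF4: "hlt",
}

def disassemble(code: bytes) -> list[str]:
    """Return one 'offset: mnemonic' line per input byte."""
    lines = []
    for offset, byte in enumerate(code):
        mnemonic = ONE_BYTE_OPCODES.get(byte, f"db 0x{byte:02x}")
        lines.append(f"{offset:04x}: {mnemonic}")
    return lines

for line in disassemble(bytes([0x90, 0xC3, 0x42])):
    print(line)
```

Scaling this dictionary up to multi-byte instructions, operands, and addressing modes is the hard part of the AISI task.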

Simulated network attacks

GPT-5.5 also matched the performance of Mythos Preview in 'The Last Ones' (TLO), a test range designed to simulate a 32-step data extraction attack against a corporate network. GPT-5.5 succeeded in three out of ten attempts, while Mythos Preview succeeded in two of ten.

No previously tested AI model has managed to succeed in this specific test even once.

However, the model failed to breach the 'Cooling Tower' simulation, which tests an AI's ability to disrupt control software for a power plant. This failure matches the performance of all other AI models tested by the institute to date.

The results suggest that the cybersecurity capabilities seen in Anthropic's Mythos Preview might be a 'byproduct of more general improvements' in large language models rather than a specific breakthrough for a single model, according to the report.
