News Daily Nation Digital News & Media Platform

Can Anthropic Keep Its Exploit-Writing AI Out of the Wrong Hands?

May 20, 2026  Twila Rosenbaum

Anthropic's latest language model, Claude Mythos Preview, marks a major leap in AI capability, particularly in cybersecurity. According to the company, the model can identify and exploit zero-day vulnerabilities across virtually every major operating system and web browser, including subtle flaws that have remained hidden for decades. One example cited by Anthropic involved a 27-year-old vulnerability in OpenBSD that Mythos successfully exploited. While the company positions the tool as a way to strengthen defenses, the dual-use nature of such technology has ignited fierce debate over how to keep it out of the wrong hands.

What Mythos Preview Can Do

According to Anthropic's announcement on April 7, 2026, Mythos Preview was not explicitly designed for security work. Its exploit-writing skills emerged as a byproduct of improvements in code generation and reasoning. The model autonomously wrote a web browser exploit that chained together four separate vulnerabilities, using a complex JIT heap spray to escape both renderer and operating system sandboxes. It also obtained local privilege escalation on Linux and other platforms by exploiting race conditions and KASLR bypasses, and it developed a remote code execution exploit for FreeBSD's NFS server that gave full root access to unauthenticated users via a 20-gadget ROP chain spread over multiple packets.

These feats are not limited to theoretical exercises. Anthropic claims that Mythos Preview has already identified thousands of high-risk and critical vulnerabilities across a range of software, and the company says it is responsibly disclosing them to affected vendors. The model's ability to both find and exploit flaws means that defenders can use it to discover issues before attackers do, but the same capability could be turned around to launch devastating attacks.

Project Glasswing: A Defensive Push

Recognizing the inherent risks, Anthropic launched Project Glasswing in tandem with the model's release. This initiative brings together more than 40 organizations, including Apple, AWS, Microsoft, Palo Alto Networks, and CrowdStrike, to use Mythos Preview for scanning and securing first-party and open-source systems. Palo Alto Networks chief product and technology officer Lee Klarich described early results as compelling, though details remain limited. Anthropic is committing $100 million in Mythos Preview usage credits to Project Glasswing, along with $4 million in direct donations to open source security organizations.

The goal is to fundamentally reshape cybersecurity by making vulnerability discovery faster and more automated. However, skeptics point out that similar promises have been made before with other penetration testing tools, such as Cobalt Strike, which was originally designed for legitimate red teaming but is now widely used by ransomware groups and nation-state actors. The history of cybersecurity is littered with tools that escaped their intended boundaries.

Expert Perspectives on Risk and Control

Forrester senior analyst Erik Nost told reporters that Anthropic's announcement serves dual purposes. It generates positive publicity by demonstrating the model's extraordinary capabilities, and it highlights the massive gap in vulnerability management that the industry has faced for three decades. Nost emphasized that controls are in place to keep Mythos in the right hands, but noted that the race has now become a matter of patching vulnerabilities before other AI systems in adversarial hands discover them and write exploits at machine speed.

Julian Totzek-Hallhuber, senior principal solution architect at Veracode, expressed a more cautious view. He argued that no clear answer exists for how such tools can stay out of attacker hands, so defenders should assume the capability will proliferate. He advised investing in detection rather than just prevention, identifying behavioral signatures of AI-assisted exploitation, and adopting zero-trust architectures along with aggressive patching cycles and anomaly-based detection.

Melissa Ruzzi, director of AI at AppOmni, echoed this sentiment, stating that absolute security is impossible. The best that can be done is to make it more difficult for attackers to access these tools. Her perspective aligns with a long-standing truth in cybersecurity: any system designed for defense can be repurposed for offense.

The Challenge of Independent Verification

One major issue with Mythos Preview is the lack of independent testing. Anthropic controls both the model and the narrative; because the model is not publicly available, no outside researcher can confirm the company's claims. Totzek-Hallhuber noted that until independent researchers with access can run their own evaluations, healthy skepticism is the appropriate posture. He called this a consequence of the restricted access model: the claims cannot be tested, so they cannot be fully trusted or refuted.

Dark Reading attempted to obtain statistics regarding false positives and error rates from Anthropic, but the vendor did not respond. This lack of transparency raises concerns about the model's reliability in real-world scenarios. False positives could overwhelm security teams, while false negatives could leave critical vulnerabilities unpatched.
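
To make the stakes of the missing error rates concrete, the arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the precision figure, and the hours-per-finding figure are all hypothetical, since Anthropic has published no such statistics.

```python
from dataclasses import dataclass


@dataclass
class TriageEstimate:
    true_findings: int    # reports that are real vulnerabilities
    false_findings: int   # reports that turn out to be false positives
    wasted_hours: float   # analyst time spent dismissing false positives


def estimate_triage(reported: int, precision: float,
                    hours_per_finding: float) -> TriageEstimate:
    """Estimate triage load for a batch of reported findings.

    All inputs are hypothetical placeholders, not published figures.
    `precision` is the assumed fraction of reports that are real.
    """
    if reported < 0 or hours_per_finding < 0:
        raise ValueError("reported and hours_per_finding must be non-negative")
    if not 0.0 <= precision <= 1.0:
        raise ValueError("precision must be between 0 and 1")
    true_findings = round(reported * precision)
    false_findings = reported - true_findings
    return TriageEstimate(true_findings, false_findings,
                          false_findings * hours_per_finding)


if __name__ == "__main__":
    # Hypothetical scenario: 1,000 reports, half of them real, 2 hours each to triage.
    print(estimate_triage(reported=1000, precision=0.5, hours_per_finding=2.0))
```

Even in this made-up scenario, a modest drop in precision translates directly into analyst hours spent on dead ends, which is why the undisclosed false-positive rate matters to security teams.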

Historical Context and the Dual-Use Dilemma

The dilemma posed by Mythos Preview is not new. Throughout the history of computing, powerful tools have emerged with both benevolent and malicious potential. One of the first internet worms, the Morris worm of 1988, was reportedly intended as an experiment but caused widespread disruption. More recently, penetration testing frameworks like Metasploit and Cobalt Strike have become staples of both red teams and cybercriminals. The same story repeats with artificial intelligence: large language models can generate code, write malware, and compose phishing emails.

Anthropic positions itself as a safety-first company, but the release of Mythos Preview challenges that ethos. The company says it has implemented various controls, such as restricting access to trusted partners and monitoring usage. Yet the history of software suggests that determined adversaries will eventually find ways to obtain or replicate the technology. Open source alternatives, model theft, and insider threats all pose real risks.

Moreover, the speed at which AI can operate changes the dynamics of vulnerability exploitation. In the past, attackers needed significant manual effort to discover and weaponize zero-days. Now, a model like Mythos can do it in minutes. Defenders must adapt by automating their own responses, using AI-driven detection systems, and closing the window of exposure.

Some industry observers have called for regulation of advanced AI capabilities, similar to export controls on cryptographic software. Anthropic itself has advocated for responsible AI governance, but concrete policy action has been slow. The company's decision to release Mythos Preview before those policies are in place has drawn criticism from some quarters, even as others applaud the defensive potential.

As the cybersecurity community grapples with these questions, one thing is clear: the genie is out of the bottle. Whether through Mythos Preview, competing models from other AI firms, or open-source projects, the ability to write exploits with machine assistance is here to stay. Defenders must prepare for a world where vulnerability discovery is automated and exploitation is accelerated. The only way to stay ahead is to embrace similar technology for defense, while remaining vigilant about the risks.

Anthropic's Mythos Preview may prove to be a watershed moment in the ongoing contest between attackers and defenders, but its ultimate impact will depend on how well the community can balance innovation with control. The next few months will reveal whether Project Glasswing can deliver on its promise of reshaping cybersecurity for the better, or whether Mythos Preview will join the long list of tools that escaped their intended purpose.


Source: Dark Reading News

