April 13, 2026
Insights

From EVMbench to Mythos: AI Is Finding Bugs Faster Than Humans Can Fix Them

Anthropic's Claude Mythos Preview discovered zero-day vulnerabilities in every major operating system and browser. For DeFi, the implications extend well beyond smart contracts.

Hypernative

In February, OpenAI's EVMbench showed AI exploiting known smart contract bugs 72% of the time. Two months later, Anthropic's Claude Mythos Preview is discovering vulnerabilities that human researchers missed for decades. The trajectory is clear, and it is accelerating.

Anthropic last week unveiled Mythos Preview, a frontier AI model that autonomously identifies and exploits zero-day vulnerabilities in every major operating system and every major web browser. The findings include a 27-year-old bug in OpenBSD, an operating system built specifically for security. Anthropic generated working proof-of-concept exploits in more than 90% of cases. These capabilities were not explicitly trained for; they emerged as a downstream consequence of improvements in code reasoning and agent autonomy.

The reaction from the digital asset industry was immediate. Haseeb Qureshi, managing partner at Dragonfly, called Mythos "COVID for software" and "actually apocalyptic in the wrong hands." Nic Carter, a founding partner at Castle Island Ventures, asked: "What happens when this thing comes out? Is it that every smart contract just gets exploited?"

The leap from EVMbench to Mythos is qualitative, not incremental. EVMbench measured AI performance against known vulnerability patterns in a sandboxed environment. Mythos discovers bugs that no human or tool had previously found, in production software that has been under continuous review for years. Anthropic has restricted access to roughly 40 vetted partners through a program called Project Glasswing, but the company itself acknowledges that models with similar capabilities will become broadly available within months.

For DeFi and digital asset organizations, the implication is straightforward: assume that deployed code contains undiscovered vulnerabilities, because AI will find them, and the cost of doing so is dropping fast.

What we are already seeing onchain

Security researchers at Hypernative have observed an uptick in DeFi attacks since large language models became widely available. The concern that keeps the team up at night is the prospect of AI surfacing decade-old bug classes in thoroughly vetted blue-chip protocols, a scenario that now has a direct analogue in Mythos finding the OpenBSD bug.

The picture on the ground is more nuanced than "AI supercharges hackers." Hypernative's team has recently tracked a rise in what might be called vibe-coded exploits: attacks in which a would-be hacker appears to have used an LLM to develop an attack contract, deployed the full setup onchain, and watched it fail on contact with reality. In one case, the LLM apparently convinced the attacker that he had a working exploit, and in the process the attacker accidentally uploaded the attack contract's source code to Etherscan.

These failures are temporary. The same improvement curve that brought AI from near-zero exploit capability to Mythos-level performance in months will close the gap between clumsy LLM-assisted attempts and sophisticated, working exploits. And even the failures matter: they confirm that AI is lowering the barrier to attempting attacks at scale, increasing the volume of exploit attempts that monitoring platforms must detect and triage.

Runtime defense in a Mythos world

Whether an exploit was designed by a human researcher over weeks or generated by an AI model in hours, Hypernative's detection is indifferent to origin. What matters is the onchain behavioral signature, and that signature is observable regardless of how the attack was constructed.

Mythos is an extraordinarily powerful code auditor. What it does not do is monitor live environments, detect attacks in progress, or respond to exploits executing onchain. Those require fundamentally different infrastructure.

Even the most thorough vulnerability discovery program cannot anticipate every way a deployed contract will behave once it becomes composable with the broader DeFi ecosystem. Governance parameters change, liquidity shifts, and new integrations create attack surfaces that did not exist at audit time.

Hypernative operates at the runtime layer. Regardless of the exploit's provenance, its onchain execution follows observable patterns that Hypernative's ML models detect:

  • Wallet funding through mixers or centralized exchanges
  • Deployment of attack contracts and abnormal interaction sequences
  • Flash loan acquisition, oracle manipulation, unusual function call patterns
  • Multi-step exploit chains spanning multiple protocols and chains
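To make the behavioral-signature idea concrete, the signals above can be sketched as a simple heuristic scorer. This is a minimal illustration only, not Hypernative's actual detection logic: every name, field, and threshold here (`TxTrace`, `KNOWN_MIXERS`, the point values) is invented for the example.

```python
from dataclasses import dataclass

# Invented example addresses; a real system would use curated intelligence feeds.
KNOWN_MIXERS = {"0xMixerA", "0xMixerB"}

@dataclass
class TxTrace:
    """Simplified, hypothetical view of one transaction's onchain behavior."""
    funder: str                 # address that funded the sender wallet
    deploys_contract: bool      # does the tx deploy new bytecode?
    flash_loan_volume: float    # value borrowed via flash loans, in ETH
    protocols_touched: int      # distinct protocols in the call graph
    unusual_call_patterns: int  # count of rare function-call sequences

def risk_score(tx: TxTrace) -> int:
    """Sum simple behavioral flags; higher means more exploit-like."""
    score = 0
    if tx.funder in KNOWN_MIXERS:
        score += 2                        # mixer-funded wallet
    if tx.deploys_contract:
        score += 1                        # fresh attack contract
    if tx.flash_loan_volume > 1_000:
        score += 2                        # large flash loan
    if tx.protocols_touched >= 3:
        score += 1                        # multi-protocol exploit chain
    score += min(tx.unusual_call_patterns, 3)  # cap the pattern signal
    return score

suspicious = TxTrace(funder="0xMixerA", deploys_contract=True,
                     flash_loan_volume=5_000, protocols_touched=4,
                     unusual_call_patterns=2)
print(risk_score(suspicious))  # → 8
```

In practice, hand-tuned point values like these give way to trained models over far richer feature sets, but the underlying point stands: the signal is the behavior, not the origin of the exploit.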

Detection alone is insufficient when attackers iterate at machine speed. Hypernative's automated response executes onchain within seconds and without human intervention: pausing contracts, pulling liquidity, and blocking malicious transactions before they are signed, via Hypernative Guardian.
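A detect-then-respond policy of this kind can be sketched as a threshold-based dispatcher. The thresholds, action names, and `can_pause` flag are all hypothetical, chosen for illustration rather than taken from Hypernative's production rules.

```python
from enum import Enum

class Action(Enum):
    MONITOR = "monitor"                 # keep watching, no intervention
    BLOCK_TX = "block_transaction"      # block pre-signing (Guardian-style)
    PAUSE = "pause_contract"            # emergency-pause the target protocol

def respond(risk_score: int, can_pause: bool) -> Action:
    """Map a detection score to an automated response (illustrative thresholds)."""
    if risk_score >= 6:
        # High-confidence exploit in progress: pause if the protocol supports it,
        # otherwise fall back to blocking the transaction.
        return Action.PAUSE if can_pause else Action.BLOCK_TX
    if risk_score >= 3:
        return Action.BLOCK_TX
    return Action.MONITOR
```

The design choice worth noting is the fallback: the strongest response (pausing) is only available when the protocol exposes that control, so the policy degrades gracefully to transaction blocking.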

What this means going forward

The security architecture for the next 12 to 24 months will have two complementary, autonomous layers: continuous AI-powered red teaming at the code level, and continuous runtime defense at the behavioral level. Organizations that treat one as a substitute for the other will find themselves exposed.

Mythos validates a thesis Hypernative has held since its founding: the runtime defense layer is a core operational requirement for any organization with meaningful onchain exposure. AI is making that case faster than anyone anticipated.

Reach out for a demo of Hypernative's solutions, and tune into Hypernative's blog and our social channels to keep up with the latest on cybersecurity in Web3.

Secure everything you build, run and own in Web3 with Hypernative.

Website | X (Twitter) | LinkedIn


Book a demo