In brief
- Researchers report that Anthropic-style vulnerability findings can be reproduced with publicly available AI models.
- Study suggests vulnerability discovery is already cheap and widely accessible.
- Findings indicate AI cyber capabilities may be spreading faster than expected.
When Anthropic unveiled Claude Mythos earlier this month, it locked the model behind a vetted coalition of tech giants and framed it as something too dangerous for the public. Treasury Secretary Scott Bessent and Fed Chair Jerome Powell convened an emergency meeting with Wall Street CEOs. The word “vulnpocalypse” resurfaced in security circles.
And now a team of researchers has further complicated that narrative.
Vidoc Security took Anthropic’s own patched public examples and tried to reproduce them using GPT-5.4 and Claude Opus 4.6 inside an open-source coding agent called opencode. No Glasswing invite. No private API access. No Anthropic internal stack.
“We replicated Mythos findings in opencode using public models, not Anthropic’s private stack,” Dawid Moczadło, one of the researchers involved in the experiment, wrote on X after publishing the results. “A better way to read Anthropic’s Mythos release is not ‘one lab has a magical model.’ It is: the economics of vulnerability discovery are changing.”
Moczadło published the full thread on X on April 16, 2026.
The cases they targeted were the same ones Anthropic highlighted in its public materials: a server file-sharing protocol, the networking stack of a security-focused OS, the video-processing software embedded in almost every media platform, and two cryptographic libraries used to verify digital identities across the web.
Both GPT-5.4 and Claude Opus 4.6 reproduced two of the bug cases in all three runs each. Claude Opus 4.6 also independently rediscovered a bug in OpenBSD three times straight, while GPT-5.4 scored zero on that one. Some bugs (one in the FFmpeg video-processing library, another in wolfSSL's handling of digital signatures) came back partial: the models found the right code surface but didn't nail the precise root cause.
Every scan stayed below $30 per file, meaning the team reproduced Anthropic's findings for a trivial compute bill.
“AI models are already good enough to narrow the search space, surface real leads, and sometimes recover the full root cause in battle-tested code,” Moczadło said on X.
The workflow they used wasn’t a one-shot prompt. It mirrored what Anthropic itself described publicly: give the model a codebase, let it explore, parallelize attempts, filter for signal. The Vidoc team built the same architecture with open tooling. A planning agent split each file into chunks. A separate detection agent ran on each chunk, then inspected other files in the repo to confirm or rule out findings.
The line ranges inside each detection prompt—for example, "focus on lines 1158-1215"—weren't chosen by the researchers manually. They were outputs from the prior planning step. The blog post makes this explicit: "We want to be explicit about that because the chunking strategy shapes what each detection agent sees, and we do not want to present the workflow as more manually curated than it was."
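The plan-then-detect loop described above can be sketched roughly as follows. This is a minimal illustration, not Vidoc's code: the chunk size, the function names, and the stubbed `detect` step (which in the real workflow is an LLM call carrying a prompt like "focus on lines 1158-1215") are all assumptions made for the sake of the example.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    """A line range produced by the planning step for one detection agent."""
    path: str
    start_line: int
    end_line: int

def plan_chunks(path: str, num_lines: int, window: int = 400) -> list[Chunk]:
    """Planning agent (stub): split a file into fixed-size line ranges.
    The real planner is itself model-driven and chooses ranges by content."""
    return [
        Chunk(path, start, min(start + window - 1, num_lines))
        for start in range(1, num_lines + 1, window)
    ]

def detect(chunk: Chunk) -> list[str]:
    """Detection agent (stub): in the real workflow this is an LLM pass
    over the chunk, followed by inspection of other repo files to
    confirm or rule out each candidate finding."""
    return [f"{chunk.path}:{chunk.start_line}-{chunk.end_line}: candidate"]

def scan(path: str, num_lines: int) -> list[str]:
    """Run the full pipeline: plan chunks, then fan out detection agents."""
    findings: list[str] = []
    for chunk in plan_chunks(path, num_lines):
        findings.extend(detect(chunk))
    return findings
```

In the real system the detection agents run in parallel and their outputs are filtered for signal; the sequential loop here just shows how the planning output feeds the detection prompts.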
The study doesn’t claim public models match Mythos on everything. Anthropic’s model went further than just spotting the FreeBSD bug—it built a working attack blueprint, figuring out how an attacker could chain code fragments together across multiple network packets to seize full control of the machine remotely. Vidoc’s models found the flaw. They didn’t build the weapon. That’s where the real gap sits: not in finding the hole, but in knowing exactly how to walk through it.
But Moczadło’s argument isn’t really that public models are equally powerful. It’s that the expensive part of the workflow is now available to anyone with an API key: “The moat is moving from model access to validation: finding vulnerability signal is getting cheaper; turning it into trusted security work is still hard.”
Anthropic’s own safety report acknowledged that Cybench, the benchmark used to measure whether a model poses serious cyber risk, “is no longer sufficiently informative of current frontier model capabilities” because Mythos cleared it entirely. The lab estimated comparable capabilities would spread from other AI labs within six to 18 months.
The Vidoc study suggests the discovery side of that equation is already available outside any gated program. The team's full prompt excerpts, model outputs, and methodology appendix are published on Vidoc's official site.