Anthropic's overhyped Mythos falling before scrutiny

On 7 April 2026, Anthropic announced a new model called Mythos. The company claimed that the Mythos model finds and exploits software vulnerabilities at a level that would make a public release too risky [1], [2]. Three weeks later, independent researchers and journalists are dismantling the central narrative step by step.

This article reviews that scrutiny, compares the announcement with the existing academic literature on large language models in cybersecurity, and names the open question that the affair raises.

Too dangerous?

Anthropic described the Mythos model as a tool that autonomously finds zero-day vulnerabilities in every major operating system and browser [2]. The company set up a programme called Glasswing and invited approximately fifty organisations, including Apple, Google, Microsoft, AWS, Cisco, and JPMorgan Chase, to examine their own systems using the model. Anthropic provided up to one hundred million US dollars in usage credits and four million dollars in direct grants to open-source security organisations [2].

The central message was one of responsible restraint. The company decided not to make the Mythos model freely available because, by its own account, a public release would hand offensive capability to criminals.

The researcher Patrick Garrity of VulnCheck searched the CVE (Common Vulnerabilities and Exposures) database for vulnerabilities linked to Glasswing. He found at most forty possible records, and possibly none that can be attributed to the programme with certainty [3]. The company's claim of "thousands" of severely rated vulnerabilities rests on only one hundred and ninety-eight manually reviewed reports, approximately four to ten percent of the declared total [5], [6].
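
The percentage range can be turned back into the total it implies. The sketch below is only illustrative arithmetic: it assumes the 198 reviewed reports are a plain fraction of the declared total, and the function and variable names are made up for this example rather than taken from the cited sources.

```python
def implied_total(reviewed: int, share_percent: float) -> float:
    """Return the total that `reviewed` items represent at `share_percent`."""
    if share_percent <= 0:
        raise ValueError("share_percent must be positive")
    return reviewed * 100 / share_percent


REVIEWED_REPORTS = 198  # manually reviewed reports, as stated in the text

# Bounds of the "four to ten percent" range quoted in the text.
low_total = implied_total(REVIEWED_REPORTS, 10)
high_total = implied_total(REVIEWED_REPORTS, 4)

print(f"Implied declared total: {low_total:,.0f} to {high_total:,.0f}")
```

Under that assumption, 198 reviewed reports stand in for roughly two to five thousand claimed findings.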

Detailed analysis of the 244-page system card reveals further problems:

The flagship Firefox demonstration did not test a real browser. Anthropic used an isolated SpiderMonkey environment without the browser's sandbox protections or other defences [5], [6].
The fifty bugs in the test corpus were found by Opus 4.6, not by the Mythos model [5].
When the two most easily weaponised bugs were removed, the success rate fell from 72.4 to 4.4 percent [5].
The FreeBSD exploit required considerable human guidance and does not execute "from anywhere on the internet": the attacker needs valid Kerberos authentication [6].
The Linux kernel bug that the press attributed to the Mythos model was actually found by Opus 4.6, the already public model [6].
The system card reveals that the Mythos model failed against a correctly configured environment with modern patches and against an industrial control system [5].
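
The size of the drop in the Firefox success rate listed above is easier to judge as a relative figure. The sketch below is a rough calculation only; it assumes both percentages are measured in the same way, and the helper name is hypothetical.

```python
def success_drop(before_percent: float, after_percent: float) -> tuple[float, float]:
    """Return (drop in percentage points, drop relative to the starting rate in percent)."""
    if before_percent <= 0:
        raise ValueError("before_percent must be positive")
    absolute = before_percent - after_percent
    relative = absolute / before_percent * 100
    return round(absolute, 1), round(relative, 1)


# Success rates before and after removing the two bugs, as stated in the text.
absolute_drop, relative_drop = success_drop(72.4, 4.4)

print(f"Drop: {absolute_drop} percentage points ({relative_drop}% of the original rate)")
```

On these figures, about 94 percent of the reported success depended on two bugs.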

Mozilla reported that the Mythos model found 271 vulnerabilities in Firefox 150. Chief Technology Officer Bobby Holley described the result as a turning point for defenders. He nevertheless acknowledged that none of the findings was beyond the reach of an expert human researcher [4]. The model accelerates the work; it does not create a new class of threat. This hypothesis appears to be supported by recent academic studies.

The academic literature points to the same pattern: agents help, but the results are still uneven. ReAct-style systems perform better because they combine a model with external tools, which supports the view that Mythos may owe much of its capability to scaffolding rather than to the model alone [6], [7]. The same boundary appears in PentestGPT, where 2024-era models understood security tasks better than they could reliably produce working exploit code [8].

The important distinction is between finding a weakness and building a working exploit chain. GPT-4 exploited 87 percent of known vulnerabilities when given the bug description, but only 7 percent without it [9]. That matters because vulnerability detection is dual-use: the same capability can help attackers and defenders, which fits Mozilla’s account more closely than Anthropic’s alarmed release narrative [2], [4], [10].

The benchmarks also argue for caution. Eyeballvul was designed to reduce training-data contamination, a risk the Mythos system card itself acknowledges [6], [11]. SecVulEval reported only a 23.83 percent F1 score for Claude 3.7 Sonnet on vulnerable-line detection with correct reasoning [12]. Against that baseline, a claimed leap to “too dangerous to release” needs stronger evidence than a launch story.
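
For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall, so a low score means the model misses real flaws, raises false alarms, or both. The sketch below implements the standard formula. The counts are hypothetical, chosen only to land near the reported score; they are not taken from the SecVulEval paper.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall, computed from raw counts."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (NOT from the SecVulEval paper):
# 24 correct detections, 76 false alarms, 77 missed vulnerable lines.
score = f1_score(true_positives=24, false_positives=76, false_negatives=77)

print(f"F1 = {score:.2%}")
```

With counts like these, roughly three of every four flagged lines are false alarms and roughly three of every four vulnerable lines go undetected.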

The cybersecurity consultant Davi Ottenheimer raises a question that transcends the technical debate: who has the right to decide that a model capability is "too dangerous" for public release [5]? Through the Glasswing structure, Anthropic assigns privileged early access to the largest firms in the cybersecurity market. This private classification regime has no statutory basis, no parliamentary oversight, and no neutral arbitrator [5].

The financial structure weakens the "defensive investment" narrative. Devansh notes that five of Anthropic's eleven external launch partners are also investors in the company itself. JPMorgan functions simultaneously as a launch partner and as lead underwriter for the reported October 2026 IPO [6]. The one hundred million dollars is not cash. It is usage credits, calculated according to the official prices of the same model that the partners are supposed to validate [5].

The evidential basis for the "too dangerous to release" narrative is weak. The pre-existing academic literature had already shown that modern models find bugs, that autonomous agents improve exploitation, and that the gap between detection and weaponisation remains significant [7]–[12]. The Mythos announcement is a consistent development along that trajectory, not a step across a line.

The more important question is not whether the model is truly dangerous. The more important question is whether a private company has the right to unilaterally classify a capability, assign privileged access to selected collaborators, and build a parallel disclosure regime, all without democratic oversight or independent observation [5].

And there we have it... AI is simply taking the mantle from social media companies.

References

[1] J. Lyons, "Anthropic's super-scary bug hunting model Mythos is shaping up to be a nothingburger," The Register, Apr. 22, 2026. [Online]. Available: https://www.theregister.com/2026/04/22/anthropic_mythos_hype_nothingburger/

[2] T. Claburn, "Anthropic: All your zero-days are belong to Mythos," The Register, Apr. 7, 2026. [Online]. Available: https://www.theregister.com/2026/04/07/anthropic_all_your_zerodays_are_belong_to_us/

[3] J. Lyons, "Nobody knows how many CVEs Anthropic's Project Glasswing has actually found," The Register, Apr. 15, 2026. [Online]. Available: https://www.theregister.com/2026/04/15/project_glasswing_cves/

[4] S. Sharwood, "Mythos found 271 Firefox flaws – but none a human couldn't spot," The Register, Apr. 22, 2026. [Online]. Available: https://www.theregister.com/2026/04/22/mozilla_firefox_mythos_future_defenders/

[5] D. Ottenheimer, "The Boy That Cried Mythos: Verification is Collapsing Trust in Anthropic," flyingpenguin, Apr. 13, 2026. [Online]. Available: https://www.flyingpenguin.com/the-boy-that-cried-mythos-verification-is-collapsing-trust-in-anthropic/

[6] Devansh, "Anthropic's Claude Mythos Launch Is Built on Misinformation," Artificial Intelligence Made Simple, Apr. 17, 2026. [Online]. Available: https://www.artificialintelligencemadesimple.com/p/anthropics-claude-mythos-launch-is

[7] A. Yildiz, S. G. Teo, Y. Lou, Y. Feng, C. Wang, and D. M. Divakaran, "Benchmarking LLMs and LLM-based Agents in Practical Vulnerability Detection for Code Repositories," in Proc. 63rd Annu. Meeting Assoc. Comput. Linguistics (ACL), Vienna, Austria, 2025, pp. 30848–30865. [Online]. Available: https://aclanthology.org/2025.acl-long.1490/

[8] G. Deng, Y. Liu, V. Mayoral-Vilches, P. Liu, Y. Li, Y. Xu, T. Zhang, Y. Liu, M. Pinzger, and S. Rass, "PentestGPT: Evaluating and Harnessing Large Language Models for Automated Penetration Testing," in 33rd USENIX Security Symposium (USENIX Security 24), Philadelphia, PA, USA, 2024, pp. 847–864. [Online]. Available: https://www.usenix.org/system/files/usenixsecurity24-deng.pdf

[9] R. Fang, R. Bindu, A. Gupta, and D. Kang, "LLM Agents can Autonomously Exploit One-day Vulnerabilities," arXiv preprint, arXiv:2404.08144, 2024. [Online]. Available: https://arxiv.org/abs/2404.08144

[10] M. Phuong, M. Aitchison, E. Catt, S. Cogan, A. Kaskasoli, V. Krakovna, and others, "Evaluating Frontier Models for Dangerous Capabilities," arXiv preprint, arXiv:2403.13793, 2024. [Online]. Available: https://arxiv.org/abs/2403.13793

[11] T. Chauvin, "eyeballvul: a future-proof benchmark for vulnerability detection in the wild," arXiv preprint, arXiv:2407.08708, 2024. [Online]. Available: https://arxiv.org/abs/2407.08708

[12] M. M. Rahman, S. Sahin, and K. Damevski, "SecVulEval: Benchmarking LLMs for Real-World C/C++ Vulnerability Detection," arXiv preprint, arXiv:2505.19828, 2025. [Online]. Available: https://arxiv.org/abs/2505.19828