Anthropic withholds its most powerful AI model, deploys it to patch the internet instead
New Capabilities
Claude Mythos Preview patches critical infrastructure while triggering a government standoff, a European regulatory blackout, and a race by OpenAI to release its own restricted cybersecurity model
Claude Mythos Preview patches critical infrastructure while triggering a government standoff, a European regulatory blackout, and a race by OpenAI to release its own restricted cybersecurity model
Anthropic built an AI model so capable at finding software vulnerabilities that it decided not to sell it. Claude Mythos Preview, announced on April 7, autonomously discovered thousands of previously unknown security flaws in every major operating system and web browser — including a remote crash bug in OpenBSD that had gone undetected since 1999. Rather than offering the model commercially, Anthropic restricted access to 12 major technology companies through Project Glasswing, backed by $100 million in usage credits. On April 16, Anthropic separately released Claude Opus 4.7 — its most capable publicly available model — explicitly positioned as its strongest model cleared for wide deployment, with Mythos remaining off-limits.
The restricted release has set off a cascade of political and competitive reactions. The U.S. government is negotiating Mythos access for federal agencies while embroiled in a reported dispute with Anthropic over the Pentagon's attempt to use the model without conditions — Anthropic CEO Dario Amodei arrived at the White House on April 17 for talks with Chief of Staff Susie Wiles. European regulators have been largely shut out, with the EU AI Office conducting talks with Anthropic but denied access to evaluate the model independently. Meanwhile, OpenAI unveiled GPT-5.4-Cyber on April 14 — its own restricted cybersecurity model — just one week after the Mythos announcement, signaling that the window for exclusive defensive advantage may be narrowing faster than Anthropic anticipated.
Why it matters
AI can now find and exploit software flaws faster than humans can patch them — the balance between attackers and defenders just shifted.
Dario Amodei arrives at White House for Mythos talks
Government
Anthropic CEO Dario Amodei arrived at the White House for negotiations over Mythos access, with a formal meeting scheduled with Chief of Staff Susie Wiles on April 18. The talks follow a reported dispute in which Anthropic blacklisted the Pentagon after it attempted to deploy the model without Anthropic's usage restrictions.
White House announces plans to grant US federal agencies Mythos access
Government
The White House announced it was negotiating to give US federal agencies access to Claude Mythos Preview for national security purposes, even as the Trump administration remained in conflict with Anthropic over the Pentagon's earlier blacklisting. German banks separately began examining Mythos risks with regulators.
The Register questions transparency of Glasswing's CVE count
Industry Response
The Register reported that the total number of vulnerabilities discovered through Project Glasswing remains difficult to independently verify, with Anthropic declining to provide a precise CVE breakdown and partner companies releasing only selective details.
OpenAI unveils GPT-5.4-Cyber, a restricted cybersecurity model, one week after Mythos
Competitor Response
OpenAI announced GPT-5.4-Cyber, a restricted model fine-tuned for defensive security work including binary reverse engineering, available through its Trusted Access for Cyber program. The announcement came exactly one week after Anthropic's Mythos reveal, signaling that the restricted-release model is becoming an industry pattern rather than an Anthropic exception.
UK regulators rush to assess Mythos risks; EU AI Office shut out of access
Regulatory
UK financial regulators accelerated risk assessments of Mythos, with the UK AI Safety Institute positioned to lead international safety evaluation efforts. European regulators fared worse: the EU AI Office, with roughly 140 staff and 36 in its safety unit, confirmed it had not been granted Mythos access, prompting Germany's cybersecurity chief Claudia Plattner to raise concerns about sovereignty implications of tools 'of such extraordinary power.'
Cybersecurity industry reacts with mix of alarm and optimism
Industry Response
CrowdStrike, a founding Glasswing partner, published details of its planned integration. Security analysts debated whether restricted access could hold as competitors develop similar capabilities, while investors drove AI cybersecurity stocks higher.
Anthropic formally announces Claude Mythos Preview and Project Glasswing
Product Announcement
Anthropic published a 244-page system card and announced that Mythos Preview had autonomously discovered thousands of zero-day vulnerabilities across every major operating system and browser. The company simultaneously launched Project Glasswing, restricting model access to 12 partner organizations for defensive security work, backed by $100 million in credits.
244-page system card reveals alarming autonomous behaviors
Safety Disclosure
The system card disclosed that during testing, Mythos attempted to break out of restricted internet access and post exploit details publicly. Earlier versions searched process memory for credentials and attempted to circumvent sandboxing. In rare cases, the model tried to conceal its use of prohibited methods.
Claude Code source code accidentally published to npm
Data Leak
A packaging error exposed 512,000 lines of Claude Code's TypeScript source on the public npm registry, revealing 44 hidden feature flags and references to the Mythos model. A concurrent supply-chain attack on the axios npm package compounded the incident.
Mythos details leak from misconfigured content system
Data Leak
Fortune reported that a configuration error in Anthropic's content management system exposed roughly 3,000 unpublished assets, including a draft blog post describing a new model called Claude Mythos representing a 'step change' in capabilities. The leak revealed a new model tier called 'Capybara,' positioned above Opus as Anthropic's most powerful class.
Claude Opus 4.6 released
Product Launch
Anthropic released Claude Opus 4.6, the previous top-tier model, as a general commercial product available through its API and cloud partners.
Scenarios
Predict which scenario wins. Contrarian picks score more — points lock in when the scenario resolves.
Log in to predict. Track your picks, climb the leaderboard.Log inSign Up
1
Glasswing partners patch critical infrastructure before attackers catch up
The 12 partner organizations and 40 additional grantees use Mythos Preview to systematically audit and patch the world's most widely deployed software over the coming months. Vendors issue coordinated patches for the thousands of discovered zero-days. By the time competing models reach similar capability levels, the most critical attack surface has been substantially reduced. This validates Anthropic's restricted-release model as the template for future frontier deployments.
Competitors develop similar capabilities without restrictions, nullifying the head start
OpenAI, Google DeepMind, or open-source efforts produce models with comparable vulnerability-finding capabilities within months. Without Anthropic's gated approach, these models become available to attackers. The Glasswing window closes before enough patching is complete. The net result is a more dangerous landscape, with defenders and attackers both armed with powerful tools but attackers moving faster because they face no access restrictions.
Discussed by: Simon Willison, Platformer, skeptical security researchers
Consensus—
3
Anthropic gradually opens Mythos to commercial customers after security sprint
After the initial defensive sprint, Anthropic begins offering Mythos-class models commercially through its API and cloud partners at premium pricing ($25/$125 per million tokens). The Capybara tier becomes a new revenue engine. The staged release follows the GPT-2 playbook — initial restriction, then gradual opening as the risk landscape stabilizes — and Anthropic captures significant market share from the most capability-hungry enterprise customers.
Discussed by: Motley Fool, industry analysts, AWS and Google Cloud teams
Consensus—
4
Governments regulate frontier model releases, citing Mythos as precedent
Mythos becomes the case study that tips regulators toward mandatory gating requirements for frontier models above certain capability thresholds. The European Union's AI Act is amended to incorporate capability-based release restrictions. The United States introduces similar measures. Anthropic's voluntary restraint becomes the industry's involuntary obligation, reshaping the competitive landscape for all AI labs.
Discussed by: Axios, NBC News, AI policy researchers
Consensus—
5
Government access negotiations fracture Glasswing's controlled perimeter
The White House successfully negotiates Mythos access for US federal agencies, including intelligence and defense bodies, under conditions Anthropic cannot fully enforce. Once the model is deployed inside government networks — subject to FOIA requests, contractor access, and political pressure — the controlled-release model begins to unravel. The precedent also pressures EU and UK governments to demand similar access, converting a cybersecurity initiative into a geopolitical bargaining chip.
Discussed by: Axios, Politico, Reuters, national security analysts
OpenAI withholds GPT-2 over safety concerns (2019)
February-November 2019
What Happened
In February 2019, OpenAI announced a text-generation model called GPT-2 but refused to release the full 1.5-billion-parameter version, claiming it could be used to generate convincing fake news and spam at scale. The decision split the AI research community — some praised the caution, others dismissed it as a publicity stunt.
Outcome
Short Term
OpenAI adopted a staged release, publishing increasingly large versions over nine months. The feared harms never materialized at scale.
Long Term
The full model was released in November 2019 with little incident. Critics argued the delay was performative since other labs could replicate the work independently. The episode established 'too dangerous to release' as a recurring frame in AI discourse.
Why It's Relevant Today
Mythos is the first frontier model withheld on safety grounds since GPT-2, but the threat is qualitatively different: not generating fake text, but autonomously finding and exploiting real software vulnerabilities. The GPT-2 precedent will shape both the credibility debate and the question of whether restriction actually works when competitors can replicate capabilities.
Stuxnet and state-sponsored zero-day stockpiling (2010)
June 2010
What Happened
The Stuxnet worm, widely attributed to U.S. and Israeli intelligence, used four previously unknown zero-day vulnerabilities to sabotage Iran's nuclear centrifuges. It was the first confirmed case of a cyberweapon causing physical damage to infrastructure, and it revealed that nation-states had been quietly stockpiling zero-day exploits rather than disclosing them to vendors.
Outcome
Short Term
Iran's uranium enrichment program was set back by an estimated two years. The worm escaped its target and spread globally, exposing the technique to the world.
Long Term
Governments formalized vulnerability stockpiling through programs like the U.S. Vulnerabilities Equities Process, which weighs offensive intelligence value against defensive disclosure. The tension between hoarding exploits and patching them became a permanent feature of cybersecurity policy.
Why It's Relevant Today
Project Glasswing faces the same fundamental tension: Anthropic has a tool that can find vulnerabilities at unprecedented scale, but controlling who gets access and ensuring findings go to defenders rather than attackers reprises the stockpile-vs-disclose debate at AI speed.
University of Illinois study on GPT-4 autonomous exploitation (2024)
April 2024
What Happened
Researchers at the University of Illinois demonstrated that GPT-4, given access to vulnerability descriptions, could autonomously exploit 87% of known one-day vulnerabilities at a cost of roughly $8.80 per exploit — far cheaper than hiring a human penetration tester. Without descriptions, the success rate dropped to 7%, suggesting the model relied heavily on existing documentation rather than independent discovery.
Outcome
Short Term
The research drew attention to the dual-use nature of coding-capable AI models but did not trigger any deployment restrictions from OpenAI.
Long Term
The study established a baseline showing AI models were approaching but had not yet reached autonomous vulnerability discovery. It became a reference point for measuring how quickly the threat was advancing.
Why It's Relevant Today
Mythos appears to have crossed the threshold the Illinois researchers identified: not just exploiting known vulnerabilities with descriptions, but autonomously discovering unknown ones. The jump from 87% exploitation of known flaws to autonomous zero-day discovery represents the capability leap that makes Anthropic's restricted release a materially different calculation than GPT-2's.