White House demands Anthropic block all jailbreaks on 2 models in impossible ask

The Trump administration is demanding Anthropic block all possible jailbreaks of its most advanced AI models — a technically impossible request that has brought the two sides to an impasse over the future of AI regulation.

The White House is demanding Anthropic eliminate all possible security vulnerabilities in its Fable 5 and Mythos 5 models, a technically unattainable standard that has escalated into a standoff over the future of AI regulation, according to a senior White House official and an administration official familiar with the matter.

"The problem here is that the White House has been in this extreme anti-regulatory posture, and they're now faced with the real AI capabilities that people have been predicting for many years," a former White House technology official, who asked to remain anonymous to avoid jeopardizing professional relationships, said. "There should have been preparation and policies to systematically deal with this, managing the benefits and risks, but instead it's just this slap-dash approach that puts the AI industry in a real quandary."

The dispute erupted after the White House imposed export controls on Anthropic on June 13, forcing the company to suspend access to both models for all users. Amazon CEO Andy Jassy had warned Treasury Secretary Scott Bessent that researchers found evidence of guardrail bypasses. Anthropic argued the vulnerability was limited and did not amount to a meaningful security flaw, but the administration responded by barring foreign users from accessing the models. The company opted to pull the models entirely, claiming it was the only way to comply with the export controls.

The standoff carries significant economic stakes. Anthropic's enterprise customers — including Apple, Meta, and much of the Fortune 500 — remain locked out of the company's most advanced systems. The dispute has also frozen the company's ability to deploy new models, potentially slowing its revenue growth and competitive positioning against rivals OpenAI and Google DeepMind.

The technical impossibility at the heart of the dispute

Security researchers and AI executives say the White House's demand cannot be met under current technology. Because large language models are probabilistic rather than deterministic, companies cannot guarantee what they will generate in response to any given prompt. Every model can be jailbroken to varying extents, and solving the problem entirely is not feasible with existing methods.

Anthropic and independent cybersecurity researchers argue that jailbreaks are not an isolated issue that can be patched away. The company's initial defense was that no AI model can be completely immune to hacking — a position that irked White House officials who noted Anthropic had spent years warning of potential AI catastrophe.

The negotiations between the White House and Anthropic — led on the company's side by Sarah Heck, head of public policy, and co-founder Tom Brown — have shifted toward developing a common set of benchmarks to assess future jailbreaks, including the extent to which safeguards were bypassed, the capabilities exposed, and the practical consequences of the breach. While the export controls have yet to be lifted, the shift toward a technical standards-setting exercise signals that negotiations are progressing.

A de facto licensing regime emerges

The Trump administration had previously opposed mandatory AI licensing. President Trump signed an executive order last month creating a "voluntary" system for AI labs to submit models for early government testing, with a carve-out explicitly stating it would not become mandatory. But the Anthropic dispute has effectively created an ad hoc version of such a regime.

Other leading AI labs — including OpenAI, Google, and Meta — have been watching the dispute closely. Many AI leaders now believe they will need to give the White House early access to their latest models and be extremely proactive about sharing information about upcoming launches. The risk of catching officials off guard, they say, is simply too great.

"Advance notice, advance access. I think those are the primary asks that we've heard, not just from the US, but others around the world," Aidan Gomez, CEO of the Canadian AI lab Cohere, said in an interview earlier this week. "I think those are good things in many respects. It shows strong engagement and consideration by authorities on a super important technology."

The dispute also emerged as a key topic at the G7 Summit in France this week, where President Trump said talks with Anthropic were "going fine" but did not provide details. Anthropic CEO Dario Amodei urged world leaders to resist the temptation to splinter in their approaches to AI regulation.

This article is for informational purposes only and does not constitute investment advice.