AI models touted as terrifyingly powerful scared the government so much that they have now been disabled



According to a statement posted on its website on Friday, Anthropic was forced to “suddenly shut down” two of its most valuable frontier AI models in response to a highly restrictive order from the US government. “We believe this was a misunderstanding and are working to restore access as soon as possible,” the statement said.

The government’s move in question is an “export control directive” that says foreign nationals can’t use the models inside or outside the U.S., and was motivated by what Anthropic said was an unspecified national security concern.

But national security concerns and other safety and security fears were central to the proliferation of these models, which made such an event unexpected.

Instead of releasing the Claude Mythos Preview model to the public, Anthropic in early April turned the creation of the model into a kind of awareness-raising campaign about the perceived dangers of frontier AI models.

The system card the company released explains why the model has not been released to the public and details alarming possibilities such as spoofing and the ability to breach the protections of a restricted system. It also claims that the model can help in the development of advanced weapons. For example, the system card described it as having “significant cross-domain synthesis capabilities associated with the development of catastrophic biological weapons.”

At the same time, the company introduced Project Glasswing, which allowed a limited group of partners and organizations to sample the model to see what new horrors it could bring to the world of cybersecurity. “We created Project Glasswing because of the opportunities we see in the new frontier model taught by Anthropic that we believe can reshape cybersecurity,” Anthropic’s blog post about Project Glasswing says.

Soon, Mythos Preview became a tabloid story, despite the inherent triviality of the subject matter. An article in the New York Post quoted computer scientist Roman Yampolskiy as saying that, as Mythos heralds, AI may soon develop “hacking tools, biological weapons, chemical weapons, [and] new weapons we can’t even imagine.” The phrase “weapons beyond imagination” even became a headline.

British government officials and UK financial sector leaders came together to create a plan of action given the perceived threat. According to the New York Times, the Trump Administration’s “non-interventionist policy” toward artificial intelligence changed after the Mythos announcement, and the model’s mere existence prompted a security-focused executive order on artificial intelligence. Trump signed such an order about a week ago.

Nevertheless, last week Anthropic released Claude Fable 5 and Mythos 5. The company described Fable 5 as a “Mythos-class model that we’ve made safe for general use,” but one with capabilities that “exceed the capabilities of any model we’ve released for general use.” Mythos 5 received a very limited release as part of Project Glasswing.

Brian Merchant in Blood in the Machine described it as follows:

After causing a huge news cycle in the tech media in April when it announced that it was building an artificial intelligence model called Mythos (so powerful, so dangerous, that it threatened to overthrow the entire order of civilization, and so the company kept the product strictly hidden from the public to protect us from it), the country’s now #1 AI startup decided to pitch Mythos and then put it up for sale.

Hours after Merchant wrote these words, an export control directive was delivered to Anthropic, and Fable 5 and Mythos 5 were made unavailable due to apparent national security concerns. It appears that Anthropic has only been ordered to revoke access from non-US users, but it’s understandable that Anthropic would find it impossible to allow anyone anywhere in the world to access the models for fear of disobeying the order. Among many other complications, non-US citizens work at Anthropic. It is simpler to withdraw the models completely until the situation is resolved.

Interestingly, Anthropic’s statement on the export control directive noted that Anthropic “worked with the US government” along with the UK government and “numerous private third-party organizations” to establish a satisfactory security package for the models. After release, those safeguards were in many ways the most prominent feature of the media narrative around Fable 5. One of the tougher safeguards, designed to quietly punish users who abuse the model, was even considered ill-conceived, prompting an apology from Anthropic.

But according to Anthropic, the government became alarmed after learning about a jailbreak that bypassed these important safeguards for Fable 5:

“Our understanding is that the government believes it is aware of a method of bypassing or ‘jailbreaking’ Fable 5. We have reviewed demonstrations of this particular technique being used to identify minor, previously known vulnerabilities. All of these vulnerabilities appear to be relatively simple, and we have found that other publicly available models were able to discover them without requiring a bypass.”

Anthropic quite rightly points out that when it released Fable 5, a section of its own blog post about the model’s safeguards disclosed that some jailbreaks are still possible. “It is probably not possible to absolutely prevent universal jailbreaks, but our goal is to make the rest of the jailbreaks slow and costly enough that we can detect and prevent them before they can be used at scale,” Anthropic writes. In fact, since it’s not yet possible for the model to be perfectly jailbreak-resistant, Anthropic tried to present jailbreaks as either expensive to produce or too narrow to be a “public danger.” Mythos-class models retain more user data than ever before.

Still, it’s odd that Anthropic is now downplaying the threats its models have detected, writing that the vulnerabilities are “minor,” “previously known,” and “relatively simple,” and noting that “other publicly available models were able to discover them without requiring a bypass.”

After all, when Anthropic first introduced this class of models to the public, it said it had created an unprecedented power that could do real damage to the world. Two months later, the “Mythos-class” model was a product for public consumption, introduced as a premium offering available for a limited time at no additional cost to users of the “Pro, Max, Team and seat-based Enterprise plans.” As of June 23, Anthropic’s intention was to “remove Fable 5 from these plans” and instead require a pay-as-you-go plan.

Anthropic argues that such government action, if it becomes standard, “could halt all new model deployments for all cross-border model providers.” And maybe that’s true. But for a product launch involving a cutting-edge piece of technology deemed worthy of a global cybersecurity reassessment, an overreaction to gaps in that product’s safeguards should probably come as no surprise, even if that overreaction is bad for business.


