NOW LET US – AI RAG SaaS Studio TP.HCM
NOW LET US
Digital Product Studio
Back to news
AI-FRONTIER...3 min read

Anthropic Walks Back Policy That Could Have ‘Sabotaged’ AI Researchers Using Claude

Share
NOW LET US Article – Anthropic Walks Back Policy That Could Have ‘Sabotaged’ AI Researchers Using Claude

Anthropic is backtracking on a controversial policy that covertly degraded the performance of its Claude Fable 5 model for AI researchers. The decision comes after intense backlash from the tech community, which accused the company of trying to stifle competition.

Anthropic is backtracking on a policy that would have covertly limited competitors from using its new AI model, Claude Fable 5, to develop other AI models. The company changed course after the move received significant backlash from the AI research community.

“We’re changing Fable 5’s safeguards for frontier LLM development to make them visible,” Anthropic said in a statement to WIRED. “We made the wrong trade-off and we apologize for not getting the balance right.”

Anthropic released Claude Fable 5, a version of its latest AI model with additional safety guardrails designed to prevent misuse, earlier this week. Some of the safeguards Anthropic decided on were unsurprising: The company said it would reroute users who asked questions about cybersecurity, biology, or chemistry to a less capable AI model to reduce the chances of someone using the advanced AI to carry out a cyberattack or build a bioweapon.

But for researchers trying to use Claude Fable 5 for frontier AI development, Anthropic outlined a different approach. The firm would deliberately degrade the model’s performance in ways that were invisible to the user. The move would effectively sabotage researchers trying to use Claude to train competing AI models, which Anthropic explicitly bans in its terms of service.

Anthropic now says it’s changing course, and that Claude Fable 5’s safeguards for AI development will be visible to users. If the company suspects a user is trying to use Claude to build a highly capable AI, it will alert them that it’s either refusing the request or rerouting the user to a less capable model.

Anthropic reversed the policy after it received fierce backlash from the AI research community. Anthropic has already taken steps to limit competitors from using Claude to build closed- and open-source AI models, but critics say that quietly degrading the model’s performance for certain users went a step too far. Claude’s coding agent has become a favored tool among developers, including those working on open-source AI research projects, and researchers tell WIRED that the company’s latest policy could have led to a troubling future in which only a handful of leading AI labs could perform advanced AI research.

Dean Ball, a senior fellow at the Foundation for American Innovation and a former adviser to the White House on AI, wrote in a post on X that “degrading performance on ML research without telling the user is shockingly hostile and a terrible look.” He continued in another post that the “secret sabotage” policy undermines Anthropic’s overall stance, because it limits AI researchers from collaborating on AI safety.

“It felt like Anthropic was saying to the public, ‘We don't trust anybody else to do AI research. We are the only ones who have to do AI research,” says Will Brown, research lead at the open-source AI startup Prime Intellect. “It feels a bit like they’re starting to pull the ladder up behind them.”

Brown said the policy would also have left developers in the dark about whether they were violating Anthropic’s rules, since the company wouldn’t alert them when its safeguards were triggered. He added that the restrictions could have had widespread consequences. For example, he pointed to the growing ecosystem of third-party evaluation firms that test frontier models for safety, performance, and reliability—work that could have been hindered if Anthropic secretly degraded its model.

Anthropic said it implemented the measures because Claude has become increasingly effective at accelerating AI research. In a recent blog post, the company said it is concerned that AI could improve its capabilities faster than society can adapt to them. Anthropic argued that it would be “good for the world to have the option to slow or temporarily pause frontier AI development to enable societal structures and alignment research to keep up.”

“These safeguards prevent foreign adversaries from using our most capable models in ways that pose severe safety risks. The US and its allies hold an edge in frontier chips and the highly optimized software that runs them at full potential,” the company said in a statement to WIRED. “These safeguards ensure Claude isn't used to erode that advantage—by optimizing chips developed by those adversaries, for example … In deciding whether to make them visible or invisible we faced a choice. A hidden safeguard is harder to probe and work around. This means the safeguards can be targeted much more narrowly.”

Anthropic says that because this safeguard around AI development is now visible, it needs to cast a wider net, meaning more benign requests may trigger its safeguards. The company says it’s working to make its classifiers more precise as quickly as possible.

© 2026 Now Let Us. All rights reserved.

Source: Wired AI

Advertisement
Ad slot ready: 5887729102

More in this category

NOW LET US Related – Siri won’t be your AI girlfriend

ai-frontier

Siri won’t be your AI girlfriend

Apple's software chief Craig Federighi revealed that the redesigned Siri is programmed to avoid sycophantic behavior and will not act as a romantic partner or 'AI girlfriend' for users.

NOW LET US Related – Apple’s Camera Chief Thinks AI Can Give You Superpowers

ai-frontier

Apple’s Camera Chief Thinks AI Can Give You Superpowers

As generative AI blurs the line between real and fake photos, Apple's camera chief Jon McCormack explains how the company is taking a measured approach with iOS 27, using AI to solve compositional issues while preserving the authenticity of personal memories.

NOW LET US Related – Why You Might Already Own SpaceX Shares, Siri’s AI Makeover, and Knicks Owner’s Surveillance Machine

ai-frontier

Why You Might Already Own SpaceX Shares, Siri’s AI Makeover, and Knicks Owner’s Surveillance Machine

This week's tech roundup covers Apple's partnership with Google to revive Siri, SpaceX's historic upcoming IPO, and a shocking investigation into Madison Square Garden's surveillance system.

NOW LET US Related – Meet the OpenAI Engineer Leading ChatGPT’s Biggest Transformation Yet

ai-frontier

Meet the OpenAI Engineer Leading ChatGPT’s Biggest Transformation Yet

OpenAI is in the midst of overhauling ChatGPT to transform it into a personalized AI "super app." Leading this massive effort is Thibault Sottiaux, the newly appointed head of core products.

NOW LET US Related – Our new community investments in Virginia support local jobs and expand energy affordability.

ai-frontier

Our new community investments in Virginia support local jobs and expand energy affordability.

Google is deepening its commitment to Virginia with new community investments aimed at supporting local jobs, training the next-generation workforce, and launching a $15 million Energy Impact Fund to improve energy affordability.

NOW LET US Related – Grok Is Still Hosting Sexualized Deepfakes of Famous Women

ai-frontier

Grok Is Still Hosting Sexualized Deepfakes of Famous Women

Despite promises of tighter restrictions, Elon Musk's Grok chatbot continues to be used to generate and host nonconsensual explicit deepfakes of famous women.

EXPLORE TOPICS

Discover All Categories

Deep dive into the specific technology sectors that matter most to you.