NOW LET US – AI RAG SaaS Studio TP.HCM

Anthropic apologizes for invisible Claude Fable guardrails


Anthropic has apologized for stealthily throttling its new AI model, Claude Fable 5, with hidden guardrails that undermine both researchers and rivals using it to develop competing systems. The company says it is reversing course and will be more transparent about when the restrictions kick in, even if that means Fable refuses more queries.

The company says it will make the covert safeguard preventing model distillation as visible as other safety measures.


Fable is the first widely available model in Anthropic’s Mythos class of AI systems, a group the company has spent months warning is too dangerous for public release. Anthropic says it has addressed some of those risks by launching Fable with safeguards that prevent it from responding to certain “high-risk” queries. One of the areas in which Anthropic said it would restrict Fable’s responses is distillation, a technique for training smaller AI models using the outputs of larger ones.

In Fable’s system card — a public document AI developers release to explain how a system works — Anthropic said it would handle queries it believed were distillation attempts by altering and degrading the model’s answers directly. Users would not be notified that they had triggered the safety measure or informed that the responses had been changed.

Anthropic is now changing its approach to distillation: queries flagged as distillation attempts will fall back to Claude Opus 4.8, Anthropic’s previous flagship model, the company said in a post on X. Anthropic also promised to notify users prominently: “You will see this every time it happens.”

The new approach is similar to how Fable handles queries in other high-risk areas. When safety features are triggered in areas like biology, chemistry, and cybersecurity, queries are routed through Opus 4.8 unless they are blocked outright under the company’s broader safety rules, such as those covering drugs, weapons, or other prohibited content. In some cases, notably biology, the safeguards have been calibrated so broadly that Fable is practically unusable for even basic queries, something Anthropic acknowledged in a comment to The Verge.

“Visible safeguards can be probed, so they have to be robust, which takes time to get right,” Anthropic wrote. “Invisible safeguards can be targeted more narrowly, allowing us to ship quickly with very few false positives. We went with invisible safeguards for this reason—and that was the wrong tradeoff. You should have visibility into the safeguards we have in place, and why. We’re sorry for not getting the balance right.”

The change follows intense backlash from the AI research community over Anthropic’s decision to silently limit users suspected of trying to distill Fable into competing models — a safeguard critics warned could also affect third parties trying to evaluate the frontier model. In the system card, Anthropic said newer models’ ability to accelerate AI development justified targeting those requests, noting that “using Claude to develop competing models already violates our Terms of Service.” Anthropic has previously accused Chinese rivals like DeepSeek of unfairly distilling its models on an “industrial” scale.

© 2026 Now Let Us. All rights reserved.

Source: The Verge AI
