Mystery solved: Anthropic reveals that changes to Claude’s harness and operating instructions caused the reported degradation



Over recent weeks, a growing chorus of developers and AI users claimed that Anthropic’s flagship models were losing their edge. Users on GitHub, X, and Reddit reported a phenomenon they described as "AI contraction": a decline in which Claude is less capable of sustained reasoning, more prone to hallucinations, and increasingly erratic in the length of its responses.

Critics pointed to a measurable change in behavior, arguing that the model had abandoned its "research-first" approach for a lazy "just edit it" style that could no longer be relied upon for complex engineering work.

The company initially pushed back against claims that it had deliberately "nerfed" the model to manage demand, but growing evidence from high-profile users and third-party benchmarks created a significant confidence gap.

Today, Anthropic addressed these concerns directly, publishing a technical post-mortem that identifies three separate product changes responsible for the reported quality issues.

"We take reports of degradation very seriously." is reading Anthropic’s blog post about it. "We never intentionally degrade our models, and we were able to immediately confirm that our API and results layer were unaffected."

Anthropic says it fixed the issues by restoring the default reasoning effort, reverting the verbosity prompt, and patching the caching bug in v2.1.116.

Mounting evidence of degradation

The debate gained momentum in early April 2026 with detailed technical analyses from the developer community. Stella Laurenzo, who heads an AI group at AMD, published on GitHub a comprehensive audit of 6,852 Claude Code session files and more than 234,000 tool calls, indicating that the model’s performance had declined from her earlier experience with it.

The findings showed that Claude’s depth of reasoning had dropped sharply, resulting in reasoning loops and a tendency to pick "the simplest fix" over the correct one.

These anecdotal complaints were echoed by third-party benchmarks. BridgeMind reported that Claude Opus 4.6’s accuracy dropped from 83.3% to 68.3% on its tests, knocking its ranking from 2nd place down to 10th.

Although some researchers argued that these specific benchmark comparisons were flawed due to inconsistent test coverage, the narrative that Claude had become "dumber" went viral. Users also reported that usage limits were being exhausted faster than expected, fueling suspicions that Anthropic was deliberately reducing performance to cope with increased demand.

The root causes

In its post-mortem blog post, Anthropic clarified that while the base model weights had not changed, three specific changes to the "harness" surrounding the models had accidentally hindered their performance:

  • Default reasoning effort: On March 4th, Anthropic changed Claude Code’s default reasoning effort from high to medium to fix UI lag issues. The change was intended to prevent the interface from appearing "frozen" while the model was thinking, but it markedly reduced the model’s capability on complex tasks.

  • Caching logic error: A caching optimization shipped on March 26th, designed to clear old "thinking" blocks from idle sessions, contained a critical bug. Instead of clearing the reasoning history once after an hour of inactivity, it cleared it on every subsequent turn, causing the model to lose its "short-term memory" and become repetitive or forgetful (a sketch of this failure mode follows the list).

  • System prompt restrictions: On April 16th, Anthropic added guidelines to the system prompt instructing the model to keep text between tool calls under 25 words and final responses under 100 words. The attempt to reduce verbosity in Opus 4.7 backfired, producing a 3% drop in coding quality evaluations.
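Anthropic has not published the code behind the caching bug, but the failure mode it describes is a classic time-to-live mistake: a cleanup that measures age from the wrong timestamp, so a check meant to fire once after an hour of inactivity instead fires on every turn. The Python sketch below is purely illustrative; the class and field names are invented for this article and do not come from Anthropic’s codebase.

```python
import time

THINKING_TTL_SECONDS = 3600  # intended: clear after an hour of *inactivity*

class SessionCache:
    """Illustrative cache of a session's accumulated 'thinking' blocks."""

    def __init__(self):
        self.thinking_history = []
        self.session_start = time.time()
        self.last_activity = time.time()

    def on_turn(self, thinking_block):
        self.prune_buggy()  # cleanup runs before each turn
        self.thinking_history.append(thinking_block)
        self.last_activity = time.time()

    # Buggy variant: measures age from the start of the session, so once a
    # session is more than an hour old the history is wiped on every
    # subsequent turn.
    def prune_buggy(self):
        if time.time() - self.session_start > THINKING_TTL_SECONDS:
            self.thinking_history.clear()

    # Intended behaviour: clear only after an hour with no activity at all.
    def prune_fixed(self):
        if time.time() - self.last_activity > THINKING_TTL_SECONDS:
            self.thinking_history.clear()
```

In the buggy variant, any session older than an hour loses its accumulated reasoning history before every new turn, which matches the repetitive, forgetful behavior users reported.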

Impact and future safeguards

The quality issues extended beyond the Claude Code CLI, also affecting the Claude Agent SDK and Claude Cowork, although the Claude API was unaffected.

Anthropic acknowledged that these changes made the model appear "less intelligent," and conceded that this was not the experience users should expect.

To restore user trust and prevent future regressions, Anthropic is making a number of operational changes:

  • Internal testing: A greater proportion of internal staff will be required to use the exact public builds of Claude Code, so that they experience the product the way users do.

  • Enhanced evaluation suites: The company will now run a wider set of evaluations for each model, along with "ablations" of prompt changes to isolate the effect of specific system instructions (see the sketch after this list).

  • Tighter controls: New tooling has been built to make prompt changes easier to audit, and model-specific changes will be scoped strictly to their intended targets.

  • Subscriber compensation: To account for the wasted tokens and performance friction caused by these errors, Anthropic reset usage limits for all subscribers as of April 23rd.
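The prompt "ablations" Anthropic describes amount to a controlled comparison: run the same evaluation suite with and without each candidate system-prompt instruction, and attribute the score difference to that instruction. The sketch below is a minimal, hypothetical illustration of the idea; run_eval() and the instruction strings are placeholders, not Anthropic’s actual tooling.

```python
def run_eval(system_instructions: list[str]) -> float:
    """Placeholder for a real benchmark run; returns a dummy score here."""
    return 0.0

def ablate(instructions: list[str]) -> dict[str, float]:
    """Score the full prompt, then re-score with each instruction removed."""
    baseline = run_eval(instructions)
    deltas = {}
    for i, instruction in enumerate(instructions):
        without = instructions[:i] + instructions[i + 1:]
        # Positive delta: the suite scores higher without this instruction,
        # flagging it as a likely cause of the regression.
        deltas[instruction] = run_eval(without) - baseline
    return deltas

# Example: the verbosity limits added on April 16th would be candidate
# instructions to ablate against a coding evaluation.
print(ablate([
    "keep text between tool calls under 25 words",
    "keep final responses under 100 words",
]))
```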

The company also intends to use a new @ClaudeDevs account on X, along with GitHub discussion threads, to provide deeper rationale behind future product decisions and maintain a more transparent dialogue with its developer base.


