Anthropic warns Claude AI is building itself faster than expected, calls for option to halt frontier development —’recursive self improvement’ increases risk hu

Anthropic warns Claude AI is building itself faster than expected, calls for option to halt frontier development —'recursive self improvement' increases risk hu

The company set out three pretty dire ways the next few years could unfold, reserving its most severe warnings for the scenario in which models become capable of fully improving themselves. In that case, Anthropic said, the pace of progress would be set almost entirely by available compute, with humans pushed toward oversight and verification roles and a self-improving model dominating as its abilities outstrip those of the people who built it.

The firm described this potential issue with alignment and the task of keeping a system's behavior tied to human intent as part of the future it’s least sure about. Misalignment that’s rare and survivable today could compound generation over generation until control slips, it said, though it allowed that a sufficiently capable and well-aligned model might instead choose to halt its own development. Anthropic wrote that this misalignment could keep "growing more frequent but less understood until we lose control of them."

Anthopric is backing these warnings with a bunch of internal figures that we’ve not seen before. More than 80% of the code merged into its production codebase as of last month was authored by Claude, up from low single digits before Claude Code reached research preview in February last year. Anthropic says the typical engineer is now “merging 8x as much code per quarter as they did from 2021-2025.”

On the hardest, least-specified coding tasks, Anthropic said Claude succeeded 76% of the time in May 2026, a rise of 50 percentage points in six months. A recurring internal test that asks each new model to make training code run faster saw results climb from roughly triple the original speed with Claude Opus 4 in May 2025 to about 52 times with the unreleased Mythos Preview model in April.

Anthropic's latest AI model identifies 'thousands of zero-day vulnerabilities' in 'every major operating system and every major web browser'

Anthropic's Claude Mythos might be the best overall AI model for cybersecurity but cheaper models can attain similar results, research shows

Anthropic said it’d slow or pause only if rival labs at or near the frontier did the same in a verifiable way, and that a halt by one company would change who leads without achieving anything wider. That’s obviously not going to happen .

All the figures cited by Anthropic are self-reported and unaudited, and come days after the company filed to go public . The company issued a similar self-assessment in April, when it said Mythos Preview had found thousands of severe software vulnerabilities, a claim that later drew scrutiny over how much of it rested on a small manual sample.

Get Tom's Hardware's best news and in-depth reviews, straight to your inbox.

Key considerations

Investor positioning can change fast
Volatility remains possible near catalysts
Macro rates and liquidity can dominate flows

Reference reading

More on this site

Informational only. No financial advice. Do your own research.

Key considerations

Reference reading

More on this site

Related posts:

Leave a Comment Cancel reply