
Anthropic CEO Discusses Share of AI-Generated Code

By LDS Team
Relevance Score: 6.8

According to Tech Trenches, Anthropic CEO Dario Amodei said in March 2025 that "We're 3 to 6 months from a world where AI is writing 90% of the code." Tech Trenches also reports follow-up public comments from Amodei and posts by Anthropic engineer Boris Cherny claiming heavy Claude Code contribution, including a Cherny X post saying "In the last thirty days, 100% of my contributions to Claude Code were written by Claude Code." The Redwood Research blog publishes a skeptical analysis, arguing that the underlying metric is undefined and estimating company-wide fractions lower than 90% (its writeup suggests averages nearer 50% for merged lines, with higher fractions if one counts any reusable snippet). Tech Trenches additionally quotes Anthropic product leadership comments that some teams report near-100% AI-written contributions. Reporting differs on definitions and coverage, and the claim remains contested in public commentary.

What happened

According to Tech Trenches, Dario Amodei, CEO of Anthropic, said in March 2025, "We're 3 to 6 months from a world where AI is writing 90% of the code." Tech Trenches documents subsequent public remarks and internal-facing posts, which it links to: a December 27, 2025 X post by Boris Cherny stating "In the last thirty days, 100% of my contributions to Claude Code were written by Claude Code," and an interview with Benioff, which Tech Trenches dates to March 2025, in which Amodei said "That is absolutely true now" while adding qualifiers such as "not uniformly." Tech Trenches also cites a February 2026 product comment attributed to Mike Krieger, who reportedly said that for many products it is "effectively 100%."

Editorial analysis - technical context

Redwood Research's blog post explicitly questions the definitional basis for the percentages, noting the difference between metrics such as lines merged, commits authored, characters typed, or "any code that was at all useful." Redwood Research argues that some teams can reach very high AI-written fractions for merged lines, while company-wide averages are likely lower; its public writeup estimates a figure nearer 50% for merged lines across Anthropic but explains this depends strongly on inclusion rules. Industry practitioners will recognize that these measurement choices materially change any headline percentage.
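To make the measurement point concrete, the following is a minimal Python sketch of how one change history can produce different headline percentages depending on the counting rule. The `Change` record, the team names, and all numbers are hypothetical illustration data; they are not drawn from Anthropic, Tech Trenches, or Redwood Research, and the two counting rules are simplified stand-ins for the metrics discussed above.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """One merged change. All values used below are made up for illustration."""
    team: str
    lines_merged: int   # total lines merged in this change
    ai_lines: int       # merged lines attributed to an AI tool
    ai_assisted: bool   # True if any AI output was used at all


def fraction_by_lines(changes):
    """Share of merged lines attributed to AI (a strict counting rule)."""
    total = sum(c.lines_merged for c in changes)
    return sum(c.ai_lines for c in changes) / total if total else 0.0


def fraction_by_changes(changes):
    """Share of changes that used any AI output (a loose counting rule)."""
    return sum(c.ai_assisted for c in changes) / len(changes) if changes else 0.0


# Hypothetical history: one team leans heavily on AI, another barely uses it.
CHANGES = [
    Change("tools", 400, 400, True),
    Change("tools", 100, 90, True),
    Change("infra", 300, 60, True),
    Change("infra", 200, 0, False),
]

company_by_lines = fraction_by_lines(CHANGES)
company_by_changes = fraction_by_changes(CHANGES)
tools_by_lines = fraction_by_lines([c for c in CHANGES if c.team == "tools"])

print(f"company-wide, by merged lines: {company_by_lines:.0%}")
print(f"company-wide, by changes:      {company_by_changes:.0%}")
print(f"tools team, by merged lines:   {tools_by_lines:.0%}")
```

On this made-up data the same history yields 55% by merged lines, 75% by changes that used any AI output, and 98% for a single team, which is why a percentage quoted without its counting rule and scope is hard to interpret.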

Context and significance

Public reporting and internal claims about Claude Code mirror broader sector discussions about how to measure AI contribution to software engineering. Observers and researchers repeatedly warn that percentages without a clear metric are misleading. Tech Trenches highlights concrete risks visible in the leaked codebase it reviewed: a large TypeScript codebase with a single 3,167-line function, regex-based sentiment heuristics, and a documented bug consuming 250,000 API calls daily. Those artifacts illustrate operational and maintenance questions that follow heavy automated code generation.

What to watch

For practitioners: track how organizations define and audit AI-written code (lines merged vs. engineered, test coverage, security reviews). For researchers: look for reproducible measurements or public engineering postmortems that state counting rules. For risk teams: monitor leaked/internal code reviews and bug disclosures that may reveal failure modes unique to high-volume generated code.

Note on sources

The factual claims above are drawn from Tech Trenches reporting of public remarks and posts and from Redwood Research's written analysis; Redwood Research provides the skeptical measurement framing and alternative estimates.

Key Points

  • Reported executive remarks claim AI-generated code can reach 90%, but those claims hinge on unspecified counting rules.
  • Independent analysis from Redwood Research finds company-wide merged-line fractions likely lower, around 50%, depending on definitions.
  • Practitioners should treat headline percentages as metric-dependent and prioritize reproducible audits, tests, and maintenance signals.

Scoring Rationale

The story matters to engineers and platform teams because it bears on real-world developer workflows and maintenance burdens as LLMs are used to generate production code. It is not a frontier-model release, so it rates as a notable industry development rather than a paradigm shift.
