LLMs Drive Performative Productivity, Question Real Gains

Josh Collinsworth, in a June 5, 2026 blog post titled "LLMs and performative productivity," questions whether the activity that AI coding agents enable is genuinely productive. Collinsworth recounts using agents to onboard into new codebases, complete a long-delayed Nuxt upgrade in about an hour, scaffold greenfield projects, write tests, and ship many bug fixes. He argues that much of this work felt shallow: he lacked deep understanding of the code and could not defend some pull requests, while several features went unused and greenfield projects were abandoned. Collinsworth also reviews the limited research on AI and developer output, noting that gains tend to be situational and can shrink, or turn negative, under a holistic view of productivity. He frames the core problem as a mismatch between visible activity and real product impact, and warns teams against treating throughput metrics such as pull-request velocity as proof of value.
What happened
Josh Collinsworth published a blog post on June 5, 2026 titled "LLMs and performative productivity," describing his firsthand experience using AI coding agents and questioning whether the activity they enable is truly productive. Per Collinsworth, agents sped up onboarding into unfamiliar codebases and let him finish a long-delayed Nuxt upgrade in about an hour, alongside feature work, greenfield scaffolding, test creation, and numerous bug fixes. He reports that many outputs delivered little user-visible value, that he did not build enough context to defend some pull requests, and that several new projects or features were later abandoned.
The argument
Collinsworth contends LLM benefits are highly situational, helping most with boilerplate, greenfield work, and tasks outside the user's expertise, and helping experienced engineers less. He points to the limited body of objective studies on AI and developer output, arguing measured gains are smaller than they feel and can shrink, or go negative, once productivity is viewed holistically rather than by speed or volume alone.
Editorial analysis - industry context
A recurring tension in software teams is the gap between activity metrics and impact metrics. Organizations that gauge productivity by commits, pull-request velocity, or test counts may record short-term improvements that do not map to customer usage or system comprehension. When tools do the heavy lifting, higher output can coexist with weaker tacit-knowledge transfer, producing what the post calls a performative impression of productivity.
What to watch
For teams adopting agents, useful signals include changes in code-review behavior, rates of reverted or flaky pull requests, usage telemetry for newly shipped features, and retention of institutional knowledge during onboarding. Dashboards that pair usage data with developer-output measures tend to reveal whether increased activity is translating into product value.
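One way to picture such a pairing is a small report that puts an activity measure next to impact-side signals. The sketch below is illustrative only: the `PullRequest` record, the `activity_vs_impact` function, the `min_users` threshold, and the sample data are all hypothetical and do not come from Collinsworth's post or from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    """Hypothetical record of one merged pull request."""
    pr_id: int
    feature: str            # feature the PR contributed to
    reverted: bool = False  # True if the PR was later reverted


def activity_vs_impact(prs, weekly_users, min_users=1):
    """Pair an activity measure (merged PRs) with impact-side signals.

    prs          -- list of PullRequest records for the period
    weekly_users -- dict of feature name -> active users (assumed telemetry)
    min_users    -- assumed threshold below which a feature counts as unused
    """
    merged = len(prs)
    reverted = sum(1 for pr in prs if pr.reverted)
    features = {pr.feature for pr in prs}
    # A feature with no telemetry entry is treated as having zero users.
    used = {f for f in features if weekly_users.get(f, 0) >= min_users}
    return {
        "merged_prs": merged,
        "revert_rate": reverted / merged if merged else 0.0,
        "features_shipped": len(features),
        "features_used": len(used),
        "used_share": len(used) / len(features) if features else 0.0,
    }


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample_prs = [
        PullRequest(1, "export-csv"),
        PullRequest(2, "export-csv", reverted=True),
        PullRequest(3, "dark-mode"),
        PullRequest(4, "ai-summary"),
    ]
    sample_usage = {"export-csv": 120, "dark-mode": 0}
    print(activity_vs_impact(sample_prs, sample_usage))
```

A rising `merged_prs` count alongside a flat or falling `used_share` would be one hint that activity is not translating into product value. Real telemetry would need a more careful definition of "used" than a single user-count threshold.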
Key Points
- Rapid task completion with LLMs can raise visible output without guaranteeing user-facing value or genuine system understanding.
- Teams that judge productivity by activity metrics such as pull-request velocity risk mistaking throughput for durable product improvement.
- Collinsworth notes existing studies find AI coding gains are situational and can diminish or reverse when productivity is measured holistically rather than by speed alone.
Scoring Rationale
A widely discussed practitioner essay on the AI productivity paradox, directly relevant to engineering managers rethinking output metrics. It is a single-author opinion piece rather than new tooling, data, or research, so it is useful but not high-impact; set at the lower end of the solid band.