Amazon and Uber Reassess AI Token Usage
Amazon and Uber are pulling back from unconstrained AI usage after internal leaderboards and runaway bills exposed a gap between AI activity and real productivity. Amazon deprecated its internal token-usage leaderboard, KiroRank, after employees ran its MeshClaw agent tool on trivial tasks purely to inflate rankings; SVP of engineering Dave Treadwell told staff, "Please don't use AI just for the sake of using AI," and Amazon has shifted to a "normalized deployments" metric that tracks actual code utility instead of token counts. Uber reportedly exhausted its entire 2026 AI token budget within four months, driven by heavy Claude Code use at reported per-engineer costs of $500 to $2,000 a month, and has since imposed a $1,500-per-person monthly cap. Meta ran a similar leaderboard, nicknamed "Claudeonomics," for its roughly 85,000 employees before pulling it within days.
The real story here is not simple cost-cutting; it is a bubble-adjacent measurement problem. Several of the largest AI spenders built internal metrics that rewarded raw token consumption, employees dutifully gamed them, and the resulting activity fed into capex and productivity narratives that analysts are now openly questioning.
What happened
Amazon deprecated an employee-built internal leaderboard called KiroRank that tracked AI token usage, after employees used the company's MeshClaw agent tool to run trivial, unnecessary tasks purely to inflate their rankings, according to the Financial Times, Fortune, and subsequent coverage in Business Insider, TheStreet, and CNET. Amazon SVP of engineering Dave Treadwell told staff, "Please don't use AI just for the sake of using AI," and the company has restricted visibility of team-wide usage statistics and shifted to a "normalized deployments" metric that measures whether AI-generated code produces real outcomes, such as successful commits, rather than counting tokens. Fortune reports Meta ran a comparable internal leaderboard nicknamed "Claudeonomics" that ranked roughly 85,000 employees by token consumption, and pulled it within days of the ranking becoming public. Separately, Fortune and Investing.com report Uber exhausted its entire 2026 AI token budget within the first four months of the year, driven heavily by use of Anthropic's Claude Code at reported per-engineer costs of $500 to $2,000 a month; Uber has since imposed a $1,500-per-person monthly usage cap per tool.
Technical context
Tokens are the billing unit for large language model usage. Agentic workflows, in which an AI agent chains multiple actions together, can consume dramatically more tokens than a single prompt-and-response interaction; reporting describes agentic workflows using up to roughly 1,000 times more tokens than standard interactions. Salesforce CEO Marc Benioff is quoted in Fortune describing a "smart router" concept that would direct queries to cheaper models when a task doesn't require the most capable (and most expensive) model, an approach several companies are now pursuing to control costs.
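The "smart router" idea can be illustrated with a short sketch. This is not Salesforce's or any vendor's implementation: the model names, per-token prices, and the keyword-based complexity heuristic below are all hypothetical, chosen only to show the shape of the approach (pick the cheapest tier that can plausibly handle the task).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                      # hypothetical model identifier
    usd_per_million_tokens: float  # hypothetical price, not a real rate card
    max_complexity: int            # highest task complexity this tier should handle


# Hypothetical tiers, ordered from cheapest to most expensive.
MODELS = [
    Model("small-model", 0.50, max_complexity=1),
    Model("mid-model", 3.00, max_complexity=2),
    Model("frontier-model", 15.00, max_complexity=3),
]


def estimate_complexity(prompt: str) -> int:
    """Crude placeholder heuristic; a real router would use a classifier."""
    text = prompt.lower()
    if any(word in text for word in ("refactor", "architecture", "multi-step")):
        return 3
    if len(prompt.split()) > 50:
        return 2
    return 1


def route(prompt: str) -> Model:
    """Return the cheapest model whose tier covers the estimated complexity."""
    needed = estimate_complexity(prompt)
    for model in MODELS:
        if model.max_complexity >= needed:
            return model
    return MODELS[-1]  # fall back to the most capable tier


def estimated_cost_usd(model: Model, tokens: int) -> float:
    """Cost of a request of the given token count on the given model."""
    return tokens / 1_000_000 * model.usd_per_million_tokens
```

Under these assumed tiers, a short request such as `route("Fix this typo")` lands on the cheapest model, while a prompt mentioning a refactor is escalated to the most expensive one. The heuristic is the weak point of any such design: misrouting a hard task to a cheap model saves tokens but can cost more in rework.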
For practitioners
Analyst Gil Luria of D.A. Davidson has warned that gamified usage metrics like KiroRank cast doubt on the demand signals used to justify AI infrastructure spending, invoking Goodhart's Law: once token usage became a target, it stopped being a reliable measure of productivity. With combined 2026 AI capital expenditure from Amazon, Microsoft, Alphabet, and Meta already approaching $700 billion, the practical lesson for practitioners and finance teams is to instrument outcome metrics, such as time saved, error rates, and completed-task throughput, rather than raw token or usage counts, and to expect vendor-side model-routing tools to become a standard cost-control lever.
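To make the contrast between usage metrics and outcome metrics concrete, here is a minimal sketch that ranks two teams both ways. The record fields, team names, and numbers are hypothetical and are not drawn from Amazon's "normalized deployments" metric or any company's actual data; the point is only that a token leaderboard and an outcome-based ranking can order the same teams differently.

```python
from dataclasses import dataclass


@dataclass
class UsageRecord:
    team: str
    tokens_used: int
    tasks_completed: int  # hypothetical outcome measure, e.g. changes that shipped
    tasks_reverted: int   # completed work that was later rolled back


def tokens_per_net_task(record: UsageRecord) -> float:
    """Tokens spent per task that shipped and stayed shipped (lower is better)."""
    net = record.tasks_completed - record.tasks_reverted
    return float("inf") if net <= 0 else record.tokens_used / net


# Hypothetical data for illustration only.
records = [
    UsageRecord("team-a", tokens_used=90_000_000, tasks_completed=30, tasks_reverted=10),
    UsageRecord("team-b", tokens_used=20_000_000, tasks_completed=45, tasks_reverted=5),
]

# A token leaderboard rewards the biggest consumer...
by_tokens = sorted(records, key=lambda r: r.tokens_used, reverse=True)
# ...while an outcome metric rewards the most efficient producer.
by_outcome = sorted(records, key=tokens_per_net_task)

print([r.team for r in by_tokens])
print([r.team for r in by_outcome])
```

In this made-up data, team-a tops the token leaderboard while team-b spends far fewer tokens per shipped task. Any outcome metric can itself be gamed once it becomes a target, so it is best treated as one signal among several.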
What to watch
- Whether other large AI spenders (reporting already names Walmart, Cisco, and Meta) formalize per-team token budgets or usage caps similar to Uber's.
- Adoption of model-routing or capability-routing tools that steer low-complexity queries to cheaper models.
- Whether analysts revise AI capex demand estimates if more companies disclose that reported "usage" partly reflected gamed internal metrics rather than organic productivity gains.
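The budget and cap mechanics in the first item come down to simple arithmetic, sketched below. The annual budget and team spend figures are hypothetical (Uber's actual budget is not given in the reporting); only the $1,500 monthly cap and the $500 to $2,000 per-engineer range come from the coverage above.

```python
def months_until_exhausted(annual_budget_usd: float, monthly_spend_usd: float) -> float:
    """How many months an annual budget lasts at a constant monthly burn rate."""
    if monthly_spend_usd <= 0:
        return float("inf")
    return annual_budget_usd / monthly_spend_usd


def capped_spend(per_person_spend_usd: list[float], cap_usd: float) -> float:
    """Total monthly spend if each person's usage is limited to cap_usd."""
    return sum(min(spend, cap_usd) for spend in per_person_spend_usd)


# Hypothetical: a $1.2M annual budget burned at $300k per month.
print(months_until_exhausted(1_200_000, 300_000))

# Hypothetical three-person team, with spend in the reported $500-$2,000 range,
# under the reported $1,500 per-person monthly cap.
print(capped_spend([500, 2000, 1800], cap_usd=1500))
```

A per-person cap bounds worst-case spend but does nothing to distinguish useful from wasteful usage, which is why it tends to be paired with routing or outcome tracking.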
Editorial analysis
This is a textbook Goodhart's Law failure: incentive systems built to encourage AI adoption instead optimized for the metric itself. The bigger open question is how much of the AI industry's broader usage and capex narrative rests on similarly ungoverned internal metrics, a concern serious enough that a Wall Street analyst is now raising it publicly rather than treating it as an internal HR curiosity.
Key Points
1. Amazon killed its KiroRank token leaderboard after employees ran trivial tasks through its MeshClaw agent tool solely to inflate usage rankings.
2. Uber exhausted its entire 2026 AI token budget in four months at $500 to $2,000 per engineer monthly, prompting a new $1,500 cap.
3. An analyst warned gamed usage metrics undermine the demand signals behind AI capex nearing $700 billion, illustrating Goodhart's Law in enterprise AI adoption.
Scoring Rationale
Well-corroborated, multi-company story (Amazon, Uber, Meta, Walmart, Cisco all implicated) with a genuinely systemic angle: a Wall Street analyst publicly questioning whether gamed internal usage metrics inflate the demand signals behind ~$700B in combined 2026 AI capex. Bumped from prior calibration given this broader capex-thesis implication, though it remains an operational/governance story rather than a technical or regulatory event.
Sources
Public references used for this report.
- 04. Amazon joins Microsoft in sending shocking message to employees (thestreet.com)
- 05. Amazon Is the Latest Tech Giant to Face the Consequences of AI 'Tokenmaxxing' (cnet.com)
- 06. The AI Boom Just Hit Its First Real Margin Call (investing.com)
- 07. Amazon tells staff to stop using AI just to use AI after shutting down its token leaderboard (livemint.com)
- 08. The Tokenmaxxing Trap: Big Tech's AI Metric Backfires (enterprisedna.co)
- 09. Developers won't work without AI anymore. The research says it might be making them worse. (thenextweb.com)
- 10. Tokenmaxxing is over. That's because it never measured what really counts to see ROI from AI (finance.yahoo.com)
- 11. Amazon employees are 'tokenmaxxing' due to pressure to use AI tools (arstechnica.com)
- 12. Premium: What If...We're In An AI Bubble? (Part 3) (wheresyoured.at)
- 13. Uber's COO says it's getting harder to justify money spent on... (news.ycombinator.com)

