Amazon and Uber Reassess AI Token Usage

June 1, 2026 | By LDS Team

Relevance Score: 7.3

Amazon and Uber are pulling back from unconstrained AI usage after internal leaderboards and runaway bills exposed a gap between AI activity and real productivity. Amazon deprecated its internal token-usage leaderboard, KiroRank, after employees ran its MeshClaw agent tool on trivial tasks purely to inflate rankings; SVP of engineering Dave Treadwell told staff, "Please don't use AI just for the sake of using AI," and Amazon has shifted to a "normalized deployments" metric that tracks actual code utility instead of token counts. Uber reportedly exhausted its entire 2026 AI token budget within four months, driven by heavy Claude Code use at reported per-engineer costs of $500 to $2,000 a month, and has since imposed a $1,500-per-person monthly cap. Meta ran a similar leaderboard, nicknamed "Claudeonomics," for its roughly 85,000 employees before pulling it within days.

The real story here is not simple cost-cutting; it is a bubble-adjacent measurement problem. Several of the largest AI spenders built internal metrics that rewarded raw token consumption, employees dutifully gamed them, and the resulting activity fed into capex and productivity narratives that analysts are now openly questioning.

What happened

Amazon deprecated an employee-built internal leaderboard called KiroRank that tracked AI token usage, after employees used the company's MeshClaw agent tool to run trivial, unnecessary tasks purely to inflate their rankings, according to the Financial Times, Fortune, and subsequent coverage in Business Insider, TheStreet, and CNET. Amazon SVP of engineering Dave Treadwell told staff, "Please don't use AI just for the sake of using AI," and the company has restricted visibility of team-wide usage statistics and shifted to a "normalized deployments" metric that measures whether AI-generated code produces real outcomes, such as successful commits, rather than counting tokens. Fortune reports Meta ran a comparable internal leaderboard nicknamed "Claudeonomics" that ranked roughly 85,000 employees by token consumption, and pulled it within days of the ranking becoming public. Separately, Fortune and Investing.com report Uber exhausted its entire 2026 AI token budget within the first four months of the year, driven heavily by use of Anthropic's Claude Code at reported per-engineer costs of $500 to $2,000 a month; Uber has since imposed a $1,500-per-person monthly usage cap per tool.

Technical context

Tokens are the billing unit for large language model usage. Agentic workflows, in which an AI agent chains multiple actions together, can consume dramatically more tokens than a single prompt-and-response interaction; reporting describes them using up to roughly 1,000 times more. Salesforce CEO Marc Benioff is quoted in Fortune describing a "smart router" concept that would direct queries to cheaper models when a task doesn't require the most capable (and most expensive) model, an approach several companies are now pursuing to control costs.
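To make the routing idea concrete, here is a minimal Python sketch of how a router might pick a cheaper model for simple requests. It is an illustration under stated assumptions, not a description of any vendor's product: the tier names, the per-token prices, the keyword list, and the complexity heuristic are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                       # hypothetical label, not a real product
    usd_per_million_tokens: float   # illustrative price, not a vendor quote


# Assumed tiers and prices, chosen only to make the arithmetic easy to follow.
CHEAP = ModelTier("small-model", 0.50)
PREMIUM = ModelTier("frontier-model", 15.00)

# Assumed keywords that hint a task needs the more capable model.
HARD_TASK_KEYWORDS = ("refactor", "architecture", "prove", "debug")


def estimate_complexity(prompt: str) -> int:
    """Crude placeholder heuristic: longer prompts and certain keywords score higher."""
    score = len(prompt.split()) // 50
    lowered = prompt.lower()
    for keyword in HARD_TASK_KEYWORDS:
        if keyword in lowered:
            score += 2
    return score


def route(prompt: str, threshold: int = 2) -> ModelTier:
    """Send a prompt to the premium tier only when its score reaches the threshold."""
    return PREMIUM if estimate_complexity(prompt) >= threshold else CHEAP


def cost_usd(tokens: int, tier: ModelTier) -> float:
    """Token cost is linear: tokens divided by one million, times the tier price."""
    return tokens / 1_000_000 * tier.usd_per_million_tokens


if __name__ == "__main__":
    for prompt in ("Summarize this email", "Please debug this race condition"):
        print(prompt, "->", route(prompt).name)
    # An agentic run using 1,000 times the tokens of a 2,000-token exchange.
    print(cost_usd(2_000 * 1_000, CHEAP), cost_usd(2_000 * 1_000, PREMIUM))
```

Because cost scales linearly with tokens, the same 1,000-fold multiplier applies to the bill on either tier; routing changes only the price per token, which is why it is a complement to usage caps rather than a substitute for them. A production router would likely replace the keyword heuristic with a learned classifier or a fallback-on-failure policy.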

For practitioners

Analyst Gil Luria of D.A. Davidson has warned that gamified usage metrics like KiroRank cast doubt on the demand signals used to justify AI infrastructure spending, invoking Goodhart's Law: once token usage became a target, it stopped being a reliable measure of productivity. With combined 2026 AI capital expenditure from Amazon, Microsoft, Alphabet, and Meta already approaching $700 billion, the practical lesson for practitioners and finance teams is to instrument outcome metrics, such as time saved, error rates, and completed-task throughput, rather than raw token or usage counts, and to expect vendor-side model-routing tools to become a standard cost-control lever.
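A small Python sketch can show why the two kinds of metric diverge. The records below are invented for illustration, and the field names and the "tokens per completed task" measure are assumptions rather than anything Amazon or Uber is reported to use; completed tasks stand in for the outcome metrics named above.

```python
from dataclasses import dataclass


@dataclass
class UsageRecord:
    engineer: str          # placeholder identifier
    tokens: int            # tokens consumed in the period
    completed_tasks: int   # outcome measure (hypothetical definition)


def token_leaderboard(records):
    """Rank by raw token consumption, highest first (the gameable metric)."""
    ordered = sorted(records, key=lambda r: r.tokens, reverse=True)
    return [r.engineer for r in ordered]


def tokens_per_completed_task(record):
    """Efficiency measure: lower is better; no completed tasks counts as worst."""
    if record.completed_tasks == 0:
        return float("inf")
    return record.tokens / record.completed_tasks


def outcome_ranking(records):
    """Rank by tokens spent per completed task, most efficient first."""
    ordered = sorted(records, key=tokens_per_completed_task)
    return [r.engineer for r in ordered]


# Invented sample data: engineer "a" burns tokens on trivial agent runs.
RECORDS = [
    UsageRecord("a", 9_000_000, 3),
    UsageRecord("b", 1_200_000, 12),
    UsageRecord("c", 400_000, 5),
]

if __name__ == "__main__":
    print("By tokens:  ", token_leaderboard(RECORDS))
    print("By outcomes:", outcome_ranking(RECORDS))
```

In this made-up sample the engineer at the top of the token leaderboard lands at the bottom of the outcome ranking, which is the Goodhart's Law pattern in miniature. Any outcome metric can be gamed too, so in practice it would be paired with others such as error rates or review results.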

What to watch

• Whether other large AI spenders (reporting already names Walmart, Cisco, and Meta) formalize per-team token budgets or usage caps similar to Uber's.
• Adoption of model-routing or capability-routing tools that steer low-complexity queries to cheaper models.
• Whether analysts revise AI capex demand estimates if more companies disclose that reported "usage" partly reflected gamed internal metrics rather than organic productivity gains.

Editorial analysis

This is a textbook Goodhart's Law failure: incentive systems built to encourage AI adoption instead optimized for the metric itself. The bigger open question is how much of the AI industry's broader usage and capex narrative rests on similarly ungoverned internal metrics, a concern serious enough that a Wall Street analyst is now raising it publicly rather than treating it as an internal HR curiosity.

Key Points

1. Amazon killed its KiroRank token leaderboard after employees ran trivial tasks through its MeshClaw agent tool solely to inflate usage rankings.
2. Uber exhausted its entire 2026 AI token budget in four months at $500 to $2,000 per engineer monthly, prompting a new $1,500 cap.
3. An analyst warned gamed usage metrics undermine the demand signals behind AI capex nearing $700 billion, illustrating Goodhart's Law in enterprise AI adoption.

Scoring Rationale

Well-corroborated, multi-company story (Amazon, Uber, Meta, Walmart, Cisco all implicated) with a genuinely systemic angle: a Wall Street analyst publicly questioning whether gamed internal usage metrics inflate the demand signals behind ~$700B in combined 2026 AI capex. Bumped from prior calibration given this broader capex-thesis implication, though it remains an operational/governance story rather than a technical or regulatory event.

Sources

Public references used for this report.

13 sources; the first three are listed below.

fortune.com: 'That doesn't sound very healthy': Amazon's reported tokenmaxxing might gamify AI usage, analyst warns

businessinsider.com: Token Reckoning: Amazon and Uber Reassess AI Investments

wsj.com: Corporate America Is Starting to Ration AI as Cost Skyrockets
