Amazon and Uber Reassess AI Token Usage
Amazon and Uber are pulling back from indiscriminate AI consumption after internal leaderboards and runaway bills exposed the cost of unchecked usage. According to the Financial Times and reporting cited by TheStreet and Business Insider, Amazon deprecated an employee-built token-usage leaderboard called KiroRank; Business Insider reports Dave Treadwell, Amazon senior vice president of engineering, told staff, "Please don't use AI just for the sake of using AI." Fortune and Investing.com report Uber exhausted its 2026 token budget in the first four months, driven in part by heavy use of Anthropic's Claude Code. The Wall Street Journal reports executives across industries are starting to ration AI as compute and model-access costs rise. Broader coverage in Fortune, CNET, and Yahoo Finance frames this as a retreat from "tokenmaxxing" toward tighter cost and ROI discipline.
What happened
Amazon deprecated an employee-created AI token-usage leaderboard called KiroRank, according to reporting by the Financial Times and subsequent coverage in TheStreet and Business Insider. Business Insider reports that Dave Treadwell, Amazon senior vice president of engineering, told staff, "Please don't use AI just for the sake of using AI." Fortune and Investing.com report that Uber exhausted its 2026 "token budget" within the first four months of the year, driven by extensive use of Anthropic's Claude Code. The Wall Street Journal reports that corporate executives across sectors are beginning to ration AI usage as compute and model costs accelerate.
Technical details
Editorial analysis - technical context: In public coverage, "tokens" are described as the fundamental billing units for large language models. Agentic or automated workflows can multiply token consumption by orders of magnitude compared with simple prompt/response interactions, which raises cloud and model bills accordingly. Reporting cites companies moving to restrict access to higher-cost models and to cancel or limit some third-party subscriptions. Salesforce CEO Marc Benioff is quoted in Fortune saying he wants a "smart router" to decide which queries need the most capable models, an example of the routing approach companies are pursuing to control costs.
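To make the multiplication effect concrete, here is a minimal Python sketch of per-token billing arithmetic. It is an illustration only: the prices, token counts, step count, and the names `Pricing`, `call_cost`, and `agent_run_tokens` are all assumptions and do not come from the reporting or from any vendor's rate card. It also assumes an agent re-sends its full, growing transcript on every step with no caching or truncation, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical per-million-token prices in USD (not real vendor rates)."""
    input_per_million: float
    output_per_million: float


def call_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Cost of a set of calls, billing input and output tokens separately."""
    return (input_tokens * pricing.input_per_million
            + output_tokens * pricing.output_per_million) / 1_000_000


def agent_run_tokens(base_context: int, output_per_step: int, steps: int) -> tuple[int, int]:
    """Total (input, output) tokens for an agent that re-sends its growing
    transcript on every step. Assumes no caching or context truncation."""
    total_in = 0
    total_out = 0
    context = base_context
    for _ in range(steps):
        total_in += context
        total_out += output_per_step
        context += output_per_step  # prior output becomes part of the next input
    return total_in, total_out


PRICING = Pricing(input_per_million=3.0, output_per_million=15.0)  # assumed values

# One prompt/response exchange versus a 40-step agent loop (assumed sizes).
single_in, single_out = agent_run_tokens(2_000, 500, steps=1)
agent_in, agent_out = agent_run_tokens(2_000, 500, steps=40)

single_cost = call_cost(single_in, single_out, PRICING)
agent_cost = call_cost(agent_in, agent_out, PRICING)

print(f"single call: {single_in + single_out:,} tokens, ${single_cost:.4f}")
print(f"agent run:   {agent_in + agent_out:,} tokens, ${agent_cost:.2f}")
print(f"cost multiple: {agent_cost / single_cost:.0f}x")
```

Under these assumed numbers the agent run consumes 490,000 tokens against 2,500 for the single exchange. Input tokens dominate because the context is re-billed at every step, so the total grows roughly with the square of the step count rather than linearly.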
Context and significance
Journalistic accounts frame the phenomenon as a shift away from "tokenmaxxing" (internal incentives and leaderboards that rewarded raw token consumption) toward cost-aware governance and measurement of AI ROI. Commentators reference Goodhart's Law: once token usage became a target, it stopped being a reliable proxy for productivity, producing perverse incentives such as spinning up agents for token generation rather than business outcomes. The Wall Street Journal and Fortune place these developments in a broader trend of CFO and procurement scrutiny as model access grows into a material line-item on IT budgets.
What to watch
For practitioners, four signals are worth tracking. First, vendor billing and per-token pricing changes, and the price spread between small, medium, and large model tiers. Second, internal governance shifts such as removal of usage leaderboards, tiered access controls, or explicit "token budgets" per team, which reporters have documented at multiple firms. Third, adoption of model-routing or capability-routing solutions (the "smart router" idea) that steer simple queries to cheaper models and reserve high-cost models for tasks that require them. Fourth, measurable ROI metrics beyond token counts, such as workflow throughput, error rates, and end-to-end cycle time, which industry reporting highlights as missing from early token-centric programs.
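The second and third signals, per-team budgets and model routing, can be combined in a few lines. The Python sketch below is a deliberately naive illustration, not a description of any company's or vendor's router: the tier names, prices, task labels, and the names `TeamBudget`, `HARD_TASKS`, and `route` are all hypothetical, and a real system would need a far better way to judge task difficulty than a fixed label set.

```python
from dataclasses import dataclass

# Hypothetical tiers with blended prices in USD per million tokens (not real rates).
TIER_PRICE = {"small": 0.5, "large": 10.0}

# Hypothetical task labels that are assumed to need the most capable model.
HARD_TASKS = {"multi_file_refactor", "architecture_review"}


@dataclass
class TeamBudget:
    """Tracks one team's spend against a fixed dollar cap."""
    limit_usd: float
    spent_usd: float = 0.0

    def remaining(self) -> float:
        return self.limit_usd - self.spent_usd


def estimate_cost(tier: str, tokens: int) -> float:
    """Estimated cost in USD of sending `tokens` to the given tier."""
    return tokens * TIER_PRICE[tier] / 1_000_000


def route(task_kind: str, est_tokens: int, budget: TeamBudget) -> str:
    """Pick a tier for a request and charge the team's budget.

    Hard tasks prefer the large tier but fall back to the small tier when the
    budget cannot absorb the estimate. Raises RuntimeError when even the small
    tier is unaffordable.
    """
    preferred = "large" if task_kind in HARD_TASKS else "small"
    for tier in (preferred, "small"):
        cost = estimate_cost(tier, est_tokens)
        if cost <= budget.remaining():
            budget.spent_usd += cost
            return tier
    raise RuntimeError("token budget exhausted")


budget = TeamBudget(limit_usd=1.0)
first = route("autocomplete", 10_000, budget)          # cheap task
second = route("multi_file_refactor", 50_000, budget)  # hard task, budget allows it
third = route("multi_file_refactor", 50_000, budget)   # hard task, budget now too low
print(first, second, third, f"remaining=${budget.remaining():.2f}")
```

In this run the first request goes to the small tier, the second to the large tier, and the third is downgraded to the small tier because the remaining budget no longer covers the large-tier estimate. Whether a silent downgrade or an outright refusal is the better policy is a governance decision the sketch does not try to settle.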
Editorial analysis: These adjustments are consistent with past technology adoption cycles where initial experimentation is encouraged widely, then retrenchment follows when unit economics and governance gaps become visible. For practitioners, the immediate operational task is not merely to reduce tokens but to instrument outcomes so that model consumption maps to measurable business impact rather than vanity usage.
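One way to picture the difference between counting tokens and instrumenting outcomes is to rank the same teams by each measure. The Python sketch below uses invented figures for two hypothetical teams; the `WorkflowStats` fields and the `cost_per_good_outcome` metric are assumptions chosen for illustration, not metrics named in the coverage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowStats:
    """Aggregates for one team over one period. All values are invented."""
    team: str
    tokens_used: int
    spend_usd: float
    tasks_completed: int
    tasks_reworked: int  # completed tasks that later needed a fix


def cost_per_good_outcome(stats: WorkflowStats) -> float:
    """Spend divided by tasks that were completed and did not need rework."""
    good = stats.tasks_completed - stats.tasks_reworked
    if good <= 0:
        return float("inf")  # spend produced nothing usable
    return stats.spend_usd / good


teams = [
    WorkflowStats("A", tokens_used=900_000_000, spend_usd=4_500.0,
                  tasks_completed=300, tasks_reworked=120),
    WorkflowStats("B", tokens_used=200_000_000, spend_usd=1_200.0,
                  tasks_completed=150, tasks_reworked=30),
]

# A token leaderboard rewards volume; an outcome metric rewards usable results.
by_tokens = sorted(teams, key=lambda s: s.tokens_used, reverse=True)
by_outcome = sorted(teams, key=cost_per_good_outcome)

print("token leaderboard:", [s.team for s in by_tokens])
print("cost per good outcome:",
      [(s.team, cost_per_good_outcome(s)) for s in by_outcome])
```

With these invented figures team A tops the token leaderboard, while team B delivers a usable result for $10 against team A's $25. The two rankings diverge, which is the Goodhart's Law problem in miniature.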
Reported sources and limits
What is reported in the coverage is a set of observable actions and quoted statements: the KiroRank shutdown (Financial Times, TheStreet, Business Insider), the Dave Treadwell quote (Business Insider, TheStreet), Uber's reported early budget exhaustion (Fortune, Investing.com), and broader reporting on rationing and cost pressure (The Wall Street Journal, Fortune, CNET, Yahoo Finance). Several outlets also report cancellations or restrictions around Claude Code subscriptions and internal leaderboards at other firms. None of the cited coverage provides an audited cross-company accounting of AI spend, and public reports do not establish a single industrywide dollar figure for the phenomenon.
Scoring Rationale
The story marks a notable industry shift from unconstrained experimentation to cost discipline, which affects budgeting, procurement, and model-selection practices across AI teams. It is not frontier-model news but has material operational impact for practitioners.