What happened
PYMNTS, citing the Financial Times, reports that internal leaderboards tracking token consumption prompted some employees at Amazon to use the company's internal AI tool MeshClaw to delegate tasks to agents and boost their token totals, a practice the Financial Times calls "tokenmaxxing." PYMNTS reports similar incentives at Meta, and quotes an FT-sourced employee: "Managers are looking at it." PYMNTS frames tokens as a poor metric for value and attributes billing and forecasting difficulties to token-based consumption models. PYMNTS also reports that Salesforce tested $2-per-conversation pricing for Agentforce in late 2024, logging 5,000 deals in its first two quarters under that model, of which only 3,000 were paid, and that Salesforce currently measures work in Agentic Work Units (AWUs).
Editorial analysis - technical context
Metrics that reward raw model calls or token volume create optimisation pressure on users. A common industry pattern is that when consumption is the measured KPI, teams prioritize activity that increases the tracked metric rather than downstream outcomes. This pattern surfaces technical issues such as inefficient prompting, agentic workflow leakage, and chains of tool calls that are harder to audit.
Industry context
For finance and procurement teams, token-based billing converts engineering choices into direct cost drivers. Organizations shifting from fixed-license SaaS to per-call model economics face forecasting gaps and procurement friction, as reported by PYMNTS in its Salesforce example. Vendors have been experimenting with unit definitions that better align cost with discrete business outcomes, as illustrated by Salesforce's move to AWUs.
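To make concrete how engineering choices become cost drivers, here is a minimal sketch comparing a token-based bill with a unit-based bill for the same workload. Every price, token count, and volume in it is a hypothetical value chosen for illustration, and the function names `token_cost` and `unit_cost` are invented here; none of them come from PYMNTS, Salesforce, or any vendor's price list.

```python
def token_cost(input_tokens, output_tokens, price_in_per_1k, price_out_per_1k):
    """Cost of one call under token-based billing (prices per 1,000 tokens)."""
    return (input_tokens / 1000) * price_in_per_1k + (output_tokens / 1000) * price_out_per_1k


def unit_cost(units, price_per_unit):
    """Cost under unit-based billing: depends only on the number of billed units."""
    return units * price_per_unit


# All numbers below are hypothetical, chosen only to illustrate the contrast.
PRICE_IN, PRICE_OUT = 0.002, 0.008   # assumed USD per 1,000 tokens
PRICE_PER_TASK = 0.05                # assumed USD per completed task
TASKS_PER_MONTH = 1_000

# Two implementations of the same task: a lean prompt and a verbose agent chain.
lean_monthly = TASKS_PER_MONTH * token_cost(2_000, 500, PRICE_IN, PRICE_OUT)
verbose_monthly = TASKS_PER_MONTH * token_cost(8_000, 2_000, PRICE_IN, PRICE_OUT)
unit_monthly = unit_cost(TASKS_PER_MONTH, PRICE_PER_TASK)

print(f"token-based, lean:    ${lean_monthly:.2f}")
print(f"token-based, verbose: ${verbose_monthly:.2f}")
print(f"unit-based (either):  ${unit_monthly:.2f}")
```

Under these assumed numbers, the token-based bill changes fourfold when the implementation changes, while the unit-based bill stays the same. That sensitivity to implementation detail is the forecasting gap described above.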
What to watch
Monitor whether more vendors publish alternative billing units (task- or outcome-based metrics), whether industry consortia recommend standard usage units, and whether enterprises adopt internal guardrails to separate valuable agent outcomes from metric-driven noise. For practitioners: track both token consumption and outcome metrics when evaluating agentic systems.
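As a rough illustration of tracking token consumption alongside outcomes, the sketch below aggregates per-user token totals and accepted outcomes, then derives tokens per accepted outcome. The `AgentRun` record, its fields, and the sample data are all hypothetical; a real system would need its own definition of what counts as an accepted outcome.

```python
from dataclasses import dataclass


@dataclass
class AgentRun:
    """One agent run; all fields are hypothetical log fields."""
    user: str
    tokens: int        # total tokens consumed by the run
    outcome_ok: bool   # whether the run produced an accepted outcome


def summarize(runs):
    """Return per-user token totals, accepted outcomes, and tokens per outcome."""
    summary = {}
    for run in runs:
        entry = summary.setdefault(run.user, {"tokens": 0, "outcomes": 0})
        entry["tokens"] += run.tokens
        entry["outcomes"] += int(run.outcome_ok)
    for entry in summary.values():
        # None marks users who consumed tokens without any accepted outcome.
        entry["tokens_per_outcome"] = (
            entry["tokens"] / entry["outcomes"] if entry["outcomes"] else None
        )
    return summary


# Made-up sample data for illustration only.
runs = [
    AgentRun("alice", 1_200, True),
    AgentRun("alice", 800, True),
    AgentRun("bob", 50_000, False),
    AgentRun("bob", 30_000, True),
]
report = summarize(runs)
for user, stats in report.items():
    print(user, stats)
```

In this made-up data, a token leaderboard would rank "bob" first, while the tokens-per-outcome column shows "alice" producing more accepted outcomes with far fewer tokens. Reporting both columns side by side is one way to separate useful agent work from metric-driven noise.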
Key Points
- WHAT: Internal leaderboards tied to token consumption drove token-inflating behaviour, reducing correlation with business value.
- WHY: Token-based pricing shifts technical decisions into finance territory, creating forecasting and auditing challenges for enterprises.
- SO WHAT: Vendors and buyers are exploring alternative units, like task-based AWUs, to align billing with discrete agent outcomes.
Scoring Rationale
The story matters to practitioners who manage cost, procurement, and observability for AI systems. It highlights real enterprise pain with token-based billing and vendor responses, but it is not a frontier technical breakthrough.

