Cheaper Tokens Drive Higher AI Token Spending

The Silicon Data Token Expenditure Index, which tracks aggregate spending on large language model usage, has roughly doubled since late 2025 while the price per token has fallen about 90% since 2023, according to a June 12 presentation by Torsten Slok at Apollo Global Management (Apollo). Reporting reproduced by Edward Conard and others cites the same index. Analysts and commentators frame the pattern as an instance of the Jevons paradox: lower unit costs prompt much higher aggregate consumption. Industry coverage also highlights related signs in the inference economy, including a larger share of compute devoted to inference and high bills at some corporations, as reported by BusinessEngineer and Technology.org.
What Happened
Per a June 12 slide deck by Torsten Slok at Apollo Global Management, the Silicon Data Token Expenditure Index has roughly doubled since late 2025, even as the price of a single token has fallen by about 90% since 2023. Edward Conard's summary and other outlets reproduce the same index finding. Technology.org and other trade coverage report anecdotal examples of runaway bills, including claims that some companies spent unusually large portions of their 2026 AI budgets on inference usage.
Technical Context
Industry reporting frames this as an instance of the Jevons paradox, an observation from classical economics that improvements in efficiency or lower unit costs can increase total consumption. Business-oriented coverage, including a BusinessEngineer roundup, emphasizes that the market is shifting from episodic, training-dominated spending to continuous, query-driven inference expenditures, and cites Deloitte and market research for the claim that inference now represents a much larger share of AI compute and life-cycle cost.
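The arithmetic behind the paradox can be made concrete with a short sketch. Spending equals price per token times tokens consumed, so for spending to rise while price falls, volume must grow by more than the price declined. The function name and inputs below are illustrative assumptions. The reported 90% price decline (since 2023) and the doubling of the index (since late 2025) cover different windows, so combining them here shows the mechanism rather than an estimate of actual token volume.

```python
def implied_volume_multiple(spend_multiple: float, price_decline: float) -> float:
    """Return the factor by which token volume must change, given
    spend = price_per_token * tokens.

    spend_multiple: new spend / old spend (2.0 means spend doubled)
    price_decline:  fractional drop in unit price (0.90 means a 90% drop)
    """
    if not 0 <= price_decline < 1:
        raise ValueError("price_decline must be in [0, 1)")
    price_multiple = 1.0 - price_decline  # new price / old price
    return spend_multiple / price_multiple


# Hypothetical inputs: spend doubles while unit price falls 90%.
# These mirror the headline figures but are NOT measured over the same period.
volume_multiple = implied_volume_multiple(spend_multiple=2.0, price_decline=0.90)
print(f"Implied token volume multiple: about {volume_multiple:.0f}x")
```

Under these assumed inputs, volume would need to rise roughly twentyfold. Any volume multiple above 1 / (1 - price_decline) means total spend grows despite cheaper tokens, which is the pattern the coverage labels as the Jevons paradox.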
Context and Significance
Companies and platforms that monetize per token or per query are operating in an environment where lower unit pricing does not automatically reduce vendor revenue or customer bills. Reporting highlights two structural factors: the marginal-cost nature of inference billing, in which every additional query adds to the bill, and the rapid growth of automated agents and production workflows that scale token consumption. BusinessEngineer cites Deloitte and Fortune Business Insights for metrics on the growing inference market, and Apollo cites Bloomberg/Macrobond data underlying the Token Expenditure Index.
What to Watch
Observers should monitor three indicators over the coming quarters: the trajectory of the Silicon Data Token Expenditure Index or comparable industry metrics reported by Bloomberg/Macrobond; corporate disclosure of inference spending or anomalous line-item overruns in IT and R&D budgets, as highlighted in trade reporting; and vendor pricing changes, bundled offers, or new metering models from major API providers that could alter per-query incentives. For practitioners, the immediate operational implication is a renewed emphasis on cost observability and governance at production scale: stronger telemetry on token consumption per workflow, configurable throttles for agent fleets, and chargeback models to prevent surprise spend.
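One way to picture the three practitioner controls named above (per-workflow telemetry, throttles, and chargeback) is a small in-memory ledger. This is a minimal sketch under stated assumptions, not a reference to any vendor's API: the `TokenLedger` class, the workflow names, the token caps, and the blended price per million tokens are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TokenLedger:
    """Hypothetical per-workflow token ledger with caps and chargeback."""
    price_per_million: float                  # assumed blended USD price per 1M tokens
    budgets: Dict[str, int]                   # workflow name -> token cap for the period
    used: Dict[str, int] = field(default_factory=dict)

    def record(self, workflow: str, tokens: int) -> bool:
        """Record usage (telemetry). Return False, recording nothing,
        if the call would push the workflow past its cap (throttle)."""
        current = self.used.get(workflow, 0)
        cap = self.budgets.get(workflow)      # None means no cap configured
        if cap is not None and current + tokens > cap:
            return False
        self.used[workflow] = current + tokens
        return True

    def chargeback(self) -> Dict[str, float]:
        """Attribute cost to each workflow from its recorded tokens."""
        return {
            name: tokens / 1_000_000 * self.price_per_million
            for name, tokens in self.used.items()
        }


# Usage with made-up numbers: one capped agent workflow, one uncapped batch job.
ledger = TokenLedger(price_per_million=2.0, budgets={"support-agent": 1_000_000})
first_ok = ledger.record("support-agent", 600_000)    # within the cap
second_ok = ledger.record("support-agent", 500_000)   # would exceed the cap
ledger.record("nightly-summaries", 250_000)           # no cap configured
costs = ledger.chargeback()
```

In this sketch the second call is rejected rather than recorded, so the agent's usage stays at 600,000 tokens. A production system would more likely persist counters in a shared store and reset them per billing period; those choices are outside what the reporting describes.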
Quoted Reporting and Sourced Claims
Per BusinessEngineer, inference accounts for approximately two-thirds of AI compute in 2026, a figure it attributes to Deloitte, and inference-related markets are expanding rapidly. Technology.org reports anecdotal examples of very large corporate bills. Apollo's slide deck attributes the Token Expenditure Index construction to Bloomberg and Macrobond.
Limitations
Public coverage relies on a constructed index and secondary reporting. The Token Expenditure Index is a proxy for aggregate LLM spending and is not identical to vendor revenues. Where sources make firm claims about individual corporate spend, those are anecdotal and should be treated as such unless supported by company filings or audited disclosures.
Scoring Rationale
A well-sourced market analysis documenting the Jevons paradox effect on AI inference spending, anchored by an Apollo Global chief economist slide deck. The trend is real and practitioner-relevant but this is a secondary economic analysis piece rather than a primary product launch or research result. Solid-to-low-notable range.

