Funding & Business | inference economy | tokens | jevons paradox | cost modeling

Cheaper Tokens Drive Higher AI Token Spending

June 15, 2026 | By LDS Team

Relevance Score: 6.0

The Silicon Data Token Expenditure Index, which tracks aggregate spending on large language model usage, has roughly doubled since late 2025, while the price per token has fallen about 90% since 2023, according to a June 12 presentation by Torsten Slok at Apollo Global Management (Apollo). Reporting reproduced by Edward Conard and others cites the same index. Analysts and commentators frame the pattern as an instance of Jevons paradox: lower unit costs prompt much higher aggregate consumption. Industry coverage also highlights related signs in the inference economy, including a larger share of compute devoted to inference and high bills at some corporations, as reported by BusinessEngineer and Technology.org.

What happened

Per a June 12 slide deck by Torsten Slok at Apollo Global Management, the Silicon Data Token Expenditure Index has roughly doubled since late 2025, even as the price of a single token has fallen by about 90% since 2023. Edward Conard's summary and other outlets reproduce the same index finding. Technology.org and other trade coverage report anecdotal examples of runaway bills, including claims that some companies spent unusually large portions of their 2026 AI budgets on inference usage.

Technical Context

Industry reporting frames this as an instance of Jevons paradox, a classical economics observation that improvements in efficiency or lower unit costs can increase total consumption. Business-oriented coverage, including a BusinessEngineer roundup, emphasizes that the market is shifting from episodic, training-dominated spending to continuous, query-driven inference expenditures, and cites Deloitte and market research for the claim that inference now represents a much larger share of AI compute and life-cycle cost.
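The arithmetic behind the Jevons framing can be sketched in a few lines. This is an illustration only: it assumes spend equals price times volume and applies the two reported figures (spend roughly 2x, price down about 90%) as if they covered the same period, which they do not (the index change is measured since late 2025, the price decline since 2023). The function name is hypothetical.

```python
def implied_volume_multiplier(spend_multiplier: float, price_multiplier: float) -> float:
    """Return the token-volume multiplier implied by spend = price * volume."""
    if price_multiplier <= 0:
        raise ValueError("price_multiplier must be positive")
    return spend_multiplier / price_multiplier


# Illustrative inputs based on the reported figures. The two periods differ,
# so the result is an order-of-magnitude illustration, not a measurement.
spend_multiplier = 2.0   # index "roughly doubled"
price_multiplier = 0.10  # price "fell about 90%", i.e. 10% of the old price

volume_multiplier = implied_volume_multiplier(spend_multiplier, price_multiplier)
print(f"Implied token volume multiplier: {volume_multiplier:.0f}x")
```

Under these simplifying assumptions, total spend can only rise while unit prices fall if volume grows faster than prices decline, which is the pattern the Jevons paradox describes.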

Context and Significance

Companies and platforms that monetize per-token or per-query are operating in an environment where lower unit pricing does not automatically reduce vendor revenue or customer bills. Reporting highlights two structural factors: the marginal-cost nature of inference billing, and rapid growth of automated agents and production workflows that scale token consumption. BusinessEngineer cites Deloitte and Fortune Business Insights for metrics on the growing inference market, and Apollo cites Bloomberg/Macrobond data underlying the Token Expenditure Index.

What to Watch

Observers should monitor three indicators over the coming quarters: the trajectory of the Silicon Data Token Expenditure Index or comparable industry metrics reported by Bloomberg/Macrobond; corporate disclosure of inference spending or anomalous line-item overruns in IT and R&D budgets, as highlighted in trade reporting; and vendor pricing changes, bundled offers, or new metering models from major API providers that could alter per-query incentives. For practitioners, the immediate operational implication is a renewed emphasis on cost observability and governance at production scale: stronger telemetry on token consumption per workflow, configurable throttles for agent fleets, and chargeback models to prevent surprise spend.
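A minimal sketch of what per-workflow telemetry, throttling, and chargeback might look like, assuming a single flat price per million tokens and a fixed token budget per workflow. The class, its fields, the workflow names, and the prices are all hypothetical and are not drawn from any vendor's API or from the sources above.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class TokenLedger:
    """Tracks token usage per workflow and enforces a simple token budget."""

    price_per_million_tokens: float  # assumed flat price, in dollars
    budgets: dict[str, int]          # workflow name -> max tokens per period
    usage: dict[str, int] = field(default_factory=lambda: defaultdict(int))

    def record(self, workflow: str, tokens: int) -> bool:
        """Record usage if it fits the budget; return False to signal a throttle."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        limit = self.budgets.get(workflow)  # None means no budget is configured
        if limit is not None and self.usage[workflow] + tokens > limit:
            return False
        self.usage[workflow] += tokens
        return True

    def chargeback(self) -> dict[str, float]:
        """Return the dollar cost attributed to each workflow with recorded usage."""
        return {
            workflow: tokens / 1_000_000 * self.price_per_million_tokens
            for workflow, tokens in self.usage.items()
        }


# Hypothetical workflows and prices, for illustration only.
ledger = TokenLedger(price_per_million_tokens=2.0, budgets={"support-agent": 1_000_000})
first_ok = ledger.record("support-agent", 600_000)      # fits within the budget
second_ok = ledger.record("support-agent", 500_000)     # would exceed it, so throttled
unbudgeted_ok = ledger.record("batch-summaries", 250_000)  # no budget configured
costs = ledger.chargeback()
print(first_ok, second_ok, unbudgeted_ok, costs)
```

A production system would more likely read token counts from the provider's usage metadata and reset budgets on a schedule; the sketch only shows how the three controls named above relate to one another.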

Quoted Reporting and Sourced Claims

Per BusinessEngineer, inference accounts for approximately two-thirds of AI compute in 2026 (a figure attributed to Deloitte), and inference-related markets are expanding rapidly. Technology.org reports anecdotal examples of very large corporate bills. Apollo's slide deck attributes the Token Expenditure Index construction to Bloomberg and Macrobond.

Limitations

Public coverage relies on a constructed index and secondary reporting. The Token Expenditure Index is a proxy for aggregate LLM spending and is not identical to vendor revenues. Where sources make firm claims about individual corporate spend, those are anecdotal and should be treated as such unless supported by company filings or audited disclosures.

Key Points

1. Reported data shows the Silicon Data Token Expenditure Index has roughly doubled since late 2025 while token prices have fallen about 90% since 2023, indicating rising aggregate spend.
2. Lower per-token pricing can increase total consumption, a modern example of Jevons paradox applied to the inference economy.
3. Practitioners should prioritize token-level telemetry, spend governance, and metering strategies as inference shifts to a continuous cost center.

Scoring Rationale

A well-sourced market analysis documenting the Jevons paradox effect on AI inference spending, anchored by an Apollo Global chief economist slide deck. The trend is real and practitioner-relevant but this is a secondary economic analysis piece rather than a primary product launch or research result. Solid-to-low-notable range.

Sources

Public references used for this report.

arxiv.org: Photons = Tokens: The Physics of AI and the Economics of Knowledge

apollo.com: Cheaper Tokens, Bigger Bills

technology.org: AI Tokenmaxxing Is Dying, and Silicon Valley Knows It
