Executives Measure AI ROI Using Outcomes Not Tokens
At Mistral AI's summit in Paris, Business Insider asked four enterprise executives how they measure AI return on investment, and none started with token usage. Charles Holive, chief AI officer at BNP Paribas CIB, said, 'We try to go away from vanity metrics - billions of tokens per day,' and instead asks what teams can now do that they could not before, and how much faster. Antoine Pichot of La Banque Postale said the bank tracks employee efficiency, customer service, and value for money. Tata Consultancy Services' Amit Kapur emphasized business outcomes over token counts, and NTT DATA's Sujay Bhattacharya said customers increasingly look beyond tokens to total cost and business value. The comments reflect a broader shift, also covered by Fortune, away from 'tokenmaxxing' as a proxy for AI value.
What happened
Business Insider spoke with four executives at Mistral AI's summit in Paris about how they measure AI ROI, and none led with token usage. Charles Holive, chief AI officer at BNP Paribas CIB, told Business Insider, "We try to go away from vanity metrics - billions of tokens per day," adding, "What did you do, you didn't do before? How much faster did you do it?" Antoine Pichot, director of innovation, digital and data at La Banque Postale, told Business Insider the bank measures whether AI makes employees more efficient, improves customer service, and delivers value for money. Amit Kapur, chief AI and transformation officer at Tata Consultancy Services, told Business Insider he focuses on whether AI improves business performance rather than token consumption. Sujay Bhattacharya, executive managing director at NTT DATA, told Business Insider his customers increasingly look beyond token counts to total cost and business value.
Industry context
Enterprise teams have commonly tracked low-level usage metrics such as API calls or token volumes as an early proxy for adoption. The executive comments, and related coverage by Fortune on the decline of "tokenmaxxing," reflect a shift toward outcome-oriented KPIs, where productivity, customer outcomes, and cost-per-impact matter more than raw model consumption.
Editorial analysis - technical context
For practitioners, moving from token counts to outcome metrics typically entails instrumenting feature-level telemetry and tying model outputs to business KPIs. In similar transitions, teams commonly build measurement layers that capture latency, error rates, time saved, and automation rates alongside downstream business metrics such as conversion lift or cost-per-ticket. These are familiar implementation challenges for data and ML teams integrating LLMs into workflows.
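As a rough illustration of such a measurement layer, the sketch below rolls per-task telemetry up into feature-level outcome metrics. It is a minimal example under stated assumptions, not a description of any quoted company's setup: the `TaskEvent` fields, the `summarize` helper, the `ticket_triage` feature name, and the sample values are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median


@dataclass
class TaskEvent:
    """One AI-assisted task. All field names here are hypothetical."""
    feature: str          # product feature that invoked the model
    latency_ms: float     # end-to-end model latency for the task
    tokens: int           # tokens consumed (kept as a cost input, not a goal)
    error: bool           # True if the output was rejected or failed
    automated: bool       # True if completed with no human handoff
    minutes_saved: float  # estimate against a pre-AI baseline


def summarize(events):
    """Aggregate raw events into per-feature outcome metrics."""
    by_feature = defaultdict(list)
    for event in events:
        by_feature[event.feature].append(event)

    report = {}
    for feature, rows in by_feature.items():
        n = len(rows)
        report[feature] = {
            "tasks": n,
            "median_latency_ms": median(r.latency_ms for r in rows),
            "error_rate": sum(r.error for r in rows) / n,
            "automation_rate": sum(r.automated for r in rows) / n,
            "minutes_saved": sum(r.minutes_saved for r in rows),
            "tokens": sum(r.tokens for r in rows),
        }
    return report


# Illustrative sample data only.
events = [
    TaskEvent("ticket_triage", 800.0, 1200, False, True, 4.0),
    TaskEvent("ticket_triage", 1200.0, 1500, False, False, 1.0),
    TaskEvent("ticket_triage", 1000.0, 900, True, False, 0.0),
    TaskEvent("ticket_triage", 600.0, 400, False, True, 5.0),
]
report = summarize(events)
print(report["ticket_triage"])
```

Token volume still appears in the report, but only as one column next to automation rate and time saved, which is closer to the framing the executives describe.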
What to watch
- Whether enterprises standardize outcome KPIs for AI, such as time-to-decision, resolution time, or revenue-per-employee.
- How engineering and ML teams map model-level metrics (latency, token usage) to business outcomes in dashboards and SLAs.
- Vendor and procurement responses: whether contracts and pricing shift from token-based to outcome- or value-based models.
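One simple way to connect token usage to a business outcome, as the second and third items suggest, is to express spend per outcome rather than per token. The sketch below is illustrative only: `cost_per_outcome`, the price per 1,000 tokens, the fixed cost, and the ticket counts are assumed values, not figures from the reporting.

```python
def cost_per_outcome(total_tokens, price_per_1k_tokens, fixed_cost, outcomes):
    """Total AI spend divided by the number of business outcomes achieved.

    All inputs are hypothetical: `fixed_cost` stands in for non-token costs
    (licences, integration, review time) and `outcomes` for a business unit
    such as resolved tickets.
    """
    if outcomes <= 0:
        raise ValueError("outcomes must be positive")
    token_cost = total_tokens / 1000 * price_per_1k_tokens
    return (token_cost + fixed_cost) / outcomes


# Two illustrative months: token volume doubles, resolved tickets barely move.
month_a = cost_per_outcome(2_000_000, 0.5, 500.0, 300)
month_b = cost_per_outcome(4_000_000, 0.5, 500.0, 320)
print(month_a, month_b)
```

In this made-up example the second month looks better on a token dashboard and worse on cost per resolved ticket, which is the gap an outcome-based KPI or contract is meant to expose.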
For practitioners
The executives quoted by Business Insider illustrate a pragmatic framing: reporting and instrumentation that tie model outputs to concrete business outcomes make ROI conversations actionable. Teams planning evaluation should prioritize defining measurable business outcomes and the telemetry needed to attribute changes to AI interventions.
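A minimal sketch of the attribution step, assuming work items can be split into a control group handled without AI and a comparable AI-assisted group: compare the mean of an outcome metric, here resolution time in minutes, between the two. The `estimated_effect` helper and the sample numbers are hypothetical, and a plain difference in means supports a causal reading only if the groups are genuinely comparable, for example through random assignment.

```python
from statistics import mean


def estimated_effect(control, treated):
    """Difference in mean outcome between AI-assisted and control groups.

    Returns the absolute change and the change relative to the control mean.
    This is a naive estimate: it does not adjust for confounders or report
    uncertainty.
    """
    if not control or not treated:
        raise ValueError("both groups need at least one observation")
    baseline = mean(control)
    diff = mean(treated) - baseline
    return {"absolute": diff, "relative": diff / baseline}


# Illustrative resolution times in minutes.
control_minutes = [30, 34, 26, 30]   # handled without AI assistance
treated_minutes = [24, 22, 26, 24]   # handled with AI assistance
effect = estimated_effect(control_minutes, treated_minutes)
print(effect)
```

In practice a team would want larger samples and a confidence interval before reporting a figure like this as ROI.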
Key Points
1. Enterprise leaders prioritize outcome metrics - efficiency, customer service, and business performance - over raw token-usage counts.
2. The shift from 'tokenmaxxing' to outcome-based measurement is driven by skepticism that token volume proxies real AI value.
3. For practitioners, outcome-based ROI requires instrumenting business telemetry and attribution paths that tie model outputs to measurable results.
Scoring Rationale
A summit roundup of how enterprise leaders measure AI ROI, anchored on named executives and verbatim quotes reported by Business Insider, with the broader 'tokenmaxxing as a vanity metric' theme also covered by Fortune. It is practitioner-relevant industry commentary rather than a product, research, or funding event, so it sits in the mid-range.