Jefferies Says Cheaper AI Models Boost Infrastructure Demand

Relevance Score: 5.8

Jefferies, in a research note reported by ANI on Jun 26, 2026, argues that the emergence of lower-cost AI models is unlikely to reduce AI investment and could instead increase demand for computing infrastructure. The brokerage highlights the Chinese model GLM-5.2 from Z.ai as delivering performance close to leading enterprise models at lower operating cost, and describes the past week as another "DeepSeek moment." The report also states that falling token costs are encouraging more firms to run inference on local servers rather than in the public cloud, and that OpenRouter usage shows rapid uptake of lower-cost Chinese models. The note invokes Jevons Paradox to explain why lower per-unit inference cost could raise overall compute consumption, and it explicitly flags memory chip makers, especially DRAM suppliers, as beneficiaries. Jefferies is quoted as saying there is "zero sign of AI capex slowing."

What happened

Jefferies' research note, summarized by ANI on Jun 26, 2026, argues that the arrival of lower-cost AI models is more likely to expand overall infrastructure demand than to slow AI investment. The note highlights the Chinese model GLM-5.2 from Z.ai, saying it delivers performance close to leading enterprise models at a substantially lower operating cost. The report says, "The past week has seen another DeepSeek moment," and states that "GLM-5.2 proves enterprises no longer have to sacrifice intelligence for privacy. We are seeing a massive acceleration in companies pulling their AI workloads out of the public cloud and back onto local corporate servers." It also cites a rising usage share for Chinese models on OpenRouter and explicitly invokes Jevons Paradox to argue that lower inference costs could increase aggregate compute consumption. The note concludes there is "zero sign of AI capex slowing," and identifies memory chip makers, notably DRAM suppliers, as key beneficiaries. Jefferies added South Korean memory maker SK Hynix and Japanese flash memory company Kioxia to its model portfolios, increased its weighting in Samsung Electronics, and reduced exposure to internet companies such as Alphabet and Alibaba.

Editorial analysis - technical context

Lower per-inference costs for base and open models typically shift the cost-performance trade-offs that determine where inference runs. A common industry pattern is that when inference becomes cheaper, organisations expand use cases and increase throughput, which can raise both peak and aggregate compute demand. That pattern tends to benefit bandwidth- and capacity-sensitive components such as DRAM and high-bandwidth memory, because larger context windows, larger batch sizes, and more parallel inference raise memory footprint and bandwidth needs.
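
The Jevons Paradox argument can be made concrete with a small worked sketch. The Python below is illustrative only and does not come from the Jefferies note: it assumes a constant-elasticity demand curve, assumes cost per token scales in proportion to compute per token, and uses hypothetical elasticity values. The function names are invented for this example.

```python
def token_demand_growth(cost_ratio: float, elasticity: float) -> float:
    """Growth factor in tokens demanded under constant-elasticity demand.

    cost_ratio: new cost per token divided by old cost per token
                (0.25 means inference became 75% cheaper).
    elasticity: assumed price elasticity of demand, as a positive number.
                This is a hypothetical input, not a measured value.
    """
    if cost_ratio <= 0:
        raise ValueError("cost_ratio must be positive")
    # Q_new / Q_old = (P_new / P_old) ** (-elasticity)
    return cost_ratio ** (-elasticity)


def aggregate_compute_growth(cost_ratio: float, elasticity: float) -> float:
    """Growth factor in total compute consumed.

    Assumption: cost per token is proportional to compute per token, so
    total compute = tokens demanded * compute per token.
    """
    return token_demand_growth(cost_ratio, elasticity) * cost_ratio


# Compare three hypothetical elasticities for a 75% cut in cost per token.
for elasticity in (0.5, 1.0, 1.5):
    tokens = token_demand_growth(0.25, elasticity)
    compute = aggregate_compute_growth(0.25, elasticity)
    print(f"elasticity={elasticity}: tokens x{tokens:.2f}, compute x{compute:.2f}")
```

Under these assumptions, aggregate compute rises only when the elasticity exceeds 1: with a 75% cost cut, an assumed elasticity of 1.5 doubles total compute, while 0.5 halves it. Whether real enterprise demand for inference is that elastic is the empirical question behind the Jefferies thesis.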

Industry context

Reuters commentary from Dec 2025, citing market research and analyst desks, described AI-driven memory demand as the strongest since the 1990s PC boom and reported sharp DRAM price increases in periods of tight supply. Read against that backdrop, Jefferies' note points to persistent demand-side pressure on memory pricing and capacity, which market participants track when sizing data-centre and chip investments.

For practitioners

Monitor three indicators: on-premise inference adoption rates and benchmarked inference costs; OpenRouter and similar routing-layer usage statistics showing model share shifts; and DRAM pricing and lead times from major suppliers. Industry observers will watch whether lower inference costs translate into new production orders or higher utilisation of existing fleets, which affects procurement timing and lifecycle planning for accelerators and memory subsystems.

What to watch

Jefferies' assertion that AI capex shows "zero sign" of slowing is a high-level market view; follow hyperscaler capital-spend disclosures, vendor order books, and public cloud instance mix to validate whether the on-prem shift and increased memory demand materialise at scale.

Key Points

  • Jefferies says lower-cost models like GLM-5.2 reduce operating cost, encouraging broader enterprise deployment and on-prem inference.
  • Lower per-inference cost can raise aggregate compute demand (Jevons Paradox), increasing pressure on memory capacity and DRAM pricing.
  • Practitioners should track on-prem adoption, OpenRouter model-share trends, and DRAM price/lead-time signals to anticipate infrastructure needs.

Scoring Rationale

A broker research note arguing cheaper AI models (GLM-5.2) paradoxically boost infrastructure demand, with specific portfolio moves flagging SK Hynix and Kioxia. Relevant market intelligence for practitioners sizing infra investments, but this is one analyst firm's view on a market trend rather than a primary news event or product launch.
