Amazon Considers Qualcomm AI200 Chips for AWS

A Wells Fargo research note reported by Wccftech says Amazon Web Services (AWS) could become the lead hyperscale customer for Qualcomm's AI200 inference chip, which supports up to 768GB of memory per card and is slated for a 2026 commercial rollout. Wells Fargo models the AI200's deployment economics at roughly $3.5 billion per gigawatt and estimates an illustrative $2.50 earnings-per-share lift for Qualcomm if accelerator density per rack increases. The bank cites comments from Qualcomm CEO Cristiano Amon and frames the deal as part of AWS's broader push to cut per-token inference costs, though neither AWS nor Qualcomm has confirmed anything; the claim currently rests on one analyst's reading of public comments.
For AI infrastructure teams, this note is a live example of how hyperscalers now shop for inference chips: not on raw FLOPS, but on memory capacity per accelerator and effective cost per token, the exact metrics Wells Fargo uses to argue Qualcomm's AI200 could win AWS as a customer.
What happened
Wccftech reports on a Wells Fargo research note arguing that Qualcomm could deepen its AI chip relationship with AWS around the AI200 accelerator. Qualcomm's own October 2025 launch announcement confirms the AI200 supports up to 768GB of LPDDR memory per card, is built for rack-scale large-language-model inference, and is expected to be commercially available in 2026 (a follow-on chip, the AI250, is slated for 2027 with a different near-memory architecture). Wells Fargo, in the note as reported by Wccftech, estimates AI200 deployment costs of about $3.5 billion per gigawatt and models a $2.50 earnings-per-share benefit for Qualcomm if it increases accelerators per rack. The bank states, based on "company comments" and its own analysis, that it sees AWS as "the potential lead hyperscale ASIC partner," pointing to comments from Qualcomm CEO Cristiano Amon and to AWS's existing use of Qualcomm's earlier AI100 Ultra chip.
Technical context
The AI200's headline differentiator is memory capacity: 768GB per card lets a single accelerator hold larger models in memory, reducing the cross-chip communication that typically slows multi-chip inference. That plays directly into the industry's shift toward per-token pricing, where hardware efficiency (tokens served per dollar) has become as important as peak throughput. The same economics are driving interest in inference-specialized chips like Groq's.
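As a rough illustration of why per-card capacity matters, the sketch below estimates how many 768GB cards a model's weights would occupy. Only the 768GB figure comes from the article; the parameter counts, bytes-per-parameter values, and the 20% overhead allowance for KV cache and activations are illustrative assumptions, not Qualcomm or Wells Fargo figures.

```python
import math

CARD_MEMORY_GB = 768  # per-card capacity cited in the article


def weights_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate weight footprint in GB, taking 1 GB as 1e9 bytes."""
    # (params_billions * 1e9 params) * bytes_per_param / 1e9 bytes per GB
    return params_billions * bytes_per_param


def cards_needed(
    params_billions: float,
    bytes_per_param: float,
    overhead_fraction: float = 0.2,  # assumed headroom for KV cache, activations
    card_gb: float = CARD_MEMORY_GB,
) -> int:
    """Minimum number of cards whose combined memory holds the model."""
    total_gb = weights_gb(params_billions, bytes_per_param) * (1 + overhead_fraction)
    return math.ceil(total_gb / card_gb)


# Hypothetical model sizes, for illustration only.
for size_b, precision_bytes in [(70, 2), (405, 2), (405, 1)]:
    n = cards_needed(size_b, precision_bytes)
    print(f"{size_b}B params at {precision_bytes} byte(s)/param: {n} card(s)")
```

Under these assumptions, a model that fits on one card avoids cross-chip traffic for its weights entirely; once the estimate exceeds one card, the model must be sharded and inter-card communication re-enters the picture.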
For practitioners
Treat this as a single analyst's read of public signals, not a confirmed deal: neither AWS nor Qualcomm has announced a partnership, and the $3.5B-per-gigawatt and $2.50-EPS figures are Wells Fargo's own modeling, not disclosed company guidance. If you're evaluating inference hardware, the more durable signal is the metric Wells Fargo is using to judge winners, effective dollar-per-token cost at rack scale, rather than the specific vendor pairing. Expect more speculative sourcing like this as hyperscalers stay quiet about custom-silicon roadmaps while the market prices in who wins the ASIC business.
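For teams that want to apply the dollar-per-token lens to their own hardware comparisons, a minimal sketch of the arithmetic follows. It amortizes hardware cost over a fixed period, adds energy cost, and divides by tokens actually served. The function name, the cost model, and every input value here are hypothetical placeholders, not numbers from Wells Fargo, AWS, or Qualcomm.

```python
HOURS_PER_YEAR = 365 * 24


def cost_per_million_tokens(
    capex_usd: float,            # purchase cost of the rack (placeholder input)
    amortization_years: float,   # straight-line depreciation period
    power_kw: float,             # average rack power draw
    usd_per_kwh: float,          # electricity price
    tokens_per_second: float,    # peak sustained throughput for the rack
    utilization: float = 0.6,    # assumed fraction of peak actually served
) -> float:
    """Effective USD per one million tokens served, under a simple cost model."""
    if tokens_per_second <= 0 or utilization <= 0:
        raise ValueError("throughput and utilization must be positive")
    capex_per_hour = capex_usd / (amortization_years * HOURS_PER_YEAR)
    energy_per_hour = power_kw * usd_per_kwh
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return (capex_per_hour + energy_per_hour) / tokens_per_hour * 1_000_000


# Two hypothetical racks with made-up inputs, compared on the same metric.
rack_a = cost_per_million_tokens(3_000_000, 4, 120, 0.08, 400_000)
rack_b = cost_per_million_tokens(2_000_000, 4, 150, 0.08, 250_000)
print(f"rack A: ${rack_a:.4f} per 1M tokens")
print(f"rack B: ${rack_b:.4f} per 1M tokens")
```

The model deliberately omits cooling, networking, facility, and staffing costs; the point is only that a cheaper or lower-power rack can still lose on this metric if its served throughput is lower.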
What to watch
Watch for confirmation or denial from AWS or Qualcomm, any AWS earnings-call or re:Invent commentary on custom AI accelerators, and independent benchmark data comparing the AI200's real-world dollar-per-token performance to Nvidia GPUs, AWS's own Trainium and Inferentia chips, and Groq. Qualcomm's AI250, with its near-memory architecture, is due in 2027 and could shift this calculus again.
Key Points
- A Wells Fargo note reported by Wccftech says AWS could become Qualcomm's lead hyperscale customer for its AI200 inference chip.
- Qualcomm's AI200 differentiates on memory capacity, 768GB per card, which can lower cost per token during large-model inference at scale.
- Neither AWS nor Qualcomm has confirmed a deal, so the specific EPS and cost figures remain one analyst's modeling, not settled fact.
Scoring Rationale
This story rests entirely on one Wells Fargo analyst note interpreting public comments; neither AWS nor Qualcomm has confirmed any deal, so it is scored as an unconfirmed signal rather than a done deal (down from 6.8). The underlying AI200 chip specs are independently confirmed via Qualcomm's own press release, and the memory-per-dollar inference economics it points to are a genuine, notable dynamic for hyperscale infrastructure buyers, which keeps it above the marginal band.
