Engram raises $98 million to cut token costs

According to CNBC, AI memory startup Engram raised $98 million from investors including General Catalyst, Kleiner Perkins, Sequoia and OpenAI co-founder Andrej Karpathy. CNBC reports that Engram, founded in October and operating with about 13 employees, says its models can match or outperform those of frontier labs while using up to 100 times fewer tokens. The company has signed customers including Microsoft, Notion and legal AI startup Harvey, and plans to use the funding for compute and talent, CNBC reports. Kleiner partner Leigh Marie Braswell is quoted describing the market problem as an "explosion of data, explosion of cost," and praising Engram's potential cost impact.
What happened
According to CNBC, Engram announced a $98 million funding round from investors including General Catalyst, Kleiner Perkins, Sequoia and OpenAI co-founder Andrej Karpathy. The startup was founded in October, has about 13 employees, and counts Microsoft, Notion and legal AI startup Harvey among its customers, CNBC reports. Engram claims its models can match or outperform those of frontier labs while using up to 100 times fewer tokens, and it plans to use the funding for compute and talent, according to CNBC. CNBC quotes Kleiner partner Leigh Marie Braswell: "You've got this explosion of data, explosion of cost," and "Engram comes in and basically maps out your organization and offers orders of magnitude cheaper output."
Editorial analysis - technical context
Memory systems and retrieval techniques are a common industry response to rising inference costs because they reduce the amount of context a model must process per query. Companies building memory layers typically focus on selective retrieval, compression, or learned indexing to lower token counts per query. For practitioners, these approaches add engineering complexity and extra storage and compute in exchange for lower per-request token consumption, which can substantially reduce inference bills when providers charge per token.
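To make the selective-retrieval idea concrete, here is a minimal sketch in Python. It is not Engram's method, which CNBC does not describe. It assumes a crude word-overlap relevance score and a whitespace word count as a stand-in for a real tokenizer, and the function names and sample chunks are hypothetical. It shows how choosing only relevant chunks under a token budget shrinks the context sent to a model.

```python
def approx_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for real tokens.
    return len(text.split())


def score(query: str, chunk: str) -> int:
    # Count distinct query words that also appear in the chunk.
    return len(set(query.lower().split()) & set(chunk.lower().split()))


def select_context(query: str, chunks: list[str], token_budget: int):
    # Rank chunks by relevance, then greedily add those that fit the budget.
    ranked = sorted(chunks, key=lambda ch: score(query, ch), reverse=True)
    selected, used = [], 0
    for ch in ranked:
        if score(query, ch) == 0:
            break  # remaining chunks share no words with the query
        cost = approx_tokens(ch)
        if used + cost > token_budget:
            continue  # skip chunks that would exceed the budget
        selected.append(ch)
        used += cost
    return selected, used


# Hypothetical stored "memory" for illustration only.
chunks = [
    "The renewal date for the Acme contract is March 1",
    "Office plants are watered every Tuesday",
    "Acme contract pricing is fixed for two years",
    "Cafeteria menu changes weekly",
]
query = "When is the Acme contract renewal"

full_tokens = sum(approx_tokens(c) for c in chunks)
selected, used = select_context(query, chunks, token_budget=20)
print(f"full context: {full_tokens} tokens, selected context: {used} tokens")
```

A production system would likely replace the word-overlap score with embeddings or a learned index and use the model provider's own tokenizer, but the budgeted-selection step would look broadly similar.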
Industry context
For practitioners: newer, more sophisticated models have arrived with larger context windows and higher per-token pricing, a trend CNBC highlights when describing the rising cost environment. Startups marketing memory or retrieval systems aim to capture enterprise demand for cost-efficient deployments as teams limit developer access or throttle usage to control bills, CNBC reports.
What to watch
For practitioners: monitor measurable cost and latency outcomes from vendors claiming large token reductions, along with customer case studies that report end-to-end savings. Also watch whether memory-layer solutions integrate with major model providers or require custom model retraining, and how vendors price memory versus model usage.
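One way to evaluate such claims is to compare end-to-end cost per request rather than token counts alone. The sketch below is illustrative arithmetic only: the per-token price, the memory-layer fee, the latencies, and all names are hypothetical assumptions, not figures from CNBC or any vendor.

```python
from dataclasses import dataclass


@dataclass
class RequestProfile:
    tokens: int                   # tokens billed per request
    latency_s: float              # end-to-end latency in seconds
    extra_cost_usd: float = 0.0   # hypothetical per-request memory-layer fee


def cost_per_request(p: RequestProfile, usd_per_million_tokens: float) -> float:
    # Token charges plus any separate fee for the memory layer.
    return p.tokens / 1_000_000 * usd_per_million_tokens + p.extra_cost_usd


def compare(baseline: RequestProfile, candidate: RequestProfile,
            usd_per_million_tokens: float) -> dict:
    b = cost_per_request(baseline, usd_per_million_tokens)
    c = cost_per_request(candidate, usd_per_million_tokens)
    return {
        "token_reduction_x": baseline.tokens / candidate.tokens,
        "cost_savings_pct": (b - c) / b * 100,
        "latency_delta_s": candidate.latency_s - baseline.latency_s,
    }


# All numbers below are made up for illustration.
baseline = RequestProfile(tokens=100_000, latency_s=4.0)
candidate = RequestProfile(tokens=1_000, latency_s=2.5, extra_cost_usd=0.01)

result = compare(baseline, candidate, usd_per_million_tokens=10.0)
print(result)
```

With these made-up inputs, a 100-fold token reduction produces a smaller cost reduction once the memory-layer fee is included, which is why the pricing of memory versus model usage matters when reading vendor claims.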
Scoring Rationale
A $98 million funding round from General Catalyst, Kleiner Perkins and Sequoia for an AI memory startup with about 13 employees is a notable investment given rising inference-cost concerns in the industry. Because this is a single-source CNBC report published the same day, independent corroboration is not yet available. The story is nonetheless significant for practitioners tracking cost-reduction approaches in the inference stack.

