Taalas, a 2.5-year-old chip startup, converts AI models into custom ASICs and has showcased HC1, a TSMC 6nm chip that embeds Meta's Llama 3.1 8B. The company states that its "Hardcore Models" deliver approximately 10× the tokens-per-second (TPS) throughput at 20× lower production cost than high-end software infrastructure, and it demonstrated a 30-chip cluster reaching 12,000 TPS per user.

Key Points

1. Maps LLMs into custom ASICs, producing the HC1 chip, which embeds Meta's Llama 3.1 8B model
2. Claims 10× TPS and 20× lower production costs, achieved by merging storage and computation on silicon
3. Requires per-model hardware, which limits weight updates; scales via clusters (30-chip demo reached 12,000 TPS/user)
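
To put the per-user throughput figure in perspective, the short sketch below converts a tokens-per-second rate into per-token latency and total generation time. The 12,000 TPS value is the company's claim as reported above; the 1,000-token response length and the function names are assumptions chosen purely for illustration, not figures from Taalas.

```python
def seconds_per_token(tokens_per_second: float) -> float:
    """Average time to emit one token at a given per-user throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return 1.0 / tokens_per_second


def generation_time(num_tokens: int, tokens_per_second: float) -> float:
    """Time in seconds to generate num_tokens at a steady throughput."""
    return num_tokens * seconds_per_token(tokens_per_second)


# Claimed per-user rate for the 30-chip demo (company figure, not independently verified).
CLAIMED_TPS_PER_USER = 12_000
# Hypothetical response length, assumed here only for illustration.
RESPONSE_TOKENS = 1_000

if __name__ == "__main__":
    per_token_us = seconds_per_token(CLAIMED_TPS_PER_USER) * 1e6
    total_ms = generation_time(RESPONSE_TOKENS, CLAIMED_TPS_PER_USER) * 1e3
    print(f"Per-token latency: {per_token_us:.1f} microseconds")
    print(f"Time for {RESPONSE_TOKENS} tokens: {total_ms:.1f} ms")
```

Under these assumptions, a 1,000-token response would take roughly 83 ms. This ignores prompt processing and any network overhead, so it is a lower bound on what a user would actually experience.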

Scoring Rationale

Strong performance claims and demonstrable cluster scaling, limited by single-source company claims and narrow model scale.

Sources

Public references used for this report.

2 sources

01 wccftech.com: This New AI Chipmaker, Taalas, Hard-Wires AI Models Into Silicon to Make Them Faster and Cheaper; Early Results Crush Modern Solutions

02 cnx-software.com: Taalas HC1 hardwired Llama-3.1 8B AI accelerator delivers up to 17,000 tokens/s
