Research | LLM | GH200 Superchip | FP8 Sparse | Open Source
ETH Zurich Trains Open LLMs on Alps
Relevance Score: 8.2
Researchers from ETH Zürich and EPFL announced this week at the International Open-Source LLM Builders Summit in Geneva that they have trained two open large language models on Switzerland's Alps supercomputer. The models, roughly 8 billion and 70 billion parameters, were trained on about 15 trillion tokens spanning more than 1,000 languages (around 40% non-English), leveraging Alps' NVIDIA GH200 Superchip architecture and its FP8 sparse compute performance. The team plans a fully open release this summer under the Apache 2.0 license, including the training code and transparent training data.



