FACTS Benchmark Suite Establishes Factuality Standard

The FACTS Benchmark Suite, developed by the FACTS team with Kaggle, has been released to systematically evaluate LLM factual accuracy across four dimensions. The suite comprises 3,513 curated examples across public and private splits, with managed leaderboards. It adds Parametric, Search, and Multimodal benchmarks alongside Grounding v2 and reports a FACTS Score. Gemini 3 Pro leads at 68.8%, and no model exceeds 70% overall. The project aims to support ongoing research.
Scoring Rationale
Authoritative release and broad applicability raise the score; limited novelty and sub‑70% model accuracy curb transformational impact.
