Audit Finds Fabricated Citations Across Biomedical Literature

An audit led by Maxim Topaz and colleagues, published as a correspondence in The Lancet, found thousands of fabricated references in the biomedical literature. The authors report verifying 97.1 million references across roughly 2.5 million papers and identifying 4,046 fabricated references appearing in 2,810 papers. They describe a roughly 12-fold rise in fabrication rates from 2023 through 2025, reaching about one in 277 papers in early 2026, and note that nearly all flagged papers had seen "no publisher action" at the time of the audit, according to Retraction Watch. The audit screened the PubMed Central Open Access subset rather than the whole PubMed database, a distinction Nature noted in a correction to its initial coverage. Maxim Topaz told CBS News, "Your doctor could be making decisions around treatment based on studies that never existed."
What happened
The correspondence by Maxim Topaz and colleagues, published in The Lancet, reports an AI-assisted audit of approximately 2.5 million biomedical papers drawn from the PubMed Central Open Access subset. The authors say they verified 97.1 million structured references and identified 4,046 references the team characterized as fabricated, appearing across 2,810 papers. The correspondence and reporting in outlets including Nature, CIDRAP, CBS News, and Retraction Watch describe a roughly 12-fold increase in fabrication rates from 2023 to 2025, with an early-2026 rate of about one in 277 papers.
Technical details
The audit team used an automated reference-verification pipeline that compared citing metadata with bibliographic records retrieved from PubMed, CrossRef, OpenAlex, and Google Scholar, and applied filters to reduce false positives. CIDRAP reports the system achieved about 91% precision after filtering. The team labeled a reference as fabricated when no corresponding record was found in those reference databases. Nature corrected initial coverage to note the study screened the PubMed Central Open Access subset rather than the entire PubMed index.
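The correspondence does not publish the pipeline's code, so the sketch below is an illustrative approximation rather than the authors' method: it checks whether a cited DOI resolves to any record in the public CrossRef and OpenAlex REST APIs and treats an unresolved DOI as a candidate for review. The example DOI and the two-source rule are assumptions for illustration only.

```python
import requests

# Public bibliographic endpoints; usage here is a minimal sketch, not the audit pipeline.
CROSSREF = "https://api.crossref.org/works/{doi}"
OPENALEX = "https://api.openalex.org/works/doi:{doi}"


def record_exists(url: str, timeout: float = 10.0) -> bool:
    """Return True if the API returns a record (HTTP 200) for this identifier."""
    try:
        return requests.get(url, timeout=timeout).status_code == 200
    except requests.RequestException:
        return False  # treat network failures as "not confirmed", to be retried separately


def check_reference(doi: str) -> str:
    """Label a cited DOI 'found' if either public index has a record, else 'unresolved'."""
    doi = doi.strip().lower()
    if record_exists(CROSSREF.format(doi=doi)) or record_exists(OPENALEX.format(doi=doi)):
        return "found"
    return "unresolved"  # a review candidate, not proof of fabrication


if __name__ == "__main__":
    # Hypothetical DOI from the DOI Handbook examples; expected to come back "unresolved".
    print(check_reference("10.1000/xyz123"))
```

A production screen would also need the false-positive filters the authors describe, since indexing gaps (old items, non-DOI references, regional journals) can make a real citation look unresolved.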
Editorial analysis - methodological caveats
Industry observers: Automated verification across tens of millions of references can scale detection, but it depends on the scope and coverage of the underlying databases. The audit screened the PMC Open Access corpus, which omits paywalled content, so the measured rates describe that subset and may not generalize directly to the full published literature. The roughly 91% precision reported after filtering implies some residual false positives; filtering and human checks mitigate, but do not eliminate, classification uncertainty.
What the sources report about causes and impact
Reporting across Retraction Watch, CIDRAP, and Nature cites the correspondence warning that fabricated references can originate from multiple sources, including paper mills, intentional misconduct, and uncritical use of large language models. Retraction Watch and CIDRAP note the sharpest increase in fabricated references beginning in mid-2024, a period when public use of LLMs and AI writing tools grew. CBS News quotes Maxim Topaz saying, "Your doctor could be making decisions around treatment based on studies that never existed," highlighting concerns about downstream impacts on clinical guidelines and patient care.
Industry context
Industry observers: The finding joins a wider conversation about metadata integrity and provenance in scholarly publishing that predates generative-AI adoption but now intersects with LLM failure modes such as hallucinated citations. Publishers and journals quoted by Retraction Watch and other outlets report investing in integrity-screening tools and publication-ethics workflows; PLOS told Retraction Watch it is "exploring options for system-wide reference integrity screening."
What to watch
- Whether journals and publishers expand automated reference checking to cover submission pipelines and post-publication audits.
- Development and adoption of standardized APIs or services that can verify DOIs and metadata at scale across paywalled and open content.
- Follow-up studies that replicate the audit across broader corpora or apply different verification heuristics to quantify false-positive and false-negative rates.
For practitioners
Industry observers: Data scientists, journal editors, and research-software engineers should treat reference verification as a measurable metadata quality problem amenable to tooling. Integrations that cross-check claimed identifiers against multiple bibliographic sources and flag mismatches can reduce propagation of fictitious citations, but they require dataset coverage and calibration to minimize nuisance alerts.
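As a concrete illustration of that kind of cross-check, the hedged sketch below compares a citation's claimed title against the title CrossRef has registered for the same DOI and flags large divergences. The 0.85 similarity threshold and the CrossRef-only lookup are illustrative assumptions, not details from the audit.

```python
from difflib import SequenceMatcher

import requests


def crossref_title(doi: str) -> str | None:
    """Fetch the registered title for a DOI from CrossRef, or None if no record exists."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    if resp.status_code != 200:
        return None
    titles = resp.json().get("message", {}).get("title", [])
    return titles[0] if titles else None


def flag_title_mismatch(claimed_title: str, doi: str, threshold: float = 0.85) -> bool:
    """Flag a reference whose cited title diverges from the record CrossRef holds.

    The 0.85 similarity threshold is an illustrative assumption; in practice it
    would need calibration against labeled examples to keep nuisance alerts low.
    """
    registered = crossref_title(doi)
    if registered is None:
        return True  # no record at all: escalate for human review
    similarity = SequenceMatcher(None, claimed_title.lower(), registered.lower()).ratio()
    return similarity < threshold
```

Cross-checking against several sources (OpenAlex, PubMed) before raising an alert, as the practitioners' note above suggests, reduces spurious flags caused by any single index's coverage gaps.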
Closing factual note
The Lancet correspondence and subsequent reporting indicate that, at the time of the audit, most papers with flagged references had not been corrected or retracted, per Retraction Watch and the audit authors' summary in The Lancet.
Scoring Rationale
The audit documents a measurable integrity failure that intersects directly with clinical research and guideline formation, making it important for ML practitioners building authoring tools, publishers, and research-data engineers. The finding is notable but not a frontier model release, so it rates as a significant, actionable risk.