Researchers Develop LLM System Detecting Health Misinformation

Penn State researchers (2026) present an automated pipeline that uses LLMs and machine learning to detect health misinformation, analyze its themes, and generate refutation arguments. Their BERT classifier reached 98% accuracy and reduced AI-generated false positives by 44%, BERTopic produced the best topic-modeling metrics, and prompt-derived sentence representations received 99.6% rater approval. The system was validated offline on an English COVID-19 dataset and has not yet been tested on live social media.
Key Points
- Achieved 98% accuracy classifying misinformation with BERT, reducing AI-generated false positives by 44%.
- Used BERTopic for topic modeling, with a coherence score of 0.41 and high topic-assignment coverage.
- Provides practitioners with a pipeline for detection, topic extraction, and prompt-based refutations that can be used offline.
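To make the reported metrics concrete, here is a minimal sketch of how accuracy, a false-positive reduction, and topic-assignment coverage could be computed. It is not the researchers' code. The function names and the toy labels are hypothetical, the 44% figure is assumed to be a relative reduction in false-positive count (the summary does not say), and coverage is assumed to mean the share of documents assigned to a non-outlier topic, following BERTopic's convention of labeling outliers as -1.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def false_positives(y_true: Sequence[int], y_pred: Sequence[int]) -> int:
    """Count items flagged as misinformation (1) that are actually legitimate (0)."""
    return sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))


def relative_reduction(before: int, after: int) -> float:
    """Relative drop from `before` to `after`, e.g. 0.44 for a 44% reduction."""
    if before <= 0:
        raise ValueError("`before` must be positive")
    return (before - after) / before


def topic_coverage(topic_ids: Sequence[int], outlier_id: int = -1) -> float:
    """Share of documents assigned to a real topic rather than the outlier bucket."""
    if len(topic_ids) == 0:
        raise ValueError("topic_ids must be non-empty")
    return sum(t != outlier_id for t in topic_ids) / len(topic_ids)


# Toy data, invented for illustration only (1 = misinformation, 0 = legitimate).
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
baseline_pred = [1, 1, 1, 1, 1, 1, 1, 0]   # hypothetical weaker classifier
improved_pred = [1, 1, 0, 1, 0, 1, 1, 0]   # hypothetical stronger classifier
topic_ids = [0, 1, -1, 0, 2, 1, 0, -1]     # -1 marks outlier documents

fp_before = false_positives(y_true, baseline_pred)
fp_after = false_positives(y_true, improved_pred)

print("accuracy:", accuracy(y_true, improved_pred))
print("false-positive reduction:", relative_reduction(fp_before, fp_after))
print("topic coverage:", topic_coverage(topic_ids))
```

Reporting the false-positive change relative to a baseline, rather than as an absolute count, is what allows a figure like 44% to be compared across datasets of different sizes.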
Scoring Rationale
Solid empirical results and a usable pipeline, but the system was tested only offline on a COVID-19 dataset, which limits real-world generalization.