LLM Framework Reduces Hallucination in Medical Feature Extraction

Manal Abumelha et al. (JMIR Med Inform 2025) develop a two-phase LLM framework that combines instruction fine-tuning with confidence regularization to extract medical features from clinical notes. The framework was trained on 700 samples in the full setting and 100 in the few-shot setting, then evaluated on USMLE Step-2 Clinical Skills splits (200 public, 1,839 private). It achieved F1 scores of 0.968–0.983 (full) and 0.960–0.973 (few-shot), while reducing hallucinations by 89.9% and missing features by 88.9%.
Key Points
- Achieved F1 of 0.968–0.983 (full) and 0.960–0.973 (few-shot) on the USMLE Step-2 clinical-note dataset
- Reduced hallucinations by 89.9% and missing features by 88.9% versus a few-shot LLM baseline
- Enables reliable automated clinical-note assessment with minimal training data for resource-constrained settings
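
To make the reported metrics concrete, here is a minimal Python sketch of how F1, hallucinations, and missing features could be counted for one note. It rests on assumptions that this summary does not confirm: features are compared as exact-match strings, a hallucination is an extracted feature absent from the reference annotation (a false positive), and a missing feature is a reference feature the model did not extract (a false negative). The names `ExtractionCounts`, `score_note`, `f1`, and `relative_reduction`, and the example features, are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ExtractionCounts:
    true_positives: int  # extracted features that appear in the reference
    hallucinated: int    # extracted features absent from the reference (assumed definition)
    missing: int         # reference features the model did not extract (assumed definition)


def score_note(predicted: set[str], reference: set[str]) -> ExtractionCounts:
    """Compare extracted features with reference features by exact string match."""
    return ExtractionCounts(
        true_positives=len(predicted & reference),
        hallucinated=len(predicted - reference),
        missing=len(reference - predicted),
    )


def f1(counts: ExtractionCounts) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when there is nothing to score."""
    denom = 2 * counts.true_positives + counts.hallucinated + counts.missing
    return 2 * counts.true_positives / denom if denom else 0.0


def relative_reduction(baseline: int, improved: int) -> float:
    """Fraction by which an error count drops relative to a baseline count."""
    if baseline <= 0:
        raise ValueError("baseline count must be positive")
    return (baseline - improved) / baseline


# Illustrative data only, not drawn from the paper or its dataset.
reference = {"chest pain", "shortness of breath", "female", "age 45"}
predicted = {"chest pain", "female", "fever"}

counts = score_note(predicted, reference)
print(counts)                     # 2 matches, 1 hallucinated, 2 missing
print(round(f1(counts), 3))       # 0.571
print(relative_reduction(10, 1))  # 0.9, i.e. a 90% reduction
```

Under these assumptions, a figure such as the 89.9% hallucination reduction would correspond to `relative_reduction` applied to hallucination counts from the baseline and from the framework, summed over the evaluation notes.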
Scoring Rationale
Strong empirical gains and a large reduction in hallucinations, but the scope remains focused on medical note assessment, which limits generalizability.
