Textual Errors Reduce Clinical Model Performance

A 2025 study by Sarwar et al. evaluates how textual data quality affects clinical machine learning. The authors analyze 35,774 MIMIC‑III notes and 6,336 Australian aged‑care patient notes, using the Mixtral LLM to inject errors into the MIMIC‑III notes and to correct errors in the aged‑care notes. They report that Mixtral detected errors in 63% of notes, that model performance holds at error rates below 10% but declines at 10% or above, and that TF‑IDF features outperform embedding features.
Key Points
- Quantifies model tolerance: performance holds for textual error rates below about 10%.
- Shows TF‑IDF features outperform embedding representations across tasks, affecting feature selection decisions.
- Recommends dataset-quality evaluation and LLM-based correction for datasets with ≥10% textual error rates.
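
To make the 10% figure concrete, here is a minimal sketch of injecting errors at a controlled rate and checking the result against that threshold. It is not the study's method: the study used Mixtral to inject errors, whereas this sketch swaps adjacent characters at random as a simple stand-in. The function names, the ERROR_THRESHOLD constant, and the sample note are hypothetical, and treating "error rate" as the fraction of corrupted words is an assumption.

```python
import random

# Assumption: "error rate" means the fraction of words containing an error.
ERROR_THRESHOLD = 0.10  # the approximate tolerance level reported in the summary


def swappable_positions(word: str) -> list[int]:
    """Indices i where swapping word[i] and word[i + 1] changes the word."""
    return [i for i in range(len(word) - 1) if word[i] != word[i + 1]]


def inject_errors(text: str, error_rate: float, seed: int = 0) -> str:
    """Corrupt about `error_rate` of the whitespace-separated words in `text`.

    Each chosen word gets one adjacent-character swap (a crude typo model).
    """
    rng = random.Random(seed)
    words = text.split()
    eligible = [i for i, w in enumerate(words) if swappable_positions(w)]
    n_errors = min(round(error_rate * len(words)), len(eligible))
    for i in rng.sample(eligible, n_errors):
        w = words[i]
        p = rng.choice(swappable_positions(w))
        words[i] = w[:p] + w[p + 1] + w[p] + w[p + 2:]
    return " ".join(words)


def word_error_rate(original: str, corrupted: str) -> float:
    """Fraction of word positions that differ between two aligned texts."""
    a, b = original.split(), corrupted.split()
    if len(a) != len(b):
        raise ValueError("texts must have the same number of words")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Invented sample text, not taken from MIMIC-III or the aged-care dataset.
sample_note = (
    "patient admitted with chest pain and shortness of breath, started on "
    "aspirin and monitored overnight without further cardiac events noted"
)

for rate in (0.05, 0.10, 0.25):
    noisy = inject_errors(sample_note, rate, seed=1)
    measured = word_error_rate(sample_note, noisy)
    flagged = measured >= ERROR_THRESHOLD
    print(f"target={rate:.2f} measured={measured:.2f} needs_review={flagged}")
```

A seeded generator keeps the corruption reproducible, so the same noisy copy can be fed to different feature pipelines (for example TF‑IDF versus embeddings) for a like-for-like comparison.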
Scoring Rationale
Empirical quantification of error tolerance in clinical text models; limited by the use of only two datasets and a single LLM correction approach.
