Open Datasets Amplify Flawed Medical Research

An open-access dataset uploaded to Kaggle contained unvalidated images that were claimed to be usable for detecting autism. By December 2025, more than 90 published papers had incorporated the flawed data, prompting investigations and retractions. The incident exposes weaknesses in data governance across platforms, institutions, and journals. To prevent similar harms, experts recommend provenance systems such as the Five Safes framework, third-party validation, and accredited registries.
Scoring Rationale
Highlights systemic governance risks and actionable frameworks, but lacks new empirical evidence and formal institutional mandates.

