What happened
A study by Novelene Goklish et al., published in J Med Internet Res (2026) and indexed on PubMed, performed a retrospective secondary analysis of electronic health record (EHR) data from adult Emergency Department visits at an Indian Health Service (IHS) facility between October 1, 2019 and October 2, 2021. The authors compared strategies that combined an EHR-based machine learning (ML) risk model with the Ask Suicide-Screening Questions (ASQ) against use of the ASQ alone. Performance metrics reported in the paper, summarized on JoVE, include sensitivity, specificity, and predictive values evaluated across 10 cross-validated folds.
Technical details
Per the methods reported in the paper, the study implemented a logistic regression ML model and evaluated two operational thresholds: the 95th percentile (high risk) and the 75th percentile (medium risk). The authors compared three operational strategies: ML alone at each threshold, serial testing (ML high-risk followed by ASQ), and parallel testing (flag if either ML medium-risk or ASQ positive). Results reported in the article and the JoVE summary show that the ML medium-risk threshold alone achieved sensitivity 0.782, that parallel testing reached sensitivity 0.795 without missing ASQ-identified cases, and that serial testing gave the highest positive predictive value (PPV 0.050) at the cost of lower sensitivity.
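To make the three strategies concrete, the following is a minimal Python sketch on invented toy data. It is not the authors' code. The function names, the nearest-rank percentile rule, and the reading of serial testing as "flag only when both ML high-risk and ASQ are positive" are assumptions made for illustration.

```python
import math


def percentile_threshold(scores, pct):
    """Nearest-rank percentile of the scores (an assumed convention)."""
    ordered = sorted(scores)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def flag_cases(scores, asq_positive, strategy, high_pct=95, medium_pct=75):
    """Return one boolean flag per visit under the chosen strategy.

    Strategy names are hypothetical labels for the approaches described above:
      "ml_high"   : ML score above the high-risk percentile
      "ml_medium" : ML score above the medium-risk percentile
      "serial"    : ML high-risk AND ASQ positive (assumed interpretation)
      "parallel"  : ML medium-risk OR ASQ positive
    """
    high = percentile_threshold(scores, high_pct)
    medium = percentile_threshold(scores, medium_pct)
    flags = []
    for score, asq in zip(scores, asq_positive):
        if strategy == "ml_high":
            flags.append(score > high)
        elif strategy == "ml_medium":
            flags.append(score > medium)
        elif strategy == "serial":
            flags.append(score > high and bool(asq))
        elif strategy == "parallel":
            flags.append(score > medium or bool(asq))
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return flags


def sensitivity_ppv(flags, outcomes):
    """Sensitivity and PPV of the flags against observed outcomes."""
    tp = sum(1 for f, o in zip(flags, outcomes) if f and o)
    fn = sum(1 for f, o in zip(flags, outcomes) if not f and o)
    fp = sum(1 for f, o in zip(flags, outcomes) if f and not o)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sens, ppv


# Toy data: 20 visits with made-up risk scores, ASQ results, and outcomes.
toy_scores = list(range(1, 21))
toy_asq = [i in (2, 19) for i in range(20)]
toy_outcomes = [i in (2, 17, 19) for i in range(20)]

for name in ("ml_medium", "serial", "parallel"):
    print(name, sensitivity_ppv(flag_cases(toy_scores, toy_asq, name), toy_outcomes))
```

In this sketch the parallel rule can never flag fewer visits than the ASQ alone, which mirrors the reported property that parallel testing missed no ASQ-identified cases, while the serial rule flags a subset of the ML high-risk group.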
Industry context
Editorial analysis: Models for low-prevalence clinical outcomes, such as suicide attempts, commonly trade higher sensitivity against low positive predictive value, producing many false positives in routine care. Combining a broadly sensitive ML signal with a brief, validated instrument like the ASQ follows an established pattern in diagnostic workflows where one tool screens widely and another refines risk. Evaluation in a defined clinical population, here American Indian and Alaska Native patients at an IHS site, is essential because baseline prevalence and EHR data completeness affect absolute performance and equity of predictions.
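The link between low prevalence and low PPV follows from Bayes' rule, and a short Python sketch can illustrate it. The function name `ppv_from_rates` and the input values below are hypothetical and are not taken from the study.

```python
def ppv_from_rates(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' rule.

    PPV = (sens * prev) / (sens * prev + (1 - spec) * (1 - prev))
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0:
        raise ValueError("no positive results are possible with these inputs")
    return true_pos / (true_pos + false_pos)


# Hypothetical inputs: the same test applied at two different prevalences.
for prev in (0.01, 0.20):
    print(prev, round(ppv_from_rates(0.80, 0.75, prev), 3))
```

Holding sensitivity and specificity fixed, the PPV falls sharply as prevalence drops, which is why even a well-performing screen produces mostly false positives for rare outcomes.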
What to watch
For practitioners: look for external validation and prospective pilot studies that measure workflow impact, clinician burden from false positives, acceptability in tribal health settings, and whether combined strategies materially change follow-up actions or outcomes. Also monitor reporting on calibration, subgroup performance across demographic strata, and any assessment of unintended harms or community-engaged governance described in follow-up work.
Key Points
- A retrospective J Med Internet Res study shows combining an EHR ML model with ASQ raises sensitivity versus ASQ alone, improving case detection.
- Industry context: Low-prevalence outcomes produce low PPV; combining a sensitive ML screen with a brief instrument follows common diagnostic tradeoffs.
- For practitioners: External validation, prospective pilots, and subgroup calibration are critical before operational deployment in IHS and tribal settings.
Scoring Rationale
This is a notable applied ML study with direct implications for clinical screening workflows and equity in a high-risk population. It is primarily relevant to practitioners integrating EHR-based models with screening tools; broader impact depends on external validation and prospective implementation.


