
ML and Screening Improve Suicide Risk Detection for American Indian Patients

By LDS Team
Relevance Score: 6.3

A retrospective study by Goklish et al., published in J Med Internet Res (2026) and indexed on PubMed, evaluated combining an electronic health record (EHR)-based machine learning model with the Ask Suicide-Screening Questions (ASQ) at an Indian Health Service (IHS) Emergency Department. The cohort covered adult ED visits from October 1, 2019 to October 2, 2021, and performance was assessed across 10 cross-validated folds, per the paper. The authors operationalized a logistic regression model at two thresholds: the 95th percentile (high risk) and the 75th percentile (medium risk). According to the J Med Internet Res article and a JoVE summary, the ML medium-risk threshold alone showed sensitivity 0.782; parallel testing (ML medium-risk alongside ASQ) achieved sensitivity 0.795 while capturing cases missed by ASQ; and a serial strategy (ML high-risk, then ASQ) produced the highest positive predictive value (PPV 0.050) at the cost of reduced sensitivity.

What happened

A study by Novelene Goklish et al., published in J Med Internet Res (2026) and indexed on PubMed, performed a retrospective secondary analysis of electronic health record data from adult Emergency Department visits at an Indian Health Service facility between October 1, 2019 and October 2, 2021. The authors compared strategies that combined an EHR-based machine learning risk model with the Ask Suicide-Screening Questions (ASQ) to using the ASQ alone. Performance metrics reported in the paper, summarized on JoVE, include sensitivity, specificity, and predictive values evaluated across 10 cross-validated folds.

Technical details

Per the methods reported in the paper, the study implemented a logistic regression ML model and evaluated two operational thresholds: the 95th percentile (high risk) and the 75th percentile (medium risk). The authors compared three operational strategies: the ML model alone at each threshold, serial testing (ML high-risk followed by ASQ), and parallel testing (flag if either ML medium-risk or ASQ is positive). Results reported in the article and the JoVE summary show that the ML medium-risk threshold alone achieved sensitivity 0.782, parallel testing reached sensitivity 0.795 without missing ASQ-identified cases, and serial testing yielded the highest positive predictive value (PPV 0.050) at the cost of lower sensitivity.
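The three strategies described above can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the nearest-rank percentile rule, the choice to flag scores strictly above the cutoff, and the reading of serial testing as "ML high-risk AND ASQ positive" are all assumptions made for illustration, and the 20-visit dataset is entirely made up.

```python
import math

def percentile_cutoff(scores, pct):
    """Nearest-rank percentile: smallest score with at least pct% of scores at or below it."""
    ordered = sorted(scores)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def ml_flags(scores, pct):
    """Flag visits whose risk score is strictly above the pct-th percentile (assumed rule)."""
    cutoff = percentile_cutoff(scores, pct)
    return [s > cutoff for s in scores]

def serial_flags(ml_high, asq_positive):
    """Serial testing (assumed reading): positive only if ML high-risk AND ASQ positive."""
    return [m and a for m, a in zip(ml_high, asq_positive)]

def parallel_flags(ml_medium, asq_positive):
    """Parallel testing: positive if ML medium-risk OR ASQ positive."""
    return [m or a for m, a in zip(ml_medium, asq_positive)]

def screening_metrics(flags, outcomes):
    """Sensitivity and PPV of a list of boolean flags against boolean outcomes."""
    tp = sum(f and o for f, o in zip(flags, outcomes))
    fn = sum((not f) and o for f, o in zip(flags, outcomes))
    fp = sum(f and (not o) for f, o in zip(flags, outcomes))
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
    }

# Synthetic toy data (hypothetical, NOT from the study): 20 visits.
scores = [i / 100 for i in range(1, 21)]               # made-up model risk scores
outcomes = [i in {5, 17, 19} for i in range(20)]       # made-up true events
asq_positive = [i in {2, 5, 10, 19} for i in range(20)]  # made-up ASQ results

ml_high = ml_flags(scores, 95)    # top 5% of scores
ml_medium = ml_flags(scores, 75)  # top 25% of scores

strategies = {
    "ASQ alone": asq_positive,
    "ML medium alone": ml_medium,
    "Serial (ML high, then ASQ)": serial_flags(ml_high, asq_positive),
    "Parallel (ML medium or ASQ)": parallel_flags(ml_medium, asq_positive),
}
for name, flags in strategies.items():
    print(name, screening_metrics(flags, outcomes))
```

By construction, a parallel rule flags every visit the ASQ flags, so it cannot miss an ASQ-identified case. A serial rule flags a subset of ASQ positives, which tends to raise PPV while lowering sensitivity. The toy numbers only demonstrate that mechanism and say nothing about the magnitudes reported in the paper.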

Industry context

Editorial analysis: Models for low-prevalence clinical outcomes, such as suicide attempts, commonly trade higher sensitivity against low positive predictive value, producing many false positives in routine care. Combining a broadly sensitive ML signal with a brief, validated instrument like the ASQ follows an established pattern in diagnostic workflows where one tool screens widely and another refines risk. Evaluation in a defined clinical population, here American Indian and Alaska Native patients at an IHS site, is essential because baseline prevalence and EHR data completeness affect absolute performance and equity of predictions.
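The link between low prevalence and low PPV follows from Bayes' rule, and a short Python sketch makes the arithmetic concrete. The sensitivity of 0.782 is the figure reported above. The specificity of 0.75 and the prevalence values are hypothetical placeholders chosen for illustration, not results from the study.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule.

    PPV = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    if true_pos + false_pos == 0:
        return float("nan")  # nobody is flagged, so PPV is undefined
    return true_pos / (true_pos + false_pos)

# Sensitivity taken from the article; specificity and prevalences are ASSUMED.
for prevalence in (0.005, 0.01, 0.05):
    value = ppv(sensitivity=0.782, specificity=0.75, prevalence=prevalence)
    print(f"prevalence={prevalence:.3f}  PPV={value:.3f}")
```

With sensitivity and specificity held fixed, PPV falls as the outcome becomes rarer. This is why a screen that looks strong on sensitivity can still generate mostly false positives in routine care.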

What to watch

For practitioners: look for external validation and prospective pilot studies that measure workflow impact, clinician burden from false positives, acceptability in tribal health settings, and whether combined strategies materially change follow-up actions or outcomes. Also monitor reporting on calibration, subgroup performance across demographic strata, and any assessment of unintended harms or community-engaged governance described in follow-up work.

Key Points

  • A retrospective J Med Internet Res study shows that combining an EHR ML model with ASQ raises sensitivity versus ASQ alone, improving case detection.
  • Industry context: Low-prevalence outcomes produce low PPV; pairing a sensitive ML screen with a brief instrument reflects a common diagnostic tradeoff.
  • For practitioners: External validation, prospective pilots, and subgroup calibration are critical before operational deployment in IHS and tribal settings.

Scoring Rationale

This is a notable applied ML study with direct implications for clinical screening workflows and equity in a high-risk population. It is primarily relevant to practitioners integrating EHR-based models with screening tools; broader impact depends on external validation and prospective implementation.
