Systematic Review Evaluates ML Models Predicting Future Falls

According to the JMIR preprint by Gao et al. (2026), the authors screened 6,865 records and included 27 longitudinal studies that developed or validated machine learning (ML) or deep learning (DL) models to predict future falls in community-dwelling adults aged 60 or older. Per the paper, 22 studies used ML and 5 used DL; prediction horizons ranged from 3 months to 4 years, and reported fall incidence varied from 1.6% to 46.6%. Gao et al. searched major databases through August 31, 2025; excluded real-time detection, simulated/no-fall, and inpatient studies; assessed risk of bias with PROBAST; and meta-analyzed logit-transformed AUCs using the Hartung-Knapp-Sidik-Jonkman (HKSJ) random-effects method, with 95% prediction intervals estimated by four approaches. The authors also ran leave-one-out robustness checks and examined small-study effects with funnel plots and Egger-type tests.
What happened
According to the JMIR preprint by Gao et al., the authors performed a systematic review and meta-analysis of ML and DL models predicting future falls in community-dwelling older adults. The review screened 6,865 records and included 27 longitudinal studies, of which 18 focused on general community samples. Of the included studies, 22 used ML and 5 used DL; prediction horizons ranged from 3 months to 4 years, and observed fall incidence spanned 1.6% to 46.6%.
Technical details
Gao et al. searched PubMed, Embase, Web of Science, CINAHL, Cochrane Library, and IEEE Xplore through August 31, 2025, and limited inclusion to longitudinal studies using baseline predictors to forecast future falls in adults aged 60 or older, excluding real-time detection, simulated/no-fall, and inpatient studies. Risk of bias was assessed using PROBAST. For quantitative synthesis, the authors meta-analyzed logit-transformed AUCs using the Hartung-Knapp-Sidik-Jonkman (HKSJ) random-effects model, reported 95% confidence intervals, and characterized expected out-of-sample performance with 95% prediction intervals computed by four estimators. Robustness was examined with leave-one-out analyses, and small-study effects were probed with funnel plots and Egger-type tests.
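To make the pooling step concrete, the sketch below shows how logit-transformed AUCs can be combined under a random-effects model with an HKSJ-style variance adjustment. This is a minimal illustration, not the authors' code: it uses a DerSimonian-Laird estimate of between-study variance, invented example AUCs and variances (not the paper's data), and a hardcoded t quantile because the standard library has no t-distribution.

```python
import math

def hksj_pool(logit_aucs, variances):
    """Pool logit-scale effects: DerSimonian-Laird tau^2 plus the
    Hartung-Knapp-Sidik-Jonkman (HKSJ) variance adjustment."""
    k = len(logit_aucs)
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    sw = sum(w)
    ybar = sum(wi * yi for wi, yi in zip(w, logit_aucs)) / sw
    q = sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, logit_aucs))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)               # DL between-study variance
    ws = [1.0 / (v + tau2) for v in variances]       # random-effects weights
    mu = sum(wi * yi for wi, yi in zip(ws, logit_aucs)) / sum(ws)
    # HKSJ: rescale the pooled variance by the weighted residual mean square
    q_star = sum(wi * (yi - mu) ** 2 for wi, yi in zip(ws, logit_aucs)) / (k - 1)
    se = math.sqrt(q_star / sum(ws))
    return mu, se, tau2

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

# Illustrative inputs only (NOT the paper's data): five study AUCs
# transformed to the logit scale, with assumed logit-scale variances.
aucs = [0.75, 0.80, 0.70, 0.78, 0.72]
y = [math.log(a / (1 - a)) for a in aucs]
v = [0.02, 0.03, 0.025, 0.015, 0.02]

mu, se, tau2 = hksj_pool(y, v)
t975 = 2.776  # t quantile for df = k - 1 = 4 (hardcoded assumption)
lo, hi = inv_logit(mu - t975 * se), inv_logit(mu + t975 * se)
print(f"pooled AUC {inv_logit(mu):.3f}, 95% CI ({lo:.3f}, {hi:.3f}), tau^2 {tau2:.4f}")
```

Back-transforming through the inverse logit keeps the pooled estimate and its interval inside (0, 1), which is one motivation for pooling on the logit scale rather than on raw AUCs.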
Editorial analysis - technical context
Prognostic model research for low-incidence events commonly shows wide heterogeneity in outcome definitions, follow-up horizons, and predictor sets, which complicates meta-analysis of discrimination metrics. In practice, pooling AUCs across studies with divergent event rates and follow-up lengths often yields broad prediction intervals, limiting how precisely expected out-of-sample performance can be stated.
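The link between heterogeneity and interval width can be shown directly. The sketch below computes a Higgins-style 95% prediction interval on the logit scale, mu ± t(k-2) · sqrt(tau² + SE(mu)²), for a low and a high between-study variance; all inputs are illustrative assumptions, and the t quantile is hardcoded since the standard library lacks a t-distribution.

```python
import math

def prediction_interval(mu, se, tau2, t_crit):
    """Higgins-style 95% prediction interval on the logit scale:
    mu +/- t_{k-2} * sqrt(tau^2 + SE(mu)^2)."""
    half = t_crit * math.sqrt(tau2 + se ** 2)
    return mu - half, mu + half

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

mu, se = 1.10, 0.10   # illustrative pooled logit(AUC) and its standard error
t_crit = 2.306        # t quantile for df = k - 2 = 8, i.e. k = 10 studies (assumed)
for tau2 in (0.01, 0.25):  # low vs high between-study variance
    lo, hi = prediction_interval(mu, se, tau2, t_crit)
    print(f"tau^2={tau2}: 95% PI on AUC scale ({inv_logit(lo):.3f}, {inv_logit(hi):.3f})")
```

Even with a precisely estimated pooled mean, a large tau² widens the prediction interval substantially, which is why heterogeneous evidence bases report broad expected ranges for a model's performance in a new setting.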
Context and significance
For practitioners building or evaluating fall-prediction models, systematic syntheses such as this one map the current evidence base and the methodological choices used across studies, including the still-limited number of DL studies to date. Industry context: infrequent external validation, sparse calibration reporting, and non-standardized outcome definitions are recurring gaps in the prognostic ML literature that materially affect deployability and risk communication in clinical settings.
What to watch
Indicators to follow include subsequent external-validation studies on large community cohorts, increased reporting of calibration and net benefit, consensus on outcome windows for fall prediction, and whether future work narrows heterogeneity by standardizing predictor sets and evaluation protocols. Observers will also watch for studies that report decision-analytic impacts or prospectively evaluate models in real-world workflows.
Scoring rationale
This systematic review compiles longitudinal evidence on ML/DL fall-prediction models and applies robust meta-analytic techniques, making it a notable resource for practitioners assessing prognostic model evidence. It is not a frontier-model release, so importance is notable rather than transformative.