by Steven Habbous, Peter C. Austin, Shabnam Balamchi, Davood Astaraky, Roozbeh Yousefi, Munaza Chaudhry, Erik Hellsten

Introduction
Risk adjustment is critical in observational epidemiology to control for confounding of the exposure-outcome relationship.
Accurate prediction of outcomes, such as mortality, can improve risk adjustment. In the present study, we compared logistic regression with a range of tree-based ensemble methods to predict 1-year mortality in the general population of Ontario, Canada.

Methods
Ontario adults (age 18 years and older) who were alive as of January 1, 2022 were included. Using a window of up to 3 years, various measures of health and healthcare utilization were captured from administrative databases.
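To illustrate what capturing a measure over a lookback window can look like, here is a minimal Python sketch. It is not the authors' code: the function name `lookback_counts`, the `(person_id, event_date)` record layout, and the half-open window (3 years before the index date, up to but not including January 1, 2022) are all assumptions made for illustration.

```python
from datetime import date
from typing import Dict, Iterable, Tuple

INDEX_DATE = date(2022, 1, 1)  # cohort index date stated in the text
LOOKBACK_YEARS = 3             # "up to 3 years" stated in the text


def lookback_counts(
    events: Iterable[Tuple[str, date]],
    index_date: date = INDEX_DATE,
    years: int = LOOKBACK_YEARS,
) -> Dict[str, int]:
    """Count events per person that fall in [index_date - years, index_date).

    Hypothetical helper: the record layout and window boundaries are
    assumptions, not taken from the study. Assumes the index date is
    not February 29, so shifting the year is always valid.
    """
    window_start = index_date.replace(year=index_date.year - years)
    counts: Dict[str, int] = {}
    for person_id, event_date in events:
        # Keep only events inside the lookback window.
        if window_start <= event_date < index_date:
            counts[person_id] = counts.get(person_id, 0) + 1
    return counts


# Made-up example records (e.g. hospital visits) for two people.
example_events = [
    ("A", date(2021, 6, 1)),    # inside the window
    ("A", date(2018, 12, 31)),  # one day too early
    ("A", date(2020, 3, 5)),    # inside the window
    ("B", date(2019, 1, 1)),    # first day of the window
    ("B", date(2022, 1, 1)),    # on the index date, excluded
]
example_counts = lookback_counts(example_events)
```

Each such count would become one predictor column. People with no events in the window are absent from the dictionary and would need to be filled with zero before modelling.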
To predict 1-year mortality, we applied logistic regression, random forests, extremely randomized trees, adaptive boosting, gradient boosting, extreme gradient boosting, Newton boosting, and CatBoost. All models also included age and sex.
Performance was evaluated on the held-out 30% test set using the area under the ROC curve (AUROC), the area under the precision-recall curve (PR-AUC), the Brier score, and a quantile-based version of the Integrated Calibration Index (ICI).
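The four metrics named above can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' implementation. In particular, the abstract does not define the quantile-based ICI, so `quantile_ici` assumes one plausible reading: sort predictions, split them into equal-sized quantile bins, and take the size-weighted mean absolute gap between mean predicted risk and observed event rate. PR-AUC is approximated here by average precision. All function names are hypothetical.

```python
from typing import Sequence


def brier_score(y: Sequence[int], p: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def auroc(y: Sequence[int], p: Sequence[float]) -> float:
    """Probability that a random event is scored above a random non-event.

    Ties count as one half. Quadratic in sample size, so for illustration only.
    """
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y: Sequence[int], p: Sequence[float]) -> float:
    """Step-wise approximation of PR-AUC; assumes no tied scores."""
    order = sorted(range(len(y)), key=lambda i: -p[i])
    true_pos = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y[i] == 1:
            true_pos += 1
            total += true_pos / rank  # precision at this recall step
    return total / true_pos


def quantile_ici(y: Sequence[int], p: Sequence[float], n_bins: int = 10) -> float:
    """Assumed quantile-binned calibration index (see the note above)."""
    order = sorted(range(len(y)), key=lambda i: p[i])
    n = len(order)
    total = 0.0
    for b in range(n_bins):
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        if not idx:
            continue  # more bins than observations
        mean_pred = sum(p[i] for i in idx) / len(idx)
        obs_rate = sum(y[i] for i in idx) / len(idx)
        total += len(idx) * abs(mean_pred - obs_rate)
    return total / n


# Made-up toy data: outcomes and predicted 1-year mortality risks.
y_toy = [0, 0, 1, 1]
p_toy = [0.1, 0.4, 0.35, 0.8]
```

Higher AUROC and PR-AUC indicate better discrimination, while lower Brier score and ICI indicate better overall accuracy and calibration. With a rare outcome such as 1-year mortality, PR-AUC is the more sensitive of the two discrimination measures to how well the few events are ranked.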
PLOS ONE (Medicine) published a clinical update in Research Highlights on 23 Apr 2026.
The item covers the article "Using tree-based ensemble methods to produce a population-based mortality risk score in Ontario, Canada."