TY - JOUR
T1 - Developing well-calibrated illness severity scores for decision support in the critically ill
AU - Cosgriff, Christopher V
AU - Celi, Leo Anthony
AU - Ko, Stephanie
AU - Sundaresan, Tejas
AU - Armengol de la Hoz, Miguel Ángel
AU - Kaufman, Aaron Russell
AU - Stone, David J
AU - Badawi, Omar
AU - Deliberato, Rodrigo Octavio
N1 - Publisher Copyright: © 2019, The Author(s).
PY - 2019/12/1
Y1 - 2019/12/1
AB - Illness severity scores are regularly employed for quality improvement and benchmarking in the intensive care unit, but poor generalization performance, particularly with respect to probability calibration, has limited their use for decision support. These models tend to perform worse in patients at a high risk for mortality. We hypothesized that a sequential modeling approach wherein an initial regression model assigns risk and all patients deemed high risk then have their risk quantified by a second, high-risk-specific, regression model would result in a model with superior calibration across the risk spectrum. We compared this approach to a logistic regression model and a sophisticated machine learning approach, the gradient boosting machine. The sequential approach did not have an effect on the receiver operating characteristic curve or the precision-recall curve but resulted in improved reliability curves. The gradient boosting machine achieved a small improvement in discrimination performance and was similarly calibrated to the sequential models.
UR - http://www.scopus.com/inward/record.url?scp=85135599397&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85135599397&partnerID=8YFLogxK
U2 - 10.1038/s41746-019-0153-6
DO - 10.1038/s41746-019-0153-6
M3 - Article
C2 - 31428687
SN - 2398-6352
VL - 2
SP - 76
JO - npj Digital Medicine
JF - npj Digital Medicine
IS - 1
M1 - 76
ER -