TY - JOUR
T1 - Mutual information: Measuring nonlinear dependence in longitudinal epidemiological data
AU - Young, Alexander L.
AU - van den Boom, Willem
AU - Schroeder, Rebecca A.
AU - Krishnamoorthy, Vijay
AU - Raghunathan, Karthik
AU - Wu, Hau-Tieng
AU - Dunson, David B.
N1 - Publisher Copyright: © 2023 Young et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2023/4
Y1 - 2023/4
N2 - Given a large clinical database of longitudinal patient information including many covariates, it is computationally prohibitive to consider all types of interdependence between patient variables of interest. This challenge motivates the use of mutual information (MI), a statistical summary of data interdependence with appealing properties that make it a suitable alternative or addition to correlation for identifying relationships in data. MI: (i) captures all types of dependence, both linear and nonlinear, (ii) is zero only when random variables are independent, (iii) serves as a measure of relationship strength (similar to but more general than R²), and (iv) is interpreted the same way for numerical and categorical data. Unfortunately, MI typically receives little to no attention in introductory statistics courses and is more difficult than correlation to estimate from data. In this article, we motivate the use of MI in the analyses of epidemiologic data, while providing a general introduction to estimation and interpretation. We illustrate its utility through a retrospective study relating intraoperative heart rate (HR) and mean arterial pressure (MAP). We: (i) show postoperative mortality is associated with decreased MI between HR and MAP and (ii) improve existing postoperative mortality risk assessment by including MI and additional hemodynamic statistics.
UR - http://www.scopus.com/inward/record.url?scp=85153990042&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85153990042&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0284904
DO - 10.1371/journal.pone.0284904
M3 - Article
C2 - 37099536
AN - SCOPUS:85153990042
SN - 1932-6203
VL - 18
JO - PLOS ONE
JF - PLOS ONE
IS - 4
M1 - e0284904
ER -