TY - JOUR
T1 - Inexpensive, non-invasive biomarkers predict Alzheimer transition using machine learning analysis of the Alzheimer’s Disease Neuroimaging (ADNI) database
AU - for the Alzheimer's Disease Neuroimaging Initiative
AU - Beltrán, Juan Felipe
AU - Wahba, Brandon Malik
AU - Hose, Nicole
AU - Shasha, Dennis
AU - Kline, Richard P.
N1 - Publisher Copyright:
© 2020 Beltrán et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2020/7
Y1 - 2020/7
N2 - The Alzheimer’s Disease Neuroimaging Initiative (ADNI) database is an expansive undertaking by government, academia, and industry to pool resources and data on subjects at various stages of symptomatic severity due to Alzheimer’s disease. As expected, magnetic resonance imaging is a major component of the project. Full brain images are obtained at every 6-month visit. A range of cognitive tests studying executive function and memory are employed less frequently. Two blood draws (baseline, 6 months) provide samples to measure concentrations of approximately 145 plasma biomarkers. In addition, other diagnostic measurements are performed, including PET imaging, cerebrospinal fluid measurements of amyloid-beta and tau peptides, as well as genetic tests, demographics, and vital signs. ADNI data is available upon review of an application. There have been numerous reports of how various processes evolve during AD progression, including alterations in metabolic and neuroendocrine activity, cell survival, and cognitive behavior. Lacking an analytic model at the outset, we leveraged recent advances in machine learning, which allow us to deal with large, non-linear systems with many variables. Of particular interest was how well binary predictions of future disease states could be learned from simple, non-invasive measurements like those dependent on blood samples. Such measurements make relatively few demands on the time and effort of medical staff or patient. We report recall, precision, and area under the receiver operating characteristic curve after application of CART, Random Forest, Gradient Boosting, and Support Vector Machines. Our results show that (i) Random Forests and Gradient Boosting work very well with such data, and (ii) prediction quality when applied to relatively easily obtained measurements (cognitive scores, genetic risk, and plasma biomarkers) is competitive with magnetic resonance techniques.
This is by no means an exhaustive study, but instead an exploration of the plausibility of defining a series of relatively inexpensive, broad, population-based tests.
KW - Alzheimer Disease/diagnosis
KW - Apolipoprotein A-V/blood
KW - Area Under Curve
KW - Biomarkers/blood
KW - Brain/diagnostic imaging
KW - Databases, Factual
KW - Disease Progression
KW - Humans
KW - Machine Learning
KW - Magnetic Resonance Imaging
KW - Neuroimaging/methods
KW - Principal Component Analysis
KW - ROC Curve
UR - http://www.scopus.com/inward/record.url?scp=85088811974&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85088811974&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0235663
DO - 10.1371/journal.pone.0235663
M3 - Article
C2 - 32716914
AN - SCOPUS:85088811974
SN - 1932-6203
VL - 15
JO - PloS one
JF - PloS one
IS - 7
M1 - e0235663
ER -