TY - JOUR
T1 - A concept-wide association study of clinical notes to discover new predictors of kidney failure
AU - Singh, Karandeep
AU - Betensky, Rebecca A.
AU - Wright, Adam
AU - Curhan, Gary C.
AU - Bates, David W.
AU - Waikar, Sushrut S.
N1 - Funding Information:
Because Dr. Curhan is the Editor-in-Chief of CJASN, he was not involved in the peer-review process for this manuscript. Another editor oversaw the peer-review and decision-making process for this manuscript. This research was supported, in part, by a National Institutes of Health T32 training grant awarded to the Division of Renal Medicine at Brigham and Women’s Hospital. The funding source had no role in the study design, conduct, analysis, or decision to submit the manuscript.
Publisher Copyright:
© 2016 by the American Society of Nephrology.
PY - 2016
Y1 - 2016
N2 - Background and objectives Identifying predictors of kidney disease progression is critical toward the development of strategies to prevent kidney failure. Clinical notes provide a unique opportunity for big data approaches to identify novel risk factors for disease. Design, setting, participants, & measurements We used natural language processing tools to extract concepts from the preceding year’s clinical notes among patients newly referred to a tertiary care center’s outpatient nephrology clinics and retrospectively evaluated these concepts as predictors for the subsequent development of ESRD using proportional subdistribution hazards (competing risk) regression. The primary outcome was time to ESRD, accounting for a competing risk of death. We identified predictors from univariate and multivariate (adjusting for Tangri linear predictor) models using a 5% threshold for false discovery rate (q value, 0.05). We included all patients seen by an adult outpatient nephrologist between January 1, 2004 and June 18, 2014 and excluded patients seen only by transplant nephrology, with preexisting ESRD, with fewer than five clinical notes, with no follow-up, or with no baseline creatinine values. Results Among the 4013 patients selected in the final study cohort, we identified 960 concepts in the unadjusted analysis and 885 concepts in the adjusted analysis. Novel predictors identified included high-dose ascorbic acid (adjusted hazard ratio, 5.48; 95% confidence interval, 2.80 to 10.70; q, 0.001) and fast food (adjusted hazard ratio, 4.34; 95% confidence interval, 2.55 to 7.40; q, 0.001). Conclusions Novel predictors of human disease may be identified using an unbiased approach to analyze text from the electronic health record.
AB - Background and objectives Identifying predictors of kidney disease progression is critical toward the development of strategies to prevent kidney failure. Clinical notes provide a unique opportunity for big data approaches to identify novel risk factors for disease. Design, setting, participants, & measurements We used natural language processing tools to extract concepts from the preceding year’s clinical notes among patients newly referred to a tertiary care center’s outpatient nephrology clinics and retrospectively evaluated these concepts as predictors for the subsequent development of ESRD using proportional subdistribution hazards (competing risk) regression. The primary outcome was time to ESRD, accounting for a competing risk of death. We identified predictors from univariate and multivariate (adjusting for Tangri linear predictor) models using a 5% threshold for false discovery rate (q value, 0.05). We included all patients seen by an adult outpatient nephrologist between January 1, 2004 and June 18, 2014 and excluded patients seen only by transplant nephrology, with preexisting ESRD, with fewer than five clinical notes, with no follow-up, or with no baseline creatinine values. Results Among the 4013 patients selected in the final study cohort, we identified 960 concepts in the unadjusted analysis and 885 concepts in the adjusted analysis. Novel predictors identified included high-dose ascorbic acid (adjusted hazard ratio, 5.48; 95% confidence interval, 2.80 to 10.70; q, 0.001) and fast food (adjusted hazard ratio, 4.34; 95% confidence interval, 2.55 to 7.40; q, 0.001). Conclusions Novel predictors of human disease may be identified using an unbiased approach to analyze text from the electronic health record.
KW - Adult
KW - Ascorbic Acid
KW - Chronic kidney disease
KW - Cohort Studies
KW - Creatinine
KW - Disease Progression
KW - Electronic Health Records
KW - Electronic health record
KW - End stage kidney disease
KW - Fast foods
KW - Follow-Up Studies
KW - Humans
KW - Informatics
KW - Kidney
KW - Kidney Diseases
KW - Kidney Failure, Chronic
KW - Natural Language Processing
KW - Natural language processing
KW - Nephrology
KW - Outpatients
KW - Renal Insufficiency
KW - Retrospective Studies
KW - Risk factors
KW - Tertiary Care Centers
UR - http://www.scopus.com/inward/record.url?scp=85021740923&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85021740923&partnerID=8YFLogxK
U2 - 10.2215/CJN.02420316
DO - 10.2215/CJN.02420316
M3 - Article
C2 - 27927892
AN - SCOPUS:85021740923
SN - 1555-9041
VL - 11
SP - 2150
EP - 2158
JO - Clinical Journal of the American Society of Nephrology
JF - Clinical Journal of the American Society of Nephrology
IS - 12
ER -