Analyzing historical diagnosis code data from NIH N3C and RECOVER Programs using deep learning to determine risk factors for Long Covid

Saurav Sengupta, Johanna Loomba, Suchetha Sharma, Donald E. Brown, Lorna Thorpe, Melissa A. Haendel, Christopher G. Chute, Stephanie Hong

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Post-acute sequelae of SARS-CoV-2 infection (PASC), or Long COVID, is an emerging medical condition that has been observed in several patients with a positive diagnosis for COVID-19. Historical Electronic Health Record (EHR) data such as diagnosis codes, lab results, and clinical notes have been analyzed using deep learning to predict future clinical events. In this paper, we propose an interpretable deep learning approach to analyze historical diagnosis code data from the National COVID Cohort Collective (N3C) to find the risk factors that contribute to developing Long COVID. Using our deep learning approach, we are able to predict whether a patient is suffering from Long COVID from a temporally ordered list of diagnosis codes up to 45 days after the patient's first positive COVID test or diagnosis, with an accuracy of 70.48%. We then examine the trained model using Gradient-weighted Class Activation Mapping (GradCAM) to assign each input diagnosis a score. The highest-scoring diagnoses were deemed to be the most important for making the correct prediction for a patient. We also propose a way to summarize these top diagnoses for each patient in our cohort and look at their temporal trends to determine which codes contribute towards a positive Long COVID diagnosis.
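As a rough illustration of the scoring step described in the abstract, the sketch below applies Grad-CAM to a toy 1D convolutional model over an ordered list of diagnosis codes. It is not the authors' model: the architecture, the random weights, the placeholder code names (`DX_A` and so on), and the helper names `conv1d_same`, `grad_cam_scores`, and `top_codes` are all assumptions made for the example. Because the toy head is a global average pool followed by a linear layer, the gradient is written analytically rather than obtained from an autograd library.

```python
import numpy as np

def conv1d_same(x, kernels):
    """x: (T, D) embedded code sequence; kernels: (K, W, D) with odd W.
    Returns ReLU feature maps of shape (K, T), one column per code position."""
    K, W, D = kernels.shape
    pad = W // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))      # zero-pad along the time axis
    T = x.shape[0]
    out = np.empty((K, T))
    for t in range(T):
        window = xp[t:t + W]                  # (W, D) slice centred on position t
        out[:, t] = np.tensordot(kernels, window, axes=([1, 2], [0, 1]))
    return np.maximum(out, 0.0)

def grad_cam_scores(feature_maps, head_weights):
    """Grad-CAM for a head of the form logit = head_weights . mean_t(feature_maps).
    For this head, d logit / d A[k, t] = head_weights[k] / T."""
    K, T = feature_maps.shape
    grads = np.repeat(head_weights[:, None] / T, T, axis=1)  # (K, T)
    alpha = grads.mean(axis=1)                # channel weights: time-averaged gradients
    return np.maximum(alpha @ feature_maps, 0.0)  # (T,) non-negative score per position

def top_codes(codes, scores, k=2):
    """Return the k codes whose positions received the highest scores."""
    order = np.argsort(scores)[::-1][:k]
    return [codes[i] for i in order]

# Hypothetical vocabulary and randomly initialised (untrained) parameters.
rng = np.random.default_rng(0)
vocab = {"DX_A": 0, "DX_B": 1, "DX_C": 2, "DX_D": 3}
embed = rng.normal(size=(len(vocab), 4))      # (V, D) embedding table
kernels = rng.normal(size=(3, 3, 4))          # (K, W, D) convolution filters
head = rng.normal(size=3)                     # (K,) linear head weights

# One patient's temporally ordered diagnosis codes (placeholders).
patient = ["DX_C", "DX_A", "DX_D", "DX_A", "DX_B"]
x = embed[[vocab[c] for c in patient]]
feature_maps = conv1d_same(x, kernels)
scores = grad_cam_scores(feature_maps, head)
top = top_codes(patient, scores, k=2)
print(list(zip(patient, scores.round(3))), top)
```

With a trained model in place of the random weights, the per-position scores would be the quantity ranked to pick each patient's top diagnoses.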

Original language: English (US)
Title of host publication: Proceedings - 2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022
Editors: Donald Adjeroh, Qi Long, Xinghua Shi, Fei Guo, Xiaohua Hu, Srinivas Aluru, Giri Narasimhan, Jianxin Wang, Mingon Kang, Ananda M. Mondal, Jin Liu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2797-2802
Number of pages: 6
ISBN (Electronic): 9781665468190
State: Published - 2022
Event: 2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022 - Las Vegas, United States
Duration: Dec 6 2022 to Dec 8 2022

Publication series

Name: Proceedings - 2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022

Conference

Conference: 2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022
Country/Territory: United States
City: Las Vegas
Period: 12/6/22 to 12/8/22

Keywords

  • COVID-19
  • EHR
  • GradCAM
  • deep learning

ASJC Scopus subject areas

  • Psychiatry and Mental health
  • Information Systems and Management
  • Biomedical Engineering
  • Medicine (miscellaneous)
  • Cardiology and Cardiovascular Medicine
  • Health Informatics
