TY - JOUR
T1 - Enhanced family history-based algorithms increase the identification of individuals meeting criteria for genetic testing of hereditary cancer syndromes but would not reduce disparities on their own
AU - Bradshaw, Richard L.
AU - Kawamoto, Kensaku
AU - Bather, Jemar R.
AU - Goodman, Melody S.
AU - Kohlmann, Wendy K.
AU - Chavez-Yenter, Daniel
AU - Volkmar, Molly
AU - Monahan, Rachel
AU - Kaphingst, Kimberly A.
AU - Del Fiol, Guilherme
N1 - Publisher Copyright: © 2023 The Authors
PY - 2024/1
Y1 - 2024/1
AB - Objective: This study aimed to 1) investigate algorithm enhancements for identifying patients eligible for genetic testing of hereditary cancer syndromes using family history data from electronic health records (EHRs); and 2) assess their impact on relative differences across sex, race, ethnicity, and language preference. Materials and Methods: The study used EHR data from a tertiary academic medical center. A baseline rule-based algorithm, relying on structured family history data (structured data; SD), was enhanced using a natural language processing (NLP) component and a relaxed criteria algorithm (partial match [PM]). The identification rates and differences were analyzed considering sex, race, ethnicity, and language preference. Results: Among 120,007 patients aged 25–60, detection rate differences were found across all groups using the SD (all P < 0.001). Both enhancements increased identification rates; NLP led to a 1.9% increase and the relaxed criteria algorithm (PM) led to an 18.5% increase (both P < 0.001). Combining SD with NLP and PM yielded a 20.4% increase (P < 0.001). Similar increases were observed within subgroups. Relative differences persisted across most categories for the enhanced algorithms, with disproportionately higher identification of patients who are White, Female, non-Hispanic, and whose preferred language is English. Conclusion: Algorithm enhancements increased identification rates for patients eligible for genetic testing of hereditary cancer syndromes, regardless of sex, race, ethnicity, and language preference. However, differences in identification rates persisted, emphasizing the need for additional strategies to reduce disparities, such as addressing underlying biases in EHR family health information and selectively applying algorithm enhancements for disadvantaged populations. Systematic assessment of differences in algorithm performance across population subgroups should be incorporated into algorithm development processes.
KW - Algorithm development
KW - Electronic health records
KW - Genetic testing
KW - Healthcare disparities
KW - Hereditary cancer syndromes
UR - http://www.scopus.com/inward/record.url?scp=85179852630&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85179852630&partnerID=8YFLogxK
U2 - 10.1016/j.jbi.2023.104568
DO - 10.1016/j.jbi.2023.104568
M3 - Article
C2 - 38081564
AN - SCOPUS:85179852630
SN - 1532-0464
VL - 149
JO - Journal of Biomedical Informatics
JF - Journal of Biomedical Informatics
M1 - 104568
ER -