TY - JOUR
T1 - Health system-scale language models are all-purpose prediction engines
AU - Jiang, Lavender Yao
AU - Liu, Xujin Chris
AU - Nejatian, Nima Pour
AU - Nasir-Moin, Mustafa
AU - Wang, Duo
AU - Abidin, Anas
AU - Eaton, Kevin
AU - Riina, Howard Antony
AU - Laufer, Ilya
AU - Punjabi, Paawan
AU - Miceli, Madeline
AU - Kim, Nora C.
AU - Orillac, Cordelia
AU - Schnurman, Zane
AU - Livia, Christopher
AU - Weiss, Hannah
AU - Kurland, David
AU - Neifert, Sean
AU - Dastagirzada, Yosef
AU - Kondziolka, Douglas
AU - Cheung, Alexander T.M.
AU - Yang, Grace
AU - Cao, Ming
AU - Flores, Mona
AU - Costa, Anthony B.
AU - Aphinyanaphongs, Yindalon
AU - Cho, Kyunghyun
AU - Oermann, Eric Karl
N1 - Publisher Copyright:
© 2023, The Author(s).
PY - 2023/7/13
Y1 - 2023/7/13
N2 - Physicians make critical time-constrained decisions every day. Clinical predictive models can help physicians and administrators make decisions by forecasting clinical and operational events. Existing structured data-based clinical predictive models have limited use in everyday practice owing to complexity in data processing, as well as model development and deployment [1–3]. Here we show that unstructured clinical notes from the electronic health record can enable the training of clinical language models, which can be used as all-purpose clinical predictive engines with low-resistance development and deployment. Our approach leverages recent advances in natural language processing [4,5] to train a large language model for medical language (NYUTron) and subsequently fine-tune it across a wide range of clinical and operational predictive tasks. We evaluated our approach within our health system for five such tasks: 30-day all-cause readmission prediction, in-hospital mortality prediction, comorbidity index prediction, length of stay prediction, and insurance denial prediction. We show that NYUTron has an area under the curve (AUC) of 78.7–94.9%, with an improvement of 5.36–14.7% in the AUC compared with traditional models. We additionally demonstrate the benefits of pretraining with clinical text, the potential for increasing generalizability to different sites through fine-tuning and the full deployment of our system in a prospective, single-arm trial. These results show the potential for using clinical language models in medicine to read alongside physicians and provide guidance at the point of care.
UR - http://www.scopus.com/inward/record.url?scp=85161393530&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85161393530&partnerID=8YFLogxK
U2 - 10.1038/s41586-023-06160-y
DO - 10.1038/s41586-023-06160-y
M3 - Article
C2 - 37286606
AN - SCOPUS:85161393530
SN - 0028-0836
VL - 619
SP - 357
EP - 362
JO - Nature
JF - Nature
IS - 7969
ER -