Making the Most Out of the Limited Context Length: Predictive Power Varies with Clinical Note Type and Note Section

Hongyi Zheng, Yixin Tracy Zhu, Lavender Yao Jiang, Kyunghyun Cho, Eric Karl Oermann

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Recent advances in large language models have led to renewed interest in natural language processing in healthcare using the free text of clinical notes. One distinguishing characteristic of clinical notes is their long time span over multiple long documents. The unique structure of clinical notes creates a new design choice: when the context length for a language model predictor is limited, which part of clinical notes should we choose as the input? Existing studies either choose the inputs with domain knowledge or simply truncate them. We propose a framework to analyze the sections with high predictive power. Using MIMIC-III, we show that: 1) predictive power distribution is different between nursing notes and discharge notes and 2) combining different types of notes could improve performance when the context length is large. Our findings suggest that a carefully selected sampling function could enable more efficient information extraction from clinical notes.

Original language: English (US)
Title of host publication: Student Research Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 104-108
Number of pages: 5
ISBN (Electronic): 9781959429692
State: Published - 2023
Event: 61st Annual Meeting of the Association for Computational Linguistics, ACL-SRW 2023 - Toronto, Canada
Duration: Jul 10 2023 to Jul 12 2023

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
Volume: 4
ISSN (Print): 0736-587X

Conference

Conference: 61st Annual Meeting of the Association for Computational Linguistics, ACL-SRW 2023
Country/Territory: Canada
City: Toronto
Period: 7/10/23 to 7/12/23

ASJC Scopus subject areas

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics
