TY - JOUR
T1 - MedFuse: Multi-modal fusion with clinical time-series data and chest X-ray images
T2 - 7th Machine Learning for Healthcare Conference, MLHC 2022
AU - Hayat, Nasir
AU - Geras, Krzysztof J.
AU - Shamout, Farah E.
N1 - Funding Information:
This work is supported in part by the NYUAD Center for Artificial Intelligence and Robotics, funded by Tamkeen under the NYUAD Research Institute Award CG010. We would also like to thank the High Performance Computing (HPC) team at NYUAD for their support.
Publisher Copyright:
© 2022 N. Hayat, K.J. Geras & F.E. Shamout.
PY - 2022
Y1 - 2022
N2 - Multi-modal fusion approaches aim to integrate information from different data sources. Unlike natural datasets, such as in audio-visual applications, where samples consist of “paired” modalities, data in healthcare is often collected asynchronously. Hence, requiring the presence of all modalities for a given sample is not realistic for clinical tasks and significantly limits the size of the dataset during training. In this paper, we propose MedFuse, a conceptually simple yet promising LSTM-based fusion module that can accommodate uni-modal as well as multi-modal input. We evaluate the fusion method and introduce new benchmark results for in-hospital mortality prediction and phenotype classification, using clinical time-series data in the MIMIC-IV dataset and corresponding chest X-ray images in MIMIC-CXR. Compared to more complex multi-modal fusion strategies, MedFuse provides a performance improvement by a large margin on the fully paired test set. It also remains robust across the partially paired test set containing samples with missing chest X-ray images. We release our code for reproducibility and to enable the evaluation of competing models in the future.
AB - Multi-modal fusion approaches aim to integrate information from different data sources. Unlike natural datasets, such as in audio-visual applications, where samples consist of “paired” modalities, data in healthcare is often collected asynchronously. Hence, requiring the presence of all modalities for a given sample is not realistic for clinical tasks and significantly limits the size of the dataset during training. In this paper, we propose MedFuse, a conceptually simple yet promising LSTM-based fusion module that can accommodate uni-modal as well as multi-modal input. We evaluate the fusion method and introduce new benchmark results for in-hospital mortality prediction and phenotype classification, using clinical time-series data in the MIMIC-IV dataset and corresponding chest X-ray images in MIMIC-CXR. Compared to more complex multi-modal fusion strategies, MedFuse provides a performance improvement by a large margin on the fully paired test set. It also remains robust across the partially paired test set containing samples with missing chest X-ray images. We release our code for reproducibility and to enable the evaluation of competing models in the future.
UR - http://www.scopus.com/inward/record.url?scp=85164539200&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85164539200&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85164539200
SN - 2640-3498
VL - 182
SP - 479
EP - 503
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
Y2 - 5 August 2022 through 6 August 2022
ER -