TY - GEN
T1 - What Is the Best Way to Fine-Tune Self-supervised Medical Imaging Models?
AU - Khan, Muhammad Osama
AU - Fang, Yi
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024
Y1 - 2024
AB - In recent years, self-supervised learning (SSL) has enabled significant breakthroughs via training large foundation models. These self-supervised pre-trained models are typically utilized for downstream tasks via end-to-end fine-tuning. However, it remains unclear whether end-to-end fine-tuning is truly optimal for effectively leveraging the pre-trained knowledge, especially since the diverse categories of SSL capture distinct features and may therefore require different fine-tuning approaches. To bridge this research gap, we present the first comprehensive study of optimal fine-tuning strategies for self-supervised learning in medical imaging. First, we develop strong contrastive and restorative SSL baselines that outperform state-of-the-art (SOTA) methods on four diverse downstream tasks. Next, we conduct an extensive fine-tuning analysis across multiple pre-training and fine-tuning datasets, as well as various fine-tuning dataset sizes. Contrary to the conventional wisdom of fine-tuning only the last few layers of a pre-trained network, we show that fine-tuning intermediate layers is much more effective. Specifically, fine-tuning the second quarter (25–50%) of the network is optimal for contrastive SSL, whereas fine-tuning the third quarter (50–75%) is optimal for restorative SSL. Moreover, compared to the de facto standard of end-to-end fine-tuning, our best fine-tuning strategy, which fine-tunes a shallower network consisting of the first three quarters (0–75%) of the pre-trained network, yields improvements of as much as 5.48%. Additionally, using these insights, we propose a simple yet effective method that leverages the complementary strengths of multiple SSL models, resulting in enhancements of up to 3.57%. Given the rapid progress in SSL, we hope these fine-tuning techniques will significantly improve the utility of self-supervised medical imaging models.
UR - http://www.scopus.com/inward/record.url?scp=85200692758&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85200692758&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-66955-2_19
DO - 10.1007/978-3-031-66955-2_19
M3 - Conference contribution
AN - SCOPUS:85200692758
SN - 9783031669545
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 267
EP - 281
BT - Medical Image Understanding and Analysis - 28th Annual Conference, MIUA 2024, Proceedings
A2 - Yap, Moi Hoon
A2 - Kendrick, Connah
A2 - Behera, Ardhendu
A2 - Cootes, Timothy
A2 - Zwiggelaar, Reyer
PB - Springer Science and Business Media Deutschland GmbH
T2 - 28th Annual Conference on Medical Image Understanding and Analysis, MIUA 2024
Y2 - 24 July 2024 through 26 July 2024
ER -