How Fair are Medical Imaging Foundation Models?

Muhammad Osama Khan, Muhammad Muneeb Afzal, Shujaat Mirza, Yi Fang

Research output: Contribution to journal › Conference article › peer-review


While medical imaging foundation models have led to significant improvements across various tasks, the pivotal issue of subgroup fairness in these foundation models has remained largely unexplored. Our work bridges this research gap by presenting the first comprehensive study analyzing the subgroup fairness of six diverse foundation models, encompassing various pre-training methods, sources of pre-training data, and model architectures. In doing so, we discover a concerning trade-off: foundation models pre-trained on medical images achieve better overall performance but are consistently less fair than those pre-trained on natural images, and are sometimes even less fair than baseline models trained from scratch. To mitigate these fairness disparities, we show that increasing both the volume of pre-training data and the number of pre-training epochs enhances the subgroup fairness of models pre-trained on medical images. Furthermore, to disentangle the bias introduced during pre-training from that introduced during fine-tuning, we employ balanced datasets for fine-tuning. While fine-tuning on balanced datasets partially mitigates fairness issues, it is insufficient to completely eliminate the biases from the pre-training stage, prompting the need for careful design and evaluation of medical imaging foundation models. Our granular analysis reveals that models pre-trained on medical images tend to favor majority racial subgroups (White, Asian), whereas models pre-trained on natural images tend to favor minority racial subgroups (Black). Additionally, across all foundation models, we observe consistent underperformance on the female patient cohort. As the community moves towards designing specialized foundation models for medical imaging, we hope our timely research provides crucial insights to help inform more equitable model development.

Original language: English (US)
Pages (from-to): 217-231
Number of pages: 15
Journal: Proceedings of Machine Learning Research
State: Published - 2023
Event: 3rd Machine Learning for Health Symposium, ML4H 2023 - New Orleans, United States
Duration: Dec 10 2023 → …


  • fairness
  • foundation models
  • self-supervised learning

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

