- Introduction: The advent of foundation models in healthcare offers unprecedented opportunities to enhance medical diagnostics through automated classification and segmentation tasks. However, these models also raise significant fairness concerns when applied to diverse and underrepresented populations in healthcare applications. Currently, there is a lack of comprehensive benchmarks, standardized pipelines, and easily adaptable libraries for evaluating and understanding the fairness performance of foundation models in medical imaging, which poses considerable challenges in formulating and implementing solutions that ensure equitable outcomes across patient populations. To fill this gap, we introduce FairMedFM, a fairness benchmark for foundation-model research in medical imaging. FairMedFM integrates 17 popular medical imaging datasets spanning different modalities, dimensionalities, and sensitive attributes. It explores 20 widely used foundation models under various usage strategies, including zero-shot learning, linear probing, parameter-efficient fine-tuning, and prompting, across different downstream tasks: classification and segmentation. Our exhaustive analysis evaluates fairness performance over different evaluation metrics from multiple perspectives, revealing the existence of bias, varied utility-fairness trade-offs across foundation models, consistent disparities on the same datasets regardless of the model used, and the limited effectiveness of existing unfairness-mitigation methods. Please see FairMedFM's project page and open-source codebase, which support extensible functionalities and applications as well as long-term, inclusive research on foundation models in medical imaging.
- Problem addressed: FairMedFM: A Fairness Benchmark for Medical Imaging with 20 Foundation Models
- Key idea: The paper introduces FairMedFM, a comprehensive benchmark for evaluating the fairness performance of foundation models (FMs) in medical imaging. It integrates 17 popular medical imaging datasets and explores 20 widely used FMs, evaluating their fairness performance over different evaluation metrics from multiple perspectives.
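To illustrate the kind of group-fairness metric such a benchmark evaluates (the names below are illustrative; FairMedFM's actual metric implementations may differ), a minimal sketch of a demographic parity gap — the difference in positive-prediction rates between subgroups defined by a sensitive attribute — could look like:

```python
def demographic_parity_gap(preds, groups):
    """Largest difference in positive-prediction rate between any two
    subgroups (e.g., split by a sensitive attribute such as sex)."""
    rates = {}
    for g in set(groups):
        # Positive-prediction rate within subgroup g
        members = [p for p, gr in zip(preds, groups) if gr == g]
        rates[g] = sum(members) / len(members)
    return max(rates.values()) - min(rates.values())

# Toy example: binary predictions for two subgroups
preds  = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["F", "F", "F", "F", "M", "M", "M", "M"]
print(demographic_parity_gap(preds, groups))  # 0.5 (0.75 vs 0.25)
```

A gap of 0 would indicate equal positive-prediction rates across subgroups; benchmarks like this one typically report several such metrics alongside utility measures to expose utility-fairness trade-offs.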
- Other highlights: The paper reveals the existence of bias, varied utility-fairness trade-offs across different FMs, consistent disparities on the same datasets regardless of the FM used, and the limited effectiveness of existing unfairness-mitigation methods. FairMedFM's project page and open-sourced codebase support extensible functionalities and applications, enabling inclusive, long-term studies of FMs in medical imaging.
- Related work includes fairness in machine learning, benchmarking in medical imaging, and foundation models in healthcare. Some relevant papers include 'On the Measurement of Social Biases in Speech Recognition', 'Benchmarking Frameworks for Deep Learning in Medical Image Analysis', and 'Foundation Models in Healthcare: Opportunities and Challenges'.