EyeFound: A Multimodal Generalist Foundation Model for Ophthalmic Imaging

Abstract

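Artificial intelligence (AI) plays a vital role in ophthalmology, handling tasks such as diagnosis, classification, and visual question answering (VQA). However, existing AI models in this field usually require extensive annotation and are task-specific, which limits their clinical utility. Recent work has produced foundation models for ophthalmology, but these need separate weights trained for each imaging modality and therefore cannot represent multimodal features comprehensively. This highlights the need for generalist foundation models in ophthalmology that can handle a variety of tasks and modalities. To fill this gap, we present EyeFound, a multimodal foundation model for ophthalmic images. Unlike existing models, EyeFound learns generalizable representations from unlabeled multimodal retinal images, enabling efficient model adaptation across multiple applications. Trained on 2.78 million images from 227 hospitals across 11 ophthalmic modalities, EyeFound supports generalist representations and diverse multimodal downstream tasks, including the detection of challenging rare diseases. It outperforms previous work RETFound in diagnosing eye diseases, predicting systemic disease incidents, and zero-shot multimodal VQA. EyeFound offers a generalizable solution that improves model performance and reduces the annotation burden on experts, facilitating broad clinical AI applications for retinal imaging.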
Problem addressed

EyeFound: A Multimodal Foundation Model for Ophthalmic Images
Key idea

EyeFound learns generalizable representations from unlabeled multimodal retinal images, enabling efficient model adaptation across multiple applications.
Other highlights

EyeFound is trained on 2.78 million images from 227 hospitals across 11 ophthalmic modalities, facilitating generalist representations and diverse multimodal downstream tasks. It outperforms previous work RETFound in diagnosing eye diseases, predicting systemic disease incidents, and zero-shot multimodal VQA.
Related work

Related work includes RETFound, which is limited by the need to train separate weights for each imaging modality, and other task-specific AI models in ophthalmology.
