Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models
