LocCa: Visual Pretraining with Location-aware Captioners
