
Multi-Crit: Benchmarking Multimodal Judges on Pluralistic Criteria-Following
