[CV] Beyond SOT: It's Time to Track Multiple Generic Objects at Once
C Mayer, M Danelljan, M Yang, V Ferrari, L V Gool, A Kuznetsova
[Google Research & ETH Zurich]
Beyond single object tracking (SOT): tracking multiple generic objects at once
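- Introduces LaGOT, a new large-scale GOT benchmark in which each sequence contains multiple annotated target objects.
- Proposes TaMOs, a Transformer-based GOT tracker that tracks multiple objects jointly through shared computation.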
Generic Object Tracking (GOT) is the problem of tracking target objects specified by bounding boxes in the first frame of a video. While the task has received much attention in the last decades, researchers have focused almost exclusively on the single-object setting. Multi-object GOT has wider applicability, which makes it more attractive for real-world applications. We attribute the lack of research interest in this problem to the absence of suitable benchmarks. In this work, we introduce a new large-scale GOT benchmark, LaGOT, containing multiple annotated target objects per sequence. Our benchmark allows researchers to tackle key remaining challenges in GOT, aiming to increase robustness and reduce computation by tracking multiple objects jointly. Furthermore, we propose a Transformer-based GOT tracker, TaMOs, capable of processing multiple objects jointly through shared computation. With 10 concurrent objects, TaMOs runs 4× faster than tracking each object independently, and it outperforms existing single-object trackers on our new benchmark. Finally, TaMOs achieves highly competitive results on single-object GOT datasets, setting a new state of the art on TrackingNet with a success rate AUC of 84.4%. Our benchmark, code, and trained models will be made publicly available.
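The abstract reports results as a success rate AUC. As a rough illustration of what that number measures, the sketch below computes an OTB-style success AUC for a single track: the fraction of frames whose predicted box overlaps the ground truth by more than a threshold, averaged over a range of thresholds. This is not the authors' evaluation code. The function names, the `(x, y, w, h)` box format, and the 21 evenly spaced thresholds are assumptions made for the example, and how a multi-object benchmark aggregates the score across objects is not specified here.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_auc(pred_boxes, gt_boxes, num_thresholds=21):
    """Mean success rate over evenly spaced IoU thresholds in [0, 1].

    Assumption: a frame counts as a success when its IoU is strictly
    greater than the threshold, so even a perfect track fails at 1.0.
    """
    if not pred_boxes or len(pred_boxes) != len(gt_boxes):
        raise ValueError("need two non-empty box lists of equal length")
    overlaps = [iou(p, g) for p, g in zip(pred_boxes, gt_boxes)]
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    # Success rate at each threshold: fraction of frames above it.
    success = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(success) / len(success)


if __name__ == "__main__":
    gt = [(0, 0, 10, 10), (5, 5, 10, 10)]
    pred = [(1, 0, 10, 10), (5, 6, 10, 10)]
    print(f"success AUC: {success_auc(pred, gt):.3f}")
```

Under the strict-inequality convention assumed here, a tracker that reproduces the ground truth exactly scores 20/21 rather than 1.0, because no overlap can exceed the final threshold of 1.0.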