YOWOv3: An Efficient and Generalized Framework for Human Action Detection and Recognition

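August 5, 2024
  • Abstract
    This paper proposes YOWOv3, a new framework for human action detection and recognition that improves on YOWOv2. The framework is designed to facilitate extensive experimentation with different configurations and to support easy customization of the model's individual components, reducing the effort needed to understand and modify the code. On UCF101-24 and AVAv2.2, two widely used datasets for human action detection and recognition, YOWOv3 outperforms YOWOv2. Specifically, the predecessor model YOWOv2 achieves an mAP of 85.2% and 20.3% on UCF101-24 and AVAv2.2, respectively, with 109.7M parameters and 53.6 GFLOPs. In contrast, YOWOv3, with only 59.8M parameters and 39.8 GFLOPs, achieves an mAP of 88.33% and 20.31% on UCF101-24 and AVAv2.2, respectively. The results show that YOWOv3 significantly reduces the number of parameters and GFLOPs while still achieving comparable performance.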
  • Problem addressed
    YOWOv3: An Improved Framework for Human Action Detection and Recognition
  • Key idea
    YOWOv3 is a framework for human action detection and recognition that matches or exceeds the mAP of its predecessor YOWOv2 while significantly reducing the number of parameters and GFLOPs.
  • Other highlights
    YOWOv3 is designed to facilitate extensive experimentation with different configurations and supports easy customization of the model's components. It achieves an mAP of 88.33% and 20.31% on UCF101-24 and AVAv2.2, respectively, with only 59.8M parameters and 39.8 GFLOPs. On these two widely used datasets it improves on YOWOv2 for UCF101-24 (85.2% mAP) and is on par with it for AVAv2.2 (20.3% mAP).
  • Related work
    Related work in this field includes 'Real-time Action Recognition with Enhanced Motion Vector CNNs' and 'Two-Stream Convolutional Networks for Action Recognition in Videos'.
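
The efficiency claim in the summary can be made concrete with a little arithmetic. The sketch below only restates the figures quoted above and computes the relative savings and mAP differences. The dictionary layout and the names `REPORTED`, `relative_reduction`, and `compare` are illustrative assumptions, not part of the YOWOv3 codebase.

```python
# Figures quoted in the summary: mAP in percent, parameters in millions, GFLOPs.
# The structure and key names are hypothetical, chosen only for this example.
REPORTED = {
    "YOWOv2": {"map_ucf": 85.2, "map_ava": 20.3, "params_m": 109.7, "gflops": 53.6},
    "YOWOv3": {"map_ucf": 88.33, "map_ava": 20.31, "params_m": 59.8, "gflops": 39.8},
}


def relative_reduction(old: float, new: float) -> float:
    """Return how much smaller `new` is than `old`, as a percentage of `old`."""
    return (old - new) / old * 100.0


def compare(baseline: dict, candidate: dict) -> dict:
    """Summarize cost savings (percent) and mAP changes (percentage points)."""
    return {
        "params_reduction_pct": relative_reduction(baseline["params_m"], candidate["params_m"]),
        "gflops_reduction_pct": relative_reduction(baseline["gflops"], candidate["gflops"]),
        "map_ucf_delta": candidate["map_ucf"] - baseline["map_ucf"],
        "map_ava_delta": candidate["map_ava"] - baseline["map_ava"],
    }


if __name__ == "__main__":
    summary = compare(REPORTED["YOWOv2"], REPORTED["YOWOv3"])
    for name, value in summary.items():
        print(f"{name}: {value:.2f}")
```

Under these figures, YOWOv3 uses roughly 45% fewer parameters and roughly 26% fewer GFLOPs than YOWOv2, gains about 3 mAP points on UCF101-24, and is essentially level on AVAv2.2.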