Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation

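Abstract

The growing demand for versatile robotic systems that operate in diverse and dynamic environments has emphasized the importance of a generalist policy, which leverages a large cross-embodiment data corpus to enable broad adaptability and high-level reasoning. However, a generalist policy suffers from inefficient inference and costly training. A specialist policy, by contrast, is curated for domain-specific data and excels at task-level precision with high efficiency, but it lacks the generalization capacity needed for a wide range of applications. Inspired by these observations, we introduce RoboDual, a synergistic dual-system that combines the merits of both generalist and specialist policies. A diffusion transformer-based specialist is designed for multi-step action rollouts, finely conditioned on the high-level task understanding and discretized action output of a vision-language-action (VLA) based generalist. Compared to OpenVLA, RoboDual achieves a 26.7% improvement in real-world settings and a 12% gain on CALVIN by introducing a specialist policy with only 20M trainable parameters. It maintains strong performance with only 5% of the demonstration data and enables a 3.8 times higher control frequency in real-world deployment. Code will be made publicly available. Our project page is hosted at: https://opendrivelab.com/RoboDual/.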
Problem Addressed

RoboDual: A Synergistic Dual-System for Generalist and Specialist Policies
Key Idea

RoboDual proposes a dual-system approach that combines the benefits of generalist and specialist policies in robotic systems, using a diffusion transformer-based specialist and a vision-language-action (VLA) based generalist.
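
The division of labor described above can be illustrated with a toy control loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `GeneralistOutput`, `generalist_step`, `specialist_step`, `run_dual_system`, and the `generalist_period` and `horizon` parameters are all hypothetical, the models are replaced by trivial arithmetic, and the idea that the generalist is queried less often than the specialist is an assumption this summary does not spell out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GeneralistOutput:
    """Hypothetical hand-off from the slow generalist to the fast specialist."""
    task_latent: List[float]    # stand-in for high-level task understanding
    coarse_action: List[float]  # stand-in for the discretized action output


def generalist_step(observation: List[float], instruction: str) -> GeneralistOutput:
    """Placeholder for an expensive VLA forward pass (toy arithmetic only)."""
    latent = [float(len(instruction))]
    coarse = [float(round(x)) for x in observation]  # crude discretization
    return GeneralistOutput(task_latent=latent, coarse_action=coarse)


def specialist_step(observation: List[float], cond: GeneralistOutput,
                    horizon: int) -> List[List[float]]:
    """Placeholder for a multi-step rollout conditioned on the generalist.

    A real specialist would run diffusion denoising; here each step simply
    moves the coarse action toward the current observation.
    """
    rollout = []
    for k in range(1, horizon + 1):
        blend = k / (horizon + 1)
        rollout.append([c + blend * (o - c)
                        for c, o in zip(cond.coarse_action, observation)])
    return rollout


def run_dual_system(observations: List[List[float]], instruction: str,
                    generalist_period: int = 4,
                    horizon: int = 2) -> Tuple[List[List[float]], int, int]:
    """Run the specialist every tick; refresh the generalist every N ticks."""
    cond: Optional[GeneralistOutput] = None
    executed: List[List[float]] = []
    generalist_calls = 0
    specialist_calls = 0
    for t, obs in enumerate(observations):
        if t % generalist_period == 0:
            # Assumed schedule: the slow model is only refreshed occasionally.
            cond = generalist_step(obs, instruction)
            generalist_calls += 1
        assert cond is not None  # t == 0 always triggers the generalist
        rollout = specialist_step(obs, cond, horizon)
        specialist_calls += 1
        executed.append(rollout[0])  # execute only the first predicted action
    return executed, generalist_calls, specialist_calls


if __name__ == "__main__":
    obs_stream = [[0.1 * t, 1.0 - 0.1 * t] for t in range(8)]
    actions, g_calls, s_calls = run_dual_system(obs_stream, "pick up the cup")
    print(g_calls, s_calls, len(actions))
```

In this toy schedule the lightweight specialist produces an action at every tick while the heavier generalist is consulted only once per `generalist_period` ticks, which is one plausible way a dual system could raise the effective control frequency.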
Other Highlights

Compared to OpenVLA, RoboDual achieves a 26.7% improvement in real-world settings and a 12% gain on CALVIN, while adding a specialist policy with only 20M trainable parameters. It maintains strong performance with only 5% of the demonstration data and enables a 3.8 times higher control frequency in real-world deployment. The authors state that the code will be made publicly available.
Related Work

Related work includes OpenVLA, which is compared to RoboDual in the experiments.
