Learning Humanoid Locomotion with Transformers

I Radosavovic, T Xiao, B Zhang, T Darrell, J Malik, K Sreenath
[UC Berkeley]

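Key points:

  1. A sim-to-real, learning-based approach that uses a Transformer model for real-world humanoid locomotion;
  2. The controller is a causal Transformer trained by autoregressive prediction of future actions from the history of observations and actions;
  3. The approach uses no state estimation, dynamics models, trajectory optimization, reference trajectories, or pre-computed gait libraries;
  4. The controller is trained with large-scale model-free reinforcement learning on an ensemble of randomized environments in simulation and deployed to the real world zero-shot.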
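
To make the "history of observations and actions" idea concrete, here is a minimal sketch of the data flow only, not the authors' implementation. The class `ObsActHistory`, the helpers `causal_mask` and `rollout`, the token layout (current observation concatenated with the previous action), the zero padding, and the toy policy and environment are all assumptions made for illustration; a real controller would replace `toy_policy` with a trained causal Transformer.

```python
from collections import deque


class ObsActHistory:
    """Sliding window over the most recent (observation, previous action) pairs."""

    def __init__(self, context_len, obs_dim, act_dim):
        # Assumption: the window is zero-padded before the episode starts.
        pad = ([0.0] * obs_dim, [0.0] * act_dim)
        self.pairs = deque([pad] * context_len, maxlen=context_len)

    def push(self, obs, prev_act):
        # Appending to a full deque drops the oldest pair automatically.
        self.pairs.append((list(obs), list(prev_act)))

    def tokens(self):
        # One token per time step: observation concatenated with previous action.
        return [obs + act for obs, act in self.pairs]


def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def rollout(policy, env_step, first_obs, history, act_dim, num_steps):
    """Autoregressive loop: each predicted action is fed back into the history."""
    obs, prev_act, actions = first_obs, [0.0] * act_dim, []
    for _ in range(num_steps):
        history.push(obs, prev_act)
        act = policy(history.tokens())  # predict the next action from the window
        actions.append(act)
        obs, prev_act = env_step(obs, act), act
    return actions


# Toy stand-ins (hypothetical; not a trained model or a real robot).
def toy_policy(tokens):
    newest = tokens[-1]
    return [-0.5 * newest[0]]  # push the first observation entry toward zero


def toy_env_step(obs, act):
    return [obs[0] + act[0]]


history = ObsActHistory(context_len=4, obs_dim=1, act_dim=1)
actions = rollout(toy_policy, toy_env_step, [1.0], history, act_dim=1, num_steps=3)
print(actions)  # [[-0.5], [-0.25], [-0.125]]
```

The mask from `causal_mask` is what a causal Transformer would apply inside attention so that the prediction at step t depends only on steps up to t; the toy policy above does not need it because it reads only the newest token.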

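One-sentence summary:
A sim-to-real, learning-based approach built on a Transformer model is successfully deployed for real-world humanoid locomotion without state estimation, dynamics models, trajectory optimization, reference trajectories, or pre-computed gait libraries.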

Paper: https://arxiv.org/abs/2303.03381

We present a sim-to-real learning-based approach for real-world humanoid locomotion. Our controller is a causal Transformer trained by autoregressive prediction of future actions from the history of observations and actions. We hypothesize that the observation-action history contains useful information about the world that a powerful Transformer model can use to adapt its behavior in-context, without updating its weights. We do not use state estimation, dynamics models, trajectory optimization, reference trajectories, or pre-computed gait libraries. Our controller is trained with large-scale model-free reinforcement learning on an ensemble of randomized environments in simulation and deployed to the real world in a zero-shot fashion. We evaluate our approach in high-fidelity simulation and successfully deploy it to the real robot as well. To the best of our knowledge, this is the first demonstration of a fully learning-based method for real-world full-sized humanoid locomotion.
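
As a rough illustration of training on "an ensemble of randomized environments", the sketch below samples a separate set of physical parameters for each simulated environment. The parameter names and ranges in `RANGES`, and the helpers `sample_env_params` and `make_ensemble`, are hypothetical; the abstract does not say which quantities were randomized or over what ranges.

```python
import random

# Hypothetical parameter ranges, chosen only for illustration.
RANGES = {
    "ground_friction": (0.3, 1.5),
    "mass_scale": (0.8, 1.2),
    "motor_strength_scale": (0.9, 1.1),
}


def sample_env_params(rng):
    """Draw one environment's parameters uniformly from each range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()}


def make_ensemble(num_envs, seed=0):
    """Build a reproducible list of per-environment parameter sets."""
    rng = random.Random(seed)  # a private generator keeps sampling reproducible
    return [sample_env_params(rng) for _ in range(num_envs)]


ensemble = make_ensemble(num_envs=8)
for env_id, params in enumerate(ensemble):
    print(env_id, params)
```

A policy trained across such an ensemble never sees a single fixed set of dynamics, which is the usual motivation for expecting it to transfer to the real robot without fine-tuning.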

