Story-to-Motion: Synthesizing Infinite and Controllable Character Animation from Long Text

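November 13, 2023
  • Abstract
    Generating natural human motion from a story has the potential to transform the animation, gaming, and film industries. A new and challenging task, Story-to-Motion, arises when characters must move to different locations and perform specific motions based on a long text description. The task requires fusing low-level control (trajectories) with high-level control (motion semantics). Previous work on character control and text-to-motion has addressed related aspects, but a comprehensive solution remains elusive: character control methods cannot handle text descriptions, while text-to-motion methods lack position constraints and often produce unstable motions. In light of these limitations, we propose a novel system that generates controllable, infinitely long motions and trajectories aligned with the input text. (1) We use contemporary Large Language Models as a text-driven motion scheduler that extracts a series of (text, position, duration) triples from long text. (2) We develop a text-driven motion retrieval scheme that combines motion matching with motion semantic and trajectory constraints. (3) We design a progressive mask transformer that addresses common artifacts in transition motion, such as unnatural poses and foot sliding. Beyond its pioneering role as the first comprehensive solution for Story-to-Motion, our system is evaluated on three distinct sub-tasks (trajectory following, temporal action composition, and motion blending), where it outperforms previous state-of-the-art motion synthesis methods across the board. Homepage: https://story2motion.github.io/.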
  • Problem addressed
    The paper addresses the problem of generating natural human motion from a story, which requires fusing low-level control (trajectories) with high-level control (motion semantics). This is a new and challenging task with applications in the animation, gaming, and film industries.
  • Key idea
    The paper proposes a system that generates controllable, infinitely long motions and trajectories aligned with the input text. It has three parts: a contemporary Large Language Model acting as a text-driven motion scheduler, a text-driven motion retrieval scheme that combines motion matching with motion semantic and trajectory constraints, and a progressive mask transformer that addresses common artifacts in transition motion, such as unnatural poses and foot sliding.
  • Other highlights
    The system is evaluated on three distinct sub-tasks: trajectory following, temporal action composition, and motion blending. On all three it outperforms previous state-of-the-art motion synthesis methods. The authors present it as the first comprehensive solution for Story-to-Motion. The paper's homepage provides access to datasets and code. The proposed system has the potential to transform the animation, gaming, and film industries.
  • Related work
    Recent related work in this field includes 'Text2Gestures: A Transformer-Based Network for Generating Emotive Body Gestures for Virtual Agents' and 'Text2Action: Generative Adversarial Synthesis from Language to Action'.
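
The scheduling and retrieval steps described under the key idea can be illustrated with a small sketch. It is not the authors' implementation: the class names (`ScheduledAction`, `Clip`), the `retrieve` function, the toy embeddings, and the weight `w_traj` are all hypothetical, and the cost (cosine distance for semantics plus a weighted displacement error for the trajectory) is only one plausible way to combine the two constraints. The summary does not specify the paper's actual features or weights.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ScheduledAction:
    """One (text, position, duration) triple, as a scheduler might emit it."""
    text: str                      # action description, e.g. "walk"
    position: Tuple[float, float]  # target location on the ground plane
    duration: float                # seconds


@dataclass
class Clip:
    """A motion clip in a hypothetical database."""
    name: str
    embedding: List[float]             # assumed semantic embedding of the clip
    displacement: Tuple[float, float]  # net root translation over the clip


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query_embedding: List[float],
             desired_displacement: Tuple[float, float],
             clips: List[Clip],
             w_traj: float = 0.5) -> Clip:
    """Return the clip with the lowest combined semantic + trajectory cost."""
    def cost(clip: Clip) -> float:
        semantic = 1.0 - cosine(query_embedding, clip.embedding)
        trajectory = math.dist(desired_displacement, clip.displacement)
        return semantic + w_traj * trajectory
    return min(clips, key=cost)


# Toy database: two walking clips with opposite directions and one sitting clip.
CLIPS = [
    Clip("walk_forward", [1.0, 0.0], (1.0, 0.0)),
    Clip("walk_backward", [1.0, 0.0], (-1.0, 0.0)),
    Clip("sit", [0.0, 1.0], (0.0, 0.0)),
]
```

With this toy database, a query whose embedding matches "walk" and whose desired displacement points forward selects `walk_forward`: the semantic term alone cannot separate the two walking clips, so the trajectory term decides between them.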