From today's 爱可可 AI frontier picks

1、[CV] Point-E: A System for Generating 3D Point Clouds from Complex Prompts

A Nichol, H Jun, P Dhariwal, P Mishkin, M Chen
[OpenAI]

Point-E: a 3D point cloud generation system for complex prompts

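Key points:

  1. Proposes Point-E, which generates a 3D model in 1-2 minutes on a single GPU;
  2. Uses two diffusion models, combining the strengths of text-to-image and image-to-3D models, to generate 3D point clouds efficiently from text prompts.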

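Summary: Recent work on text-conditional 3D object generation has shown promising results, but state-of-the-art methods typically need multiple GPU-hours to produce one sample, whereas state-of-the-art generative image models produce samples in seconds or minutes. The paper explores an alternative approach that produces a 3D model in only 1-2 minutes on a single GPU. It first generates a single synthetic view with a text-to-image diffusion model, then generates a 3D point cloud with a second diffusion model conditioned on that image. Sample quality still falls short of the state of the art, but sampling is one to two orders of magnitude faster, which is a practical trade-off for some use cases.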

While recent work on text-conditional 3D object generation has shown promising results, the state-of-the-art methods typically require multiple GPU-hours to produce a single sample. This is in stark contrast to state-of-the-art generative image models, which produce samples in a number of seconds or minutes. In this paper, we explore an alternative method for 3D object generation which produces 3D models in only 1-2 minutes on a single GPU. Our method first generates a single synthetic view using a text-to-image diffusion model, and then produces a 3D point cloud using a second diffusion model which conditions on the generated image. While our method still falls short of the state-of-the-art in terms of sample quality, it is one to two orders of magnitude faster to sample from, offering a practical trade-off for some use cases. We release our pre-trained point cloud diffusion models, as well as evaluation code and models, at this https URL.
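The two-stage flow described in the abstract can be sketched as a pair of chained functions. This is a toy illustration only, not the released Point-E code: `text_to_image`, `image_to_point_cloud`, `generate`, and the `PointCloud` type are hypothetical names, and both stages are random placeholders standing in for the actual diffusion models. Attaching a color to each point is also an assumption of this sketch.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

RGB = Tuple[float, float, float]
Image = List[List[RGB]]  # rows of RGB pixels


@dataclass
class PointCloud:
    coords: List[Tuple[float, float, float]]  # one (x, y, z) per point
    colors: List[RGB]                         # one (r, g, b) per point


def text_to_image(prompt: str, size: int = 8, seed: int = 0) -> Image:
    """Hypothetical stage 1: placeholder for a text-to-image diffusion model.

    Returns a random size x size image whose content depends on the prompt.
    """
    rng = random.Random(f"{seed}:{prompt}")
    return [[(rng.random(), rng.random(), rng.random()) for _ in range(size)]
            for _ in range(size)]


def image_to_point_cloud(image: Image, n_points: int = 1024, seed: int = 0) -> PointCloud:
    """Hypothetical stage 2: placeholder for an image-conditioned point cloud
    diffusion model. Conditioning is mimicked by sampling colors from the image.
    """
    rng = random.Random(seed)
    pixels = [px for row in image for px in row]
    coords = [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
              for _ in range(n_points)]
    colors = [rng.choice(pixels) for _ in range(n_points)]
    return PointCloud(coords, colors)


def generate(prompt: str, n_points: int = 1024, seed: int = 0) -> PointCloud:
    """Chain the two stages: text -> single synthetic view -> 3D point cloud."""
    view = text_to_image(prompt, seed=seed)
    return image_to_point_cloud(view, n_points=n_points, seed=seed)


cloud = generate("a red traffic cone", n_points=256)
print(len(cloud.coords), len(cloud.colors))
```

The point of the split is that the second stage never sees the text: it conditions only on the image produced by the first stage, so either stage could be swapped out independently.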

Paper link: https://arxiv.org/abs/2212.08751