Tsinghua University | Diffusion-SDF: Text-to-Shape via Voxelized Diffusion

From today's 爱可可 AI frontier recommendations

[CV] Diffusion-SDF: Text-to-Shape via Voxelized Diffusion

M Li, Y Duan, J Zhou, J Lu
[Tsinghua University]

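Overview: The paper proposes Diffusion-SDF, a new generative 3D modeling framework for text-to-shape synthesis. The framework consists of an SDF autoencoder built on a UinU-Net architecture and a voxelized diffusion model, which together learn and generate voxelized signed distance field (SDF) representations of 3D shapes. Experiments show that Diffusion-SDF generates high-quality, diverse 3D shapes that closely match the text descriptions, and that it outperforms existing text-to-shape methods.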
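To make the term "voxelized SDF" concrete, here is a minimal sketch in plain Python. It samples the signed distance to a sphere on a small voxel grid and cuts the grid into non-overlapping patches. The sphere, the grid resolution, the patch size, and all function names are illustrative assumptions; none of them are taken from the paper, and the paper's autoencoder is not reproduced here.

```python
import math


def sphere_sdf(x, y, z, radius=0.5):
    """Signed distance to a sphere centred at the origin (negative inside)."""
    return math.sqrt(x * x + y * y + z * z) - radius


def voxelize_sdf(sdf, resolution=8, extent=1.0):
    """Sample `sdf` at the voxel centres of a resolution^3 grid over [-extent, extent]^3."""
    step = 2.0 * extent / resolution
    coords = [-extent + (i + 0.5) * step for i in range(resolution)]
    return [[[sdf(x, y, z) for z in coords] for y in coords] for x in coords]


def split_into_patches(grid, patch=4):
    """Cut a cubic grid into non-overlapping patch^3 blocks, keyed by block index."""
    n = len(grid)
    patches = {}
    for bx in range(0, n, patch):
        for by in range(0, n, patch):
            for bz in range(0, n, patch):
                key = (bx // patch, by // patch, bz // patch)
                patches[key] = [
                    [row[bz:bz + patch] for row in plane[by:by + patch]]
                    for plane in grid[bx:bx + patch]
                ]
    return patches


# Hypothetical toy settings: an 8^3 grid split into eight 4^3 patches.
grid = voxelize_sdf(sphere_sdf)
patches = split_into_patches(grid)
```

Each voxel stores a distance rather than an occupancy bit, so the surface is the zero level set of the grid: voxels with negative values lie inside the shape and voxels with positive values lie outside.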

Abstract: With the rising industrial attention to 3D virtual modeling technology, generating novel 3D content based on specified conditions (e.g. text) has become a hot issue. In this paper, we propose a new generative 3D modeling framework called Diffusion-SDF for the challenging task of text-to-shape synthesis. Previous approaches lack flexibility in both 3D data representation and shape generation, thereby failing to generate highly diversified 3D shapes conforming to the given text descriptions. To address this, we propose an SDF autoencoder together with the Voxelized Diffusion model to learn and generate representations for voxelized signed distance fields (SDFs) of 3D shapes. Specifically, we design a novel UinU-Net architecture that implants a local-focused inner network inside the standard U-Net architecture, which enables better reconstruction of patch-independent SDF representations. We extend our approach to further text-to-shape tasks including text-conditioned shape completion and manipulation. Experimental results show that Diffusion-SDF is capable of generating both high-quality and highly diversified 3D shapes that conform well to the given text descriptions. Diffusion-SDF has demonstrated its superiority compared to previous state-of-the-art text-to-shape approaches.
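For readers unfamiliar with diffusion models, the sketch below shows the generic forward (noising) step used in standard denoising diffusion, applied to a flat list of voxel values. It is an illustration of the general technique only: the linear noise schedule, the 1000-step count, and every function name are assumptions, not details from the paper, and the denoising network and text conditioning are omitted entirely.

```python
import math
import random


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Assumed linear noise schedule; the paper's schedule is not stated here."""
    span = beta_end - beta_start
    return [beta_start + span * t / (num_steps - 1) for t in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta_t), usually written alpha_bar_t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def add_noise(values, t, alpha_bars, rng):
    """Draw x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, with eps ~ N(0, 1)."""
    a = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in values]
    noisy = [math.sqrt(a) * v + math.sqrt(1.0 - a) * e for v, e in zip(values, noise)]
    return noisy, noise


rng = random.Random(0)  # fixed seed so the sketch is repeatable
alpha_bars = cumulative_alphas(linear_beta_schedule())
x0 = [0.3, -0.1, 0.0, 0.7]  # hypothetical voxel values
xt, eps = add_noise(x0, 500, alpha_bars, rng)
```

A generative model of this kind is trained to undo the noising, typically by predicting `eps` from `xt` and the step index; sampling then starts from pure noise and denoises step by step.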

Paper link: https://arxiv.org/abs/2212.03293
