From today's 爱可可 AI frontier picks
[CV] Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting
S Wang, C Saharia, C Montgomery, J Pont-Tuset, S Noy, S Pellegrini, Y Onoe, S Laszlo, D J. Fleet, R Soricut...
[Google Research]
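Key points:
- Imagen Editor is a cascaded diffusion model fine-tuned on text-guided image inpainting; object detectors propose the inpainting masks during training;
- EditBench is a systematic benchmark for text-guided image inpainting that supports fine-grained evaluation of inpainting edits on both natural and generated images, exploring objects, attributes, and scenes;
- Human evaluation on EditBench shows that object masking during training improves text-image alignment, and that current models are better at object rendering than at text rendering.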
Abstract:
Text-guided image editing can have a transformative impact in supporting creative applications. A key challenge is to generate edits that are faithful to input text prompts while remaining consistent with input images. We present Imagen Editor, a cascaded diffusion model built by fine-tuning Imagen on text-guided image inpainting. Imagen Editor's edits are faithful to the text prompts, which is accomplished by using object detectors to propose inpainting masks during training. In addition, Imagen Editor captures fine details in the input image by conditioning the cascaded pipeline on the original high-resolution image. To improve qualitative and quantitative evaluation, we introduce EditBench, a systematic benchmark for text-guided image inpainting. EditBench evaluates inpainting edits on natural and generated images, exploring objects, attributes, and scenes. Through extensive human evaluation on EditBench, we find that object masking during training leads to across-the-board improvements in text-image alignment, such that Imagen Editor is preferred over DALL-E 2 and Stable Diffusion. We also find that, as a cohort, these models are better at object rendering than at text rendering, and handle material/color/size attributes better than count/shape attributes.
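To make the object-masking idea concrete, here is a minimal sketch of how a detector's bounding boxes could be turned into inpainting masks for training. It is an illustration only, not the authors' code: the function names (`box_to_mask`, `random_box`, `propose_training_mask`), the box format, and the random-box fallback used when nothing is detected are all assumptions, and the detector itself is left out (its output is passed in as a plain list of boxes).

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical box format: (x0, y0, x1, y1) in pixels, half-open on the right/bottom.
Box = Tuple[int, int, int, int]


def box_to_mask(box: Box, height: int, width: int) -> List[List[int]]:
    """Return a height x width binary mask with 1s inside the box, 0s elsewhere."""
    x0, y0, x1, y1 = box
    return [
        [1 if (x0 <= x < x1 and y0 <= y < y1) else 0 for x in range(width)]
        for y in range(height)
    ]


def random_box(height: int, width: int, rng: random.Random) -> Box:
    """Sample a non-empty random box inside the image (assumed fallback)."""
    x0 = rng.randrange(width)
    y0 = rng.randrange(height)
    x1 = rng.randint(x0 + 1, width)   # inclusive upper bound, so x1 > x0
    y1 = rng.randint(y0 + 1, height)  # inclusive upper bound, so y1 > y0
    return (x0, y0, x1, y1)


def propose_training_mask(
    detected_boxes: Sequence[Box],
    height: int,
    width: int,
    rng: random.Random,
) -> List[List[int]]:
    """Mask one detected object if any exist; otherwise mask a random box."""
    if detected_boxes:
        box = rng.choice(list(detected_boxes))
    else:
        box = random_box(height, width, rng)
    return box_to_mask(box, height, width)


if __name__ == "__main__":
    rng = random.Random(0)
    # One hypothetical detection covering columns 1-2 and rows 1-3 of a 5x4 image.
    mask = propose_training_mask([(1, 1, 3, 4)], height=5, width=4, rng=rng)
    for row in mask:
        print(row)
```

The intuition behind masking whole objects is that the model then cannot rely on the visible image context alone to fill the region and has to use the text prompt, which is consistent with the improved text-image alignment the abstract reports.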