An image-inpainting paper has been blowing up over the past few days (nearly a thousand upvotes on Reddit). Here is what the demo looks like:
The work comes from the Samsung AI Center in Moscow.
Paper: Resolution-robust Large Mask Inpainting with Fourier Convolutions
Code: https://github.com/saic-mdal/lama
Abstract
Modern image inpainting systems, despite the significant progress, often struggle with large missing areas, complex geometric structures, and high-resolution images. We find that one of the main reasons for that is the lack of an effective receptive field in both the inpainting network and the loss function. To alleviate this issue, we propose a new method called large mask inpainting (LaMa). LaMa is based on i) a new inpainting network architecture that uses fast Fourier convolutions, which have the image-wide receptive field; ii) a high receptive field perceptual loss; and iii) large training masks, which unlocks the potential of the first two components. Our inpainting network improves the state-of-the-art across a range of datasets and achieves excellent performance even in challenging scenarios, e.g. completion of periodic structures. Our model generalizes surprisingly well to resolutions that are higher than those seen at train time, and achieves this at lower parameter&compute costs than the competitive baselines.
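The fast Fourier convolution mentioned in the abstract is what gives the network its image-wide receptive field: feature maps are transformed with a real FFT, mixed by an ordinary 1×1 convolution in the frequency domain, transformed back, and added to a regular local convolution branch. Below is a minimal PyTorch sketch of that idea; the module names and hyperparameters are illustrative placeholders, not LaMa's actual implementation.

```python
# Hedged sketch of the spectral-transform idea behind Fast Fourier Convolution (FFC).
# Layer names and sizes are illustrative, not the authors' exact code.
import torch
import torch.nn as nn


class SpectralTransform(nn.Module):
    """Global branch of an FFC-style block: a 1x1 conv applied in the Fourier
    domain, so every output pixel depends on the whole input feature map."""

    def __init__(self, channels: int):
        super().__init__()
        # Real and imaginary parts are stacked along the channel axis,
        # hence 2 * channels in and out.
        self.freq_conv = nn.Sequential(
            nn.Conv2d(2 * channels, 2 * channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(2 * channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Real FFT over the spatial dims: (B, C, H, W//2 + 1), complex-valued.
        freq = torch.fft.rfft2(x, norm="ortho")
        # Treat real/imag as extra channels so an ordinary conv can mix them.
        freq = torch.cat([freq.real, freq.imag], dim=1)
        freq = self.freq_conv(freq)
        real, imag = freq.chunk(2, dim=1)
        # Back to the spatial domain at the original resolution.
        return torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")


class FFCBlock(nn.Module):
    """Minimal FFC-style block: a local conv branch plus the global spectral branch."""

    def __init__(self, channels: int):
        super().__init__()
        self.local_branch = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.global_branch = SpectralTransform(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.local_branch(x) + self.global_branch(x))


if __name__ == "__main__":
    block = FFCBlock(channels=64)
    print(block(torch.randn(1, 64, 256, 256)).shape)  # torch.Size([1, 64, 256, 256])
```

Because the FFT adapts to whatever spatial size it receives, the same weights can be run at resolutions never seen during training, which is consistent with the resolution-robustness claim in the abstract.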
It is worth mentioning that fast Fourier convolution itself comes from work by Yadong Mu's group at Peking University, published at NeurIPS last year: https://papers.nips.cc/paper/2020/hash/2fd5d41ec6cfab47e32164d5624269b1-Abstract.html
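The other ingredient stressed in the abstract is large training masks: the holes used during training mix thick random strokes with big boxes, so the network is forced to exploit its global receptive field rather than copy nearby pixels. The sketch below illustrates one way such masks could be sampled; all counts, thicknesses, and box sizes are assumed values, not the paper's actual mask generator.

```python
# Hedged sketch of "large training masks": thick random strokes plus big boxes
# that can cover a large fraction of the image. All parameters are illustrative.
import numpy as np
import cv2


def sample_large_mask(h: int = 256, w: int = 256) -> np.ndarray:
    """Return a binary mask of shape (h, w), where 1 marks pixels to inpaint."""
    mask = np.zeros((h, w), dtype=np.uint8)
    rng = np.random.default_rng()

    # A few thick polyline strokes ("wide" masks).
    for _ in range(rng.integers(1, 5)):
        pts = rng.integers(0, [w, h], size=(rng.integers(2, 6), 2)).astype(np.int32)
        thickness = int(rng.integers(h // 10, h // 4))
        cv2.polylines(mask, [pts.reshape(-1, 1, 2)], isClosed=False,
                      color=1, thickness=thickness)

    # Occasionally add a large rectangle ("box" masks).
    for _ in range(rng.integers(0, 3)):
        x0, y0 = int(rng.integers(0, w // 2)), int(rng.integers(0, h // 2))
        bw, bh = int(rng.integers(w // 4, w // 2)), int(rng.integers(h // 4, h // 2))
        mask[y0:y0 + bh, x0:x0 + bw] = 1

    return mask


if __name__ == "__main__":
    m = sample_large_mask()
    print(m.shape, m.mean())  # mask shape and the fraction of masked pixels
```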