From today's 爱可可 AI frontier picks.

[LG] Broadly applicable and accurate protein design by integrating structure prediction networks and diffusion generative models

J L. Watson, D Juergens, N R. Bennett…
[University of Washington]

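Summary: RoseTTAFold Diffusion (RFdiffusion) is a protein design framework that combines a structure prediction network with a diffusion generative model, reaching state-of-the-art performance across a range of protein design challenges. Compared with earlier deep learning methods for protein design, RFdiffusion offers finer control and handles more complex targets, and it could be extended to nucleic acids, small-molecule-binding proteins, and fold specification.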

Abstract:

Deep learning methods for protein design have shown considerable promise for sequence design, scaffolding functional sites, and building new monomers, cyclic oligomers, and antibody loops. Despite this progress, a general framework for protein design that enables solution of a wide range of design challenges, including de novo binder design and design of higher order symmetric architectures, has yet to be described. Diffusion models have had considerable success in image and language generative modeling, and have been applied to the protein monomer generation problem, but with limited success, likely due to the complexity of protein backbone geometry and sequence-structure relationships. Here we show that by utilizing powerful structure prediction methods as diffusion denoising networks, we can leverage the protein representations they have learned. We demonstrate state of the art performance on unconditional and topology constrained protein monomer design, protein and peptide binder design, symmetric oligomer design, enzyme active site scaffolding, and symmetric motif scaffolding for therapeutic and metal-binding protein design. We demonstrate the power and generality of the method, called RoseTTAFold Diffusion (RFdiffusion), by experimentally characterizing hundreds of new designs. Highlights include a picomolar binder to parathyroid hormone, considerably higher affinity than any previous computational designed binder prior to experimental optimization, and a series of not-previously-observed symmetric assemblies experimentally confirmed by electron microscopy. In a manner somewhat reminiscent of networks which produce images from user-specified inputs, RFdiffusion makes accessible the design of diverse and complex protein architectures and functions from simple semantic molecular specifications.
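
To make the central idea concrete, here is a minimal toy sketch of reverse diffusion over 3D backbone coordinates in which the "denoiser" is a structure predictor that guesses the clean structure at every step. It is not the authors' implementation. It assumes a standard DDPM sampler with a linear noise schedule and an x0-prediction parameterization, and every name in it (`make_alpha_bars`, `toy_structure_predictor`, `sample_backbone`) is hypothetical. The stand-in predictor ignores its input and always returns an idealized straight chain with 3.8 Å spacing, so the example shows only the control flow, not real protein geometry.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bars = []
    prod = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def toy_structure_predictor(noisy_coords, t):
    """Hypothetical stand-in for a structure prediction network.

    A real network would map the noisy coordinates (and the timestep) to a
    predicted clean structure. This toy ignores both and returns a straight
    chain with 3.8 Angstrom spacing along the x axis.
    """
    return [(3.8 * i, 0.0, 0.0) for i in range(len(noisy_coords))]


def sample_backbone(num_residues, num_steps=50, seed=0,
                    predictor=toy_structure_predictor):
    """Run DDPM reverse diffusion using the predictor's x0 estimate."""
    rng = random.Random(seed)
    alpha_bars = make_alpha_bars(num_steps)

    # Start from pure Gaussian noise, one 3D point per residue.
    coords = [tuple(rng.gauss(0.0, 1.0) for _ in range(3))
              for _ in range(num_residues)]

    for t in reversed(range(num_steps)):
        ab_t = alpha_bars[t]
        ab_prev = alpha_bars[t - 1] if t > 0 else 1.0
        alpha_t = ab_t / ab_prev
        beta_t = 1.0 - alpha_t

        # The structure predictor plays the role of the denoising network.
        x0_hat = predictor(coords, t)

        # Gaussian posterior q(x_{t-1} | x_t, x0_hat) of the forward process.
        coef_x0 = math.sqrt(ab_prev) * beta_t / (1.0 - ab_t)
        coef_xt = math.sqrt(alpha_t) * (1.0 - ab_prev) / (1.0 - ab_t)
        sigma = math.sqrt(beta_t * (1.0 - ab_prev) / (1.0 - ab_t))

        coords = [
            tuple(coef_x0 * p + coef_xt * x + sigma * rng.gauss(0.0, 1.0)
                  for p, x in zip(pred, cur))
            for pred, cur in zip(x0_hat, coords)
        ]
    return coords


if __name__ == "__main__":
    for point in sample_backbone(5):
        print(tuple(round(v, 3) for v in point))
```

At the last step the posterior variance is zero and all the weight falls on the predicted structure, so the sampler returns the predictor's final guess. In this sketch, swapping in a learned predictor would change only the `predictor` argument, which is the sense in which a structure prediction network can be reused as the denoiser.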

Paper link: https://bakerlab.org/wp-content/uploads/2022/11/Diffusion_preprint_12012022.pdf
