This post comes from today's 爱可可 frontier paper recommendations.

[LG] A high-level programming language for generative protein design

B Hie, S Candido, Z Lin, O Kabeli, R Rao, N Smetanin, T Sercu, A Rives
[FAIR]

Key points:

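  1. Combining basic building blocks into complex forms is a universal design principle;
  2. The authors design a high-level programming language based on modular building blocks that lets a designer easily compose a set of desired properties;
  3. They develop an energy-based generative model, built on atomic-resolution structure prediction with a language model, that realizes all-atom structure designs with the programmed properties.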

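Summary: Combining a basic set of building blocks into more complex forms is a universal design principle. Most protein design has followed a manual bottom-up approach that reuses parts created by nature, while top-down design of proteins is fundamentally hard because of biological complexity. This paper demonstrates how the modularity and programmability long sought for protein design can be realized through generative AI. Advanced protein language models show emergent learning of atomic-resolution structure and of protein design principles. The authors leverage these advances to enable programmable design of de novo protein sequences and structures of high complexity. They describe a high-level programming language based on modular building blocks that lets a designer easily compose a set of desired properties, and they develop an energy-based generative model, built on atomic-resolution structure prediction with a language model, that realizes all-atom structure designs with the programmed properties. Designing against a diverse set of specifications, including constraints on atomic coordinates, secondary structure, symmetry, and multimerization, demonstrates the generality and controllability of the approach. Enumerating constraints at increasing levels of hierarchical complexity shows that the approach can access a combinatorially large design space.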

https://biorxiv.org/content/10.1101/2022.12.21.521526v1

Combining a basic set of building blocks into more complex forms is a universal design principle. Most protein designs have proceeded from a manual bottom-up approach using parts created by nature, but top-down design of proteins is fundamentally hard due to biological complexity. We demonstrate how the modularity and programmability long sought for protein design can be realized through generative artificial intelligence. Advanced protein language models demonstrate emergent learning of atomic resolution structure and protein design principles. We leverage these developments to enable the programmable design of de novo protein sequences and structures of high complexity. First, we describe a high-level programming language based on modular building blocks that allows a designer to easily compose a set of desired properties. We then develop an energy-based generative model, built on atomic resolution structure prediction with a language model, that realizes all-atom structure designs that have the programmed properties. Designing a diverse set of specifications, including constraints on atomic coordinates, secondary structure, symmetry, and multimerization, demonstrates the generality and controllability of the approach. Enumerating constraints at increasing levels of hierarchical complexity shows that the approach can access a combinatorially large design space.
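To make the idea of "composing constraints into an energy and sampling against it" concrete, here is a minimal toy sketch in Python. It is not the paper's language or model: the names `Program`, `symmetric`, `hydrophobic_fraction`, and `anneal` are hypothetical, the energy terms score the sequence directly rather than a predicted all-atom structure, and simulated annealing is used only as a simple stand-in sampler. The one thing it is meant to show is that independent constraint terms can be combined as a weighted sum and minimized jointly.

```python
import math
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# An energy term maps a sequence to a non-negative penalty (0 means satisfied).
EnergyTerm = Callable[[str], float]


@dataclass
class Program:
    """Hypothetical 'design program': a weighted sum of constraint energies."""
    terms: List[EnergyTerm]
    weights: List[float]

    def energy(self, seq: str) -> float:
        return sum(w * t(seq) for w, t in zip(self.weights, self.terms))


def symmetric(n_repeats: int) -> EnergyTerm:
    """Toy symmetry constraint: count mismatches between repeat units."""
    def term(seq: str) -> float:
        unit = len(seq) // n_repeats
        first = seq[:unit]
        mismatches = 0
        for r in range(1, n_repeats):
            block = seq[r * unit:(r + 1) * unit]
            mismatches += sum(a != b for a, b in zip(first, block))
        return float(mismatches)
    return term


def hydrophobic_fraction(target: float) -> EnergyTerm:
    """Toy composition constraint: distance from a target hydrophobic fraction."""
    def term(seq: str) -> float:
        frac = sum(c in "AILMFVW" for c in seq) / len(seq)
        return abs(frac - target)
    return term


def anneal(program: Program, length: int, steps: int = 5000,
           t_start: float = 1.0, t_end: float = 0.01,
           seed: int = 0) -> Tuple[str, float]:
    """Minimize the program energy with single-site mutations and Metropolis acceptance."""
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    energy = program.energy("".join(seq))
    best_seq, best_energy = "".join(seq), energy
    for step in range(steps):
        # Geometric cooling schedule from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)
        new_energy = program.energy("".join(seq))
        accept = new_energy <= energy or \
            rng.random() < math.exp((energy - new_energy) / temp)
        if accept:
            energy = new_energy
            if energy < best_energy:
                best_seq, best_energy = "".join(seq), energy
        else:
            seq[pos] = old  # reject: undo the mutation
    return best_seq, best_energy
```

A call such as `anneal(Program([symmetric(3), hydrophobic_fraction(0.4)], [1.0, 5.0]), length=30)` returns the lowest-energy sequence found and its energy. In the approach the abstract describes, the terms would instead score structures predicted by a language model, so that constraints on atomic coordinates or secondary structure can be expressed.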
