Scientific research is having its LLM agent moment

Just as the science of AI agents advances, AI agents are beginning to advance science.

The past year has seen explosive growth of interest in AI agents (https://trends.google.com/trends/explore?geo=US&q=ai%20agents&hl=en), particularly for augmenting, or entirely automating, human labor in enterprise settings (https://www.buildingaiagents.ai/p/big-tech-all-in-on-agents). However, their ability to speed up difficult cognitive work has led AI agents to be increasingly employed in another setting: scientific research (https://arxiv.org/abs/2412.11427).

One obvious application, which dates back to the very beginning of the large language model boom, is summarizing the state of research on a given subject. Many scientists find this one of the most burdensome parts of the job, as the existing corpus of literature (much of it low-quality) keeps growing exponentially. Almost immediately after models such as GPT-3.5 became available, researchers and hackers began using them to build simple software that scrapes this vast body of text and distills it into useful insights.

However, these research summarizers do not qualify as agents, as they lack any sort of planning, self-reflection, or ability to perform sophisticated actions. With the steady rise of agentic AI, a new generation of LLM-based scientific applications has aimed to automate the entire process of scientific discovery, from hypothesis generation to experimentation to manuscript production. In August of last year, Sakana AI (https://sakana.ai/blog/) created a stir with its “AI Scientist” (https://sakana.ai/ai-scientist), which purported to produce novel research in machine learning, though critics were quick to point out its limitations (https://arstechnica.com/information-technology/2024/08/research-ai-model-unexpectedly-modified-its-own-code-to-extend-runtime). Since then, numerous other groups have sought to create agents capable of automating machine learning (https://arxiv.org/abs/2501.04227) or data science (https://arxiv.org/abs/2408.09667) research, leading some AI safety researchers to monitor these systems’ potential to create self-improving AI (https://metr.org/AI_R_D_Evaluation_Report.pdf).

While machine learning research is a tempting target for science agents because it can be performed entirely in silico, scientists have also begun to apply them to the harder sciences. Many of these cases invert people’s expectations of AI: instead of a human scientist directing robot workers, an AI agent designs constructs such as nanobodies against SARS-CoV-2 (https://www.biorxiv.org/content/10.1101/2024.11.11.623004v1.full), and a human worker creates and tests them in a lab.

With a human in the loop, however, these efforts still fall short of the Holy Grail of AI agent-powered research: a “self-driving lab” (https://www.sciencedirect.com/science/article/pii/S0092867424010705) in which every step, including benchtop experiments, is executed entirely by computers. As ambitious as this sounds, researchers have begun making strides towards it. In early 2024, two groups debuted ORGANA (https://arxiv.org/abs/2401.06949) and ChemCrow (https://www.nature.com/articles/s42256-024-00832-8) respectively, sophisticated LLM agents capable of planning the synthesis of chemical compounds with given properties, then actually carrying out those syntheses in a real lab via robotic synthesis platforms. At least one startup (https://techcrunch.com/2024/12/22/tetsuwan-scientific-is-making-robotic-ai-scientists-that-can-run-experiments-on-their-own/), inspired by their work, is seeking to commercialize similar technology.

While enterprise AI agents may seem dull to some, evoking images of spreadsheets and office drudgery, or raising fears of job displacement, scientific agents hold an unambiguously hopeful promise: drastically speeding up the entire enterprise of science, saving millions of lives through medical advances, and tackling such global issues as climate change and food insecurity. If this potential is ultimately fulfilled, they could constitute AI agents’ most important contribution to humanity.

