MIT | Dissociating language and thought in large language models

From today's 爱可可 AI frontier recommendations

[CL] Dissociating language and thought in large language models: a cognitive perspective

K Mahowald, A A. Ivanova...
[The University of Texas at Austin & MIT & University of California Los Angeles]

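Key points:

Contemporary LLMs should be taken seriously as models of formal linguistic skills;
Models that master real-life language use need to incorporate or develop not only a core language module, but also the multiple non-language-specific cognitive capacities required for modeling thought;
Being good at next-word prediction alone is not enough to reach human-like AGI.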

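One-sentence summary:
Contemporary LLMs should be taken seriously as models of formal linguistic skills, but a model that masters real-life language use must also integrate multiple non-language-specific cognitive capacities in order to understand and use language in human-like ways.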

Abstract:

Today's large language models (LLMs) routinely generate coherent, grammatical and seemingly meaningful paragraphs of text. This achievement has led to speculation that these networks are -- or will soon become -- "thinking machines", capable of performing tasks that require abstract knowledge and reasoning. Here, we review the capabilities of LLMs by considering their performance on two different aspects of language use: 'formal linguistic competence', which includes knowledge of rules and patterns of a given language, and 'functional linguistic competence', a host of cognitive abilities required for language understanding and use in the real world. Drawing on evidence from cognitive neuroscience, we show that formal competence in humans relies on specialized language processing mechanisms, whereas functional competence recruits multiple extralinguistic capacities that comprise human thought, such as formal reasoning, world knowledge, situation modeling, and social cognition. In line with this distinction, LLMs show impressive (although imperfect) performance on tasks requiring formal linguistic competence, but fail on many tests requiring functional competence. Based on this evidence, we argue that (1) contemporary LLMs should be taken seriously as models of formal linguistic skills; (2) models that master real-life language use would need to incorporate or develop not only a core language module, but also multiple non-language-specific cognitive capacities required for modeling thought. Overall, a distinction between formal and functional linguistic competence helps clarify the discourse surrounding LLMs' potential and provides a path toward building models that understand and use language in human-like ways.

Paper link: https://arxiv.org/abs/2301.06627
