MIT | GPT-4 Can't Reason

GPT-4 Can't Reason

K Arkoudas
[MIT]

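GPT-4 has serious deficiencies in logical reasoning. The paper poses 21 simple reasoning problems to GPT-4; it solves none of them correctly and repeatedly makes internally contradictory statements.
GPT-4 fails at basic arithmetic, counting, and spatial reasoning, and does not correctly apply basic principles of combinatorics, graph theory, propositional logic, and quantifier logic.
GPT-4 cannot carry out simple arguments by contradiction, misreads the semantics of conditional statements, and often repeats the same mistake.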
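Errors about empirical facts might be mitigated by connecting GPT-4 to a search engine or knowledge graph, but logical reasoning requires internal consistency, which is a harder problem.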
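Delegating complex reasoning to external systems is also difficult, because planning and decomposing a problem themselves require reasoning.
At present GPT-4 is entirely unable to reason; its errors are too widespread and too severe, which refutes the claim that GPT-4 reasons at an average human level.
Using generative AI in science, medicine, and engineering carries serious risks; normative standards of reasoning are essential.
If LLM reasoning continues to improve, rigorous proof checking may become increasingly important.
For now, the frightening scenario of LLMs taking control of humanity is pure fantasy.
Through detailed qualitative analysis rather than statistical metrics, the paper examines GPT-4's reasoning ability in depth and concludes that it is entirely unable to reason.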
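
To make the points about conditional semantics and argument by contradiction concrete, here is a minimal Python sketch. It is not taken from the paper; the helper names `implies` and `is_valid` are hypothetical, chosen only for illustration. It treats "if p then q" as material implication and checks an argument form by brute force over all truth assignments, which is roughly the kind of mechanical check that a sound reasoner should never get wrong.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication: 'if p then q' is false only when p is true and q is false."""
    return (not p) or q


def is_valid(premises, conclusion, num_vars: int) -> bool:
    """An argument is valid if no truth assignment makes every premise true
    while making the conclusion false."""
    for values in product([False, True], repeat=num_vars):
        if all(prem(*values) for prem in premises) and not conclusion(*values):
            return False  # counterexample found
    return True


# Modus tollens: from (p -> q) and (not q), infer (not p).
modus_tollens = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: not q],
    lambda p, q: not p,
    2,
)

# Affirming the consequent: from (p -> q) and q, infer p.
affirming_consequent = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: q],
    lambda p, q: p,
    2,
)

print(modus_tollens, affirming_consequent)
```

Under these assumptions the first form is reported as valid and the second as invalid: the assignment p = False, q = True satisfies both premises of the second form while falsifying its conclusion.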

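Motivation: to examine GPT-4's reasoning ability and analyze its performance in depth across multiple reasoning tasks.
Method: a detailed qualitative analysis of GPT-4's performance on 21 different reasoning problems.
Strengths: offers in-depth insight into GPT-4's reasoning ability and exposes its limitations on reasoning tasks.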

Although GPT-4 has made notable progress in many respects, its performance on reasoning tasks still falls clearly short.

GPT-4 was released in March 2023 to wide acclaim, marking a very substantial improvement across the board over GPT-3.5 (OpenAI’s previously best model, which had powered the initial release of ChatGPT). Despite the genuinely impressive improvement, however, there are good reasons to be highly skeptical of GPT-4’s ability to reason. This position paper discusses the nature of reasoning; criticizes the current formulation of reasoning problems in the NLP community and the way in which the reasoning performance of LLMs is currently evaluated; introduces a collection of 21 diverse reasoning problems; and performs a detailed qualitative analysis of GPT-4’s performance on these problems. Based on the results of that analysis, this paper argues that, despite the occasional flashes of analytical brilliance, GPT-4 at present is utterly incapable of reasoning.

https://preprints.org/manuscript/202308.0148/v1
