[CL] Navigating the Grey Area: Expressions of Overconfidence and Uncertainty in Language Models

K Zhou, D Jurafsky, T Hashimoto
[Stanford University]

Key points:

  1. Language models (LMs) lack the ability to generate and interpret expressions of uncertainty, which are crucial for human decision-making and communication;

  2. The accuracy of LM generations varies widely (by up to 80%) depending on which expression of uncertainty is used in the prompt;

  3. When LMs are taught to emit expressions of certainty rather than uncertainty, model calibration suffers, and naturalistic expressions of certainty lead to a drop in accuracy (a calibration sketch follows this list);

  4. The paper provides a framework for analyzing the interplay between expressions of uncertainty and LMs, introduces a typology for evaluating how linguistic features affect LM generations, and offers recommendations for natural language generation and for naturalistic expressions of uncertainty.
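Calibration here means how well the model's stated confidence matches its actual accuracy. As a reference point only, not the paper's evaluation code, here is a minimal Python sketch of the standard Expected Calibration Error (ECE) metric; the toy confidences and labels are illustrative:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: average gap between stated confidence and actual accuracy,
    weighted by the fraction of examples in each confidence bin.
    A well-calibrated model has ECE near 0."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # bin weight * |accuracy - confidence|
    return ece

# Toy example: a model that "sounds certain" (~95% stated confidence)
# but is right only 60% of the time is poorly calibrated (high ECE).
print(expected_calibration_error([0.95, 0.90, 0.99, 0.92, 0.97],
                                 [1, 0, 1, 0, 1]))
```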

One-sentence summary:
LMs struggle to interpret and produce expressions of uncertainty, which makes it hard to build models that generate trustworthy language.

Despite increasingly fluent, relevant, and coherent language generation, major gaps remain between how humans and machines use language. We argue that a key dimension that is missing from our understanding of language models (LMs) is the model's ability to interpret and generate expressions of uncertainty. Whether it be the weatherperson announcing a chance of rain or a doctor giving a diagnosis, information is often not black-and-white and expressions of uncertainty provide nuance to support human decision-making. The increasing deployment of LMs in the wild motivates us to investigate whether LMs are capable of interpreting expressions of uncertainty and how LMs' behaviors change when learning to emit their own expressions of uncertainty. When injecting expressions of uncertainty into prompts (e.g., "I think the answer is..."), we discover that GPT3's generations vary upwards of 80% in accuracy based on the expression used. We analyze the linguistic characteristics of these expressions and find a drop in accuracy when naturalistic expressions of certainty are present. We find similar effects when teaching models to emit their own expressions of uncertainty, where model calibration suffers when teaching models to emit certainty rather than uncertainty. Together, these results highlight the challenges of building LMs that interpret and generate trustworthy expressions of uncertainty.
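To make the prompt-injection setup concrete, here is a minimal sketch assuming a HuggingFace text-generation pipeline; gpt2 (standing in for GPT-3), the single toy QA pair, and the substring-match scoring rule are all illustrative assumptions, not the authors' implementation:

```python
from transformers import pipeline

# Hypothetical stand-ins: gpt2 for GPT-3, one toy QA pair for a benchmark.
generator = pipeline("text-generation", model="gpt2")

expressions = [
    "I think the answer is",      # weakened / uncertain
    "I'm certain the answer is",  # strengthened / certain
    "The answer is",              # plain baseline
]
qa_pairs = [("What is the capital of France", "Paris")]

for expr in expressions:
    hits = 0
    for question, gold in qa_pairs:
        prompt = f"{question}? {expr}"
        completion = generator(prompt, max_new_tokens=5,
                               do_sample=False)[0]["generated_text"]
        # Score only the continuation, not the prompt itself.
        hits += int(gold.lower() in completion[len(prompt):].lower())
    print(f"{expr!r}: accuracy = {hits / len(qa_pairs):.2f}")
```

Sweeping `expressions` over a typology of hedges and boosters, as the paper does, yields the kind of per-expression accuracy comparison described in the abstract.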

Paper link: https://arxiv.org/abs/2302.13439