Today's recommendation from 爱可可
[CL] Successive Prompting for Decomposing Complex Questions
D Dua, S Gupta, S Singh, M Gardner
[University of California, Irvine & Microsoft]
Successive Prompting is a method that breaks a complex question down into simple QA pairs, so that modular question-decomposition (QD) and question-answering (QA) systems can be trained and queried independently; this modular approach solves complex tasks more effectively than a single large language model on its own.
https://arxiv.org/abs/2212.04092
Answering complex questions that require making latent decisions is a challenging task, especially when limited supervision is available. Recent works leverage the capabilities of large language models (LMs) to perform complex question answering in a few-shot setting by demonstrating how to output intermediate rationalizations while solving the complex question in a single pass. We introduce "Successive Prompting", where we iteratively break down a complex task into a simple task, solve it, and then repeat the process until we get the final solution. Successive prompting decouples the supervision for decomposing complex questions from the supervision for answering simple questions, allowing us to (1) have multiple opportunities to query in-context examples at each reasoning step, (2) learn question decomposition separately from question answering, including using synthetic data, and (3) use bespoke (fine-tuned) components for reasoning steps where a large LM does not perform well. The intermediate supervision is typically manually written, which can be expensive to collect. We introduce a way to generate a synthetic dataset which can be used to bootstrap a model's ability to decompose and answer intermediate questions. Our best model (with successive prompting) achieves an improvement of ~5% absolute F1 on a few-shot version of the DROP dataset when compared with a state-of-the-art model with the same supervision.
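The iterative decompose-then-answer loop the abstract describes can be sketched as below. This is a minimal illustration, not the authors' implementation: `query_lm` is a hypothetical stand-in for whatever LM (or fine-tuned component) plays the decomposer, QA, and final-answer roles, and `toy_lm` is a hard-coded stub so the loop can run end to end.

```python
def successive_prompting(complex_question, query_lm, max_steps=8):
    """Iteratively ask for the next simple sub-question, answer it, and feed
    the growing QA history back in, until the decomposer signals that the
    complex question is fully resolved. `query_lm` is a hypothetical interface
    taking (mode, question, history) and returning the model's output."""
    history = []  # accumulated (sub_question, answer) pairs
    for _ in range(max_steps):
        sub_q = query_lm("decompose", complex_question, history)
        if sub_q is None:  # decomposer says no further sub-questions are needed
            break
        answer = query_lm("answer", sub_q, history)
        history.append((sub_q, answer))
    # Produce the final answer from the complex question plus the QA history.
    return query_lm("final", complex_question, history), history


def toy_lm(mode, question, history):
    """Hard-coded stub standing in for the LM, for illustration only.
    It 'decomposes' a DROP-style comparison question into two lookups."""
    if mode == "decompose":
        if len(history) == 0:
            return "How many touchdowns did A score?"
        if len(history) == 1:
            return "How many touchdowns did B score?"
        return None  # done decomposing
    if mode == "answer":
        return {"How many touchdowns did A score?": 3,
                "How many touchdowns did B score?": 1}[question]
    # mode == "final": a simple symbolic step (subtraction) over the history,
    # the kind of reasoning step the paper suggests delegating to a bespoke component
    return history[0][1] - history[1][1]
```

Note how the loop makes a fresh LM query at each reasoning step (point 1 of the abstract) and keeps decomposition and answering behind separate calls (point 2), so either role could be swapped for a fine-tuned component (point 3).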