Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces

October 13, 2024
  • Introduction
    In theories of human cognition, thinking is governed by two systems: System 1, which is fast and intuitive, and System 2, which is slower but more deliberate. Recent research has shown that incorporating System 2-style processes into Transformers, including large language models (LLMs), significantly enhances their reasoning capabilities. However, models that purely emulate System 2 thinking incur substantially higher computational costs and are slower to respond. To address this challenge, we propose Dualformer, a single Transformer model that seamlessly integrates both fast and slow reasoning modes. Dualformer is obtained by training on data with randomized reasoning traces, where different parts of the traces are dropped during training. The dropping strategies are specifically tailored to the trace structure, analogous to how we analyze our own thinking process and create shortcuts as patterns emerge. At inference time, the model can be configured to output only the solution (fast mode), both the reasoning chain and the final solution (slow mode), or to automatically decide which mode to engage (auto mode). In all cases, Dualformer outperforms the corresponding baseline models in both performance and computational efficiency: (1) in slow mode, Dualformer optimally solves unseen 30 x 30 maze navigation tasks 97.6% of the time, surpassing the Searchformer baseline (trained on data with complete reasoning traces) at 93.3%, while using 45.5% fewer reasoning steps; (2) in fast mode, Dualformer completes these tasks with an 80% optimal rate, significantly outperforming the Solution-Only model (trained only on solution data), whose optimal rate is just 30%. For math problems, our technique also yields improvements when used for LLM fine-tuning, demonstrating its generalization beyond task-specific models.
  • Problem Addressed
    How to integrate fast (System 1) and slow (System 2) reasoning within a single Transformer model, so that strong reasoning performance can be obtained without the high computational cost and slow responses of models that always generate full reasoning traces.
  • Key Idea
    The paper proposes Dualformer, a single Transformer model that integrates both fast and slow reasoning modes to enhance reasoning. This is achieved by training on data with randomized reasoning traces, where different parts of the traces are dropped during training, and the dropping strategies are tailored to the trace structure. At inference time, the model can be configured to output only the solution (fast mode), both the reasoning chain and the final solution (slow mode), or to automatically decide which mode to engage (auto mode); a toy code sketch of this recipe appears at the end of this digest.
  • Other Highlights
    The experiments show that Dualformer outperforms the baseline models in both performance and computational efficiency. In slow mode, Dualformer optimally solves unseen 30 x 30 maze navigation tasks 97.6% of the time, surpassing the Searchformer baseline performance of 93.3%, while using 45.5% fewer reasoning steps. In fast mode, Dualformer completes those tasks with an 80% optimal rate, significantly outperforming the Solution-Only model, whose optimal rate is only 30%. The proposed technique also improves performance when used for LLM fine-tuning, showing that it generalizes beyond task-specific models.
  • Related Work
    Recent studies have shown that incorporating System 2 processes into Transformers, including large language models (LLMs), significantly enhances their reasoning capabilities. However, models that purely resemble System 2 thinking require substantially higher computational costs and are much slower to respond. Previous works have proposed different approaches to address this challenge, such as Searchformer and Solution-Only models.
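  • Code Sketch
    To make the training-and-inference recipe concrete, below is a minimal, self-contained Python sketch of randomized trace dropping and prompt-based mode selection. It is an illustrative assumption rather than the paper's actual implementation: the clause format, the dropping levels and their sampling weights, and helper names such as drop_trace, build_example, and prompt_for are all hypothetical.

```python
# Minimal sketch of the randomized trace-dropping idea described above.
# NOT the paper's exact recipe: the clause format, dropping levels, and
# sampling probabilities below are illustrative assumptions.
import random

# A toy search trace: each clause mimics an A*-style step (kind, x, y, cost).
TRACE = [
    ("create", 0, 0, 0), ("close", 0, 0, 0),
    ("create", 0, 1, 1), ("create", 1, 0, 1),
    ("close", 1, 0, 1),
]
SOLUTION = ["plan", "(0,0)", "(1,0)", "(1,1)"]   # hypothetical solution tokens


def drop_trace(trace, level, rng):
    """Return a (possibly) shortened trace according to a dropping level.

    level 0: keep the full trace           (pure slow-mode example)
    level 1: drop all 'close' clauses
    level 2: additionally drop cost values
    level 3: additionally drop a random 30% of remaining clauses
    level 4: drop the whole trace          (pure fast-mode example)
    """
    if level == 0:
        return list(trace)
    if level == 4:
        return []
    kept = [c for c in trace if c[0] != "close"]
    if level >= 2:
        kept = [(kind, x, y) for kind, x, y, _ in kept]   # strip cost tokens
    if level >= 3:
        kept = [c for c in kept if rng.random() > 0.3]
    return kept


def build_example(trace, solution, rng):
    """Build one training sequence with a randomly sampled dropping level."""
    level = rng.choices(range(5), weights=[0.2, 0.2, 0.2, 0.2, 0.2])[0]
    trace_tokens = [" ".join(map(str, clause)) for clause in drop_trace(trace, level, rng)]
    return ["bos"] + trace_tokens + solution + ["eos"]


def prompt_for(mode):
    """At inference, the mode is selected by what the model is prompted with:
    fast -> start decoding at the solution ('plan') token,
    slow -> start decoding at the trace, auto -> let the model decide."""
    return {"fast": ["bos", "plan"], "slow": ["bos", "create"], "auto": ["bos"]}[mode]


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(build_example(TRACE, SOLUTION, rng))
    print(prompt_for("fast"))
```

    Sampling a dropping level per training example exposes the model to everything from full traces down to bare solutions, which is what allows one set of weights to serve the fast, slow, and auto modes at inference.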