From today's 爱可可 AI frontier paper picks

[CL] Retrieval as Attention: End-to-end Learning of Retrieval and Reading within a Single Transformer

Z Jiang, L Gao, J Araki, H Ding, Z Wang, J Callan, G Neubig
[CMU & Bosch Research]

Brief: Proposes Retrieval as Attention (ReAtt), a single Transformer model learned end-to-end solely from end-task supervision. It demonstrates competitive retrieval and QA performance and adapts easily to other domains in both supervised and unsupervised settings.

Abstract: Systems for knowledge-intensive tasks such as open-domain question answering (QA) usually consist of two stages: efficient retrieval of relevant documents from a large corpus and detailed reading of the selected documents to generate answers. Retrievers and readers are usually modeled separately, which necessitates a cumbersome implementation and is hard to train and adapt in an end-to-end fashion. In this paper, we revisit this design and eschew the separate architecture and training in favor of a single Transformer that performs Retrieval as Attention (ReAtt), and end-to-end training solely based on supervision from the end QA task. We demonstrate for the first time that a single model trained end-to-end can achieve both competitive retrieval and QA performance, matching or slightly outperforming state-of-the-art separately trained retrievers and readers. Moreover, end-to-end adaptation significantly boosts its performance on out-of-domain datasets in both supervised and unsupervised settings, making our model a simple and adaptable solution for knowledge-intensive tasks. Code and models are available at this https URL.
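To make the "retrieval as attention" idea concrete, below is a minimal PyTorch sketch of using attention logits between question tokens and document tokens as document relevance scores. This is an assumption-laden illustration, not the authors' released code: in ReAtt the scores come from attention heads inside a T5 encoder and are trained end-to-end from the QA loss, whereas the hypothetical helper `retrieval_as_attention_scores` below only shows how raw attention can double as a retrieval score over precomputed hidden states.

```python
import torch

def retrieval_as_attention_scores(query_hidden, doc_hiddens):
    """Score candidate documents with query->document attention logits.

    Illustrative sketch only (hypothetical helper, not ReAtt itself).
    query_hidden: (Lq, d) hidden states of the question tokens
    doc_hiddens:  list of (Ld_i, d) hidden states, one per candidate document
    returns:      (num_docs,) relevance scores, higher = more relevant
    """
    d = query_hidden.size(-1)
    scores = []
    for doc_hidden in doc_hiddens:
        # scaled dot-product attention logits between all token pairs
        att = query_hidden @ doc_hidden.T / d ** 0.5   # (Lq, Ld_i)
        # max over document tokens, mean over query tokens -> one scalar
        scores.append(att.max(dim=-1).values.mean())
    return torch.stack(scores)

# toy usage with random hidden states
torch.manual_seed(0)
q = torch.randn(8, 64)                             # 8 question tokens, d = 64
docs = [torch.randn(n, 64) for n in (50, 120, 30)]
print(retrieval_as_attention_scores(q, docs))      # rank documents by score
```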

Paper link: https://arxiv.org/abs/2212.02027
