From today's 爱可可 AI frontier paper picks

[CL] Retrieval as Attention: End-to-end Learning of Retrieval and Reading within a Single Transformer

Z Jiang, L Gao, J Araki, H Ding, Z Wang, J Callan, G Neubig
[CMU & Bosch Research]

Brief: Proposes Retrieval as Attention (ReAtt), a single Transformer model learned end-to-end solely from end-task supervision. It demonstrates competitive retrieval and QA performance and adapts easily to other domains in both supervised and unsupervised settings.

Abstract: Systems for knowledge-intensive tasks such as open-domain question answering (QA) usually consist of two stages: efficient retrieval of relevant documents from a large corpus and detailed reading of the selected documents to generate answers. Retrievers and readers are usually modeled separately, which necessitates a cumbersome implementation and is hard to train and adapt in an end-to-end fashion. In this paper, we revisit this design and eschew the separate architecture and training in favor of a single Transformer that performs Retrieval as Attention (ReAtt), and end-to-end training solely based on supervision from the end QA task. We demonstrate for the first time that a single model trained end-to-end can achieve both competitive retrieval and QA performance, matching or slightly outperforming state-of-the-art separately trained retrievers and readers. Moreover, end-to-end adaptation significantly boosts its performance on out-of-domain datasets in both supervised and unsupervised settings, making our model a simple and adaptable solution for knowledge-intensive tasks. Code and models are available at this https URL.
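To make the "retrieval as attention" idea concrete, below is a minimal PyTorch sketch of using attention logits between question tokens and document tokens as document relevance scores. This is an assumption-laden illustration, not the authors' released code: in ReAtt the scores come from attention heads inside a T5 encoder and are trained end-to-end from the QA loss, whereas the hypothetical helper `retrieval_as_attention_scores` below only shows how raw attention can double as a retrieval score over precomputed hidden states.

```python
import torch

def retrieval_as_attention_scores(query_hidden, doc_hiddens):
    """Score candidate documents with query->document attention logits.

    Illustrative sketch only (hypothetical helper, not ReAtt itself).
    query_hidden: (Lq, d) hidden states of the question tokens
    doc_hiddens:  list of (Ld_i, d) hidden states, one per candidate document
    returns:      (num_docs,) relevance scores, higher = more relevant
    """
    d = query_hidden.size(-1)
    scores = []
    for doc_hidden in doc_hiddens:
        # scaled dot-product attention logits between all token pairs
        att = query_hidden @ doc_hidden.T / d ** 0.5   # (Lq, Ld_i)
        # max over document tokens, mean over query tokens -> one scalar
        scores.append(att.max(dim=-1).values.mean())
    return torch.stack(scores)

# toy usage with random hidden states
torch.manual_seed(0)
q = torch.randn(8, 64)                             # 8 question tokens, d = 64
docs = [torch.randn(n, 64) for n in (50, 120, 30)]
print(retrieval_as_attention_scores(q, docs))      # rank documents by score
```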

Paper link: https://arxiv.org/abs/2212.02027
