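- Introduction: Neural embedding models have become a fundamental component of modern information retrieval (IR) pipelines. These models produce a single embedding $x \in \mathbb{R}^d$ per data point, which enables fast retrieval via highly optimized maximum inner product search (MIPS) algorithms. Recently, beginning with the landmark ColBERT paper, multi-vector models, which represent each data point by a set of embeddings, have delivered markedly better performance on IR tasks. However, using these models for IR is computationally expensive because multi-vector retrieval and scoring are more complex. This paper introduces MUVERA (MUlti-VEctor Retrieval Algorithm), a retrieval mechanism that reduces multi-vector similarity search to single-vector similarity search, so that off-the-shelf MIPS solvers can be used for multi-vector retrieval. MUVERA asymmetrically generates Fixed Dimensional Encodings (FDEs) of queries and documents: single vectors whose inner product approximates the multi-vector similarity. The authors prove that FDEs give high-quality $\epsilon$-approximations, which makes them the first single-vector proxy for multi-vector similarity with theoretical guarantees. Empirically, FDEs achieve the same recall as the prior state-of-the-art heuristics while retrieving 2-5× fewer candidates. Compared with the prior state-of-the-art implementation, MUVERA consistently achieves good end-to-end recall and latency across a diverse set of BEIR retrieval datasets, with on average 10% higher recall and 90% lower latency.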
- Problem addressed: MUVERA aims to reduce the computational cost of using multi-vector models for information retrieval (IR) tasks while maintaining high retrieval quality.
- Key idea: MUVERA generates Fixed Dimensional Encodings (FDEs) of queries and documents. Each FDE is a single vector, and the inner product of a query FDE with a document FDE approximates the multi-vector similarity between the two. Multi-vector retrieval can therefore run on off-the-shelf MIPS solvers, which reduces its computational complexity.
- Other highlights: MUVERA achieves the same recall as prior state-of-the-art heuristics while retrieving 2-5× fewer candidates. It consistently achieves good end-to-end recall and latency across a diverse set of BEIR retrieval datasets, with on average 10% higher recall and 90% lower latency than the prior state-of-the-art implementation. The paper also provides theoretical guarantees on the quality of FDEs as a proxy for multi-vector similarity.
- Related work: Prior work in this field includes the ColBERT paper, which introduced multi-vector models for IR tasks, and heuristics for reducing the computational cost of multi-vector retrieval.
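
The key idea above can be illustrated with a small sketch. It assumes that the multi-vector similarity is the ColBERT-style sum of maximum inner products, and it builds a simplified FDE by hashing each vector into a bucket with random hyperplanes, then summing query vectors and averaging document vectors per bucket (one possible way to make the encoding asymmetric). The summary does not specify the construction, so every name and parameter here (`chamfer_similarity`, `make_hyperplanes`, `encode`, `num_bits`) is hypothetical, and the code is a sketch under these assumptions rather than the paper's implementation.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def chamfer_similarity(query_vecs, doc_vecs):
    """Assumed multi-vector similarity: for each query vector, take the
    maximum inner product with any document vector, then sum."""
    return sum(max(dot(q, p) for p in doc_vecs) for q in query_vecs)


def make_hyperplanes(dim, num_bits, seed=0):
    """Random Gaussian hyperplanes shared by queries and documents."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def bucket_id(vec, hyperplanes):
    """Bucket index from the sign pattern of vec against the hyperplanes."""
    bid = 0
    for h in hyperplanes:
        bid = (bid << 1) | (1 if dot(vec, h) > 0 else 0)
    return bid


def encode(vecs, hyperplanes, is_query):
    """Simplified fixed-dimensional encoding (illustrative assumption).

    Each bucket gets one block of `dim` coordinates. Query blocks hold the
    sum of the vectors in the bucket; document blocks hold their average.
    Empty buckets are left as zeros.
    """
    dim = len(vecs[0])
    num_buckets = 1 << len(hyperplanes)
    sums = [[0.0] * dim for _ in range(num_buckets)]
    counts = [0] * num_buckets
    for v in vecs:
        b = bucket_id(v, hyperplanes)
        counts[b] += 1
        for i, x in enumerate(v):
            sums[b][i] += x
    fde = []
    for b in range(num_buckets):
        if is_query or counts[b] == 0:
            fde.extend(sums[b])
        else:
            fde.extend(x / counts[b] for x in sums[b])
    return fde


if __name__ == "__main__":
    rng = random.Random(1)
    dim, num_bits = 8, 3
    query = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(4)]
    doc = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(20)]
    planes = make_hyperplanes(dim, num_bits)
    q_fde = encode(query, planes, is_query=True)
    d_fde = encode(doc, planes, is_query=False)
    print("multi-vector similarity:", chamfer_similarity(query, doc))
    print("FDE inner product:      ", dot(q_fde, d_fde))
```

Both encodings are plain vectors of length `dim * 2**num_bits`, so the document side could be stored in any MIPS index and searched with the query encoding. The sketch leaves out refinements a practical system would likely need, such as handling empty document buckets or combining several independent partitions, so its approximation quality should not be read as representative of the reported results.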