来自今天的爱可可AI前沿推介
[LG] Online Low Rank Matrix Completion
(2023)
在线低秩矩阵补全
要点:
-
提出一种在线低秩矩阵补全方法,可利用奖励矩阵的低秩结构; -
采用先探索-再提交(ETC)方法,确保了 O(polylog(M+N)T^(2/3)) 遗憾,可摆脱对 N 的依赖; -
进一步改进了对 T 的依赖,提出了 rank-1 设置的 OCTAL,保证了 O(polylog(M+N)T^(1/2)) 的接近最优的遗憾。
一句话总结:
提出一种称为OCTAL的在线低秩矩阵补全的新算法,通过利用奖励矩阵的低秩结构和基于用户的真实潜表示进行聚类,保证了O(polylog(M+N)T1/2)的近最优遗憾。
摘要:
本文研究基于M个用户、N个项目和T轮的在线低秩矩阵补全问题。在每一轮,算法为每个用户推荐一个项目,会得到一个从低秩用户-项目偏好矩阵采样的(噪声)奖励。本文目标是设计一个具有(T)亚线性遗憾和对M和N近乎最佳依赖的方法。该问题可以很容易地映射到标准的多臂老虎机问题,其中每个项目都是一个独立的臂,但这导致了糟糕的遗憾,因为臂和用户之间的相关性没有被利用。另一方面,由于低秩流形的非凸性,利用奖励矩阵的低秩结构是一个挑战。本文证明低秩结构可以通过一个简单的 “先探索-再承诺”(ETC)方法来利用。每个用户大约只需要polylog(M+N)次项目推荐,就可以得到一个非平凡解。本文对 rank-1 设置的结果进行了改进,其本身就是一个相当大的挑战,并概括了一些关键问题。本文提出OCTAL(迭代用户聚类在线协同过滤),保证了近乎最优的遗憾。 OCTAL基于一种新的用户聚类技术,允许迭代消除项目,可得到近乎最优的最小最大比率。
We study the problem of online low-rank matrix completion with users, items and rounds. In each round, the algorithm recommends one item per user, for which it gets a (noisy) reward sampled from a low-rank user-item preference matrix. The goal is to design a method with sub-linear regret (in ) and nearly optimal dependence on and . The problem can be easily mapped to the standard multi-armed bandit problem where each item is an independent arm, but that leads to poor regret as the correlation between arms and users is not exploited. On the other hand, exploiting the low-rank structure of reward matrix is challenging due to non-convexity of the low-rank manifold. We first demonstrate that the low-rank structure can be exploited using a simple explore-then-commit (ETC) approach that ensures a regret of . That is, roughly only item recommendations are required per user to get a non-trivial solution. We then improve our result for the rank- setting which in itself is quite challenging and encapsulates some of the key issues. Here, we propose OCTAL (Online Collaborative filTering using iterAtive user cLustering) that guarantees nearly optimal regret of . OCTAL is based on a novel technique of clustering users that allows iterative elimination of items and leads to a nearly optimal minimax rate.