From today's 爱可可AI frontier paper recommendations
[CL] Offsite-Tuning: Transfer Learning without Full Model
G Xiao, J Lin, S Han
[MIT]
Key points:
- Offsite-Tuning is a privacy-preserving and efficient transfer learning framework that adapts large foundation models to downstream data without access to the full model;
- fine-tuning on downstream data is done with a lightweight adapter and a lossy compressed emulator, which preserves both parties' privacy and is computationally more efficient than existing fine-tuning methods;
- the framework reaches accuracy comparable to full-model fine-tuning while remaining privacy-preserving and efficient, with up to 6.5x speedup and up to 5.6x memory reduction;
- Offsite-Tuning enables fine-tuning of models that previously could not be tuned on a single GPU, such as OPT-6.7B and BLOOM-7.1B (see the back-of-envelope sketch below).
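To see why a model like OPT-6.7B is out of reach for full fine-tuning on a single GPU, here is a rough back-of-envelope estimate; the bytes-per-parameter figures are common rules of thumb for mixed-precision Adam training, not numbers from the paper.

```python
# Rough memory estimate for full fine-tuning of a 6.7B-parameter model with
# mixed-precision Adam (rule-of-thumb bytes per parameter, not paper figures).
params = 6.7e9
bytes_per_param = 2 + 2 + 4 + 4 + 4  # fp16 weights + fp16 grads + fp32 master weights + 2 fp32 Adam states
print(f"~{params * bytes_per_param / 1e9:.0f} GB before activations")  # ~107 GB, far beyond a single GPU
```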
One-sentence summary:
Offsite-Tuning is a privacy-preserving and efficient transfer learning framework that adapts large foundation models to downstream data without access to the full model. The model owner sends a lightweight adapter and a lossy compressed emulator to the data owner, who fine-tunes the adapter on the downstream data with the emulator's assistance. The fine-tuned adapter is then returned to the model owner and plugged into the full model, yielding an adapted foundation model for the downstream user.
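As a concrete picture of this round trip, below is a minimal PyTorch sketch. Everything in it is illustrative rather than the paper's exact recipe: the toy 12-layer stack, the choice of first/last layers as the adapter, the uniform layer-drop "compression" (the paper additionally distills the emulator), and the MSE objective standing in for a real downstream task.

```python
import torch
import torch.nn as nn

# Toy stand-in for a large foundation model: a 12-layer transformer stack.
full_layers = nn.ModuleList(
    [nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True) for _ in range(12)]
)

# Model owner: keep the first and last layers as the trainable adapter and build
# a lossy emulator of the frozen middle layers (here: uniform layer dropping).
adapter_head, adapter_tail = full_layers[:2], full_layers[-2:]
middle = full_layers[2:-2]
emulator = nn.ModuleList([middle[i] for i in range(0, len(middle), 2)])
for p in emulator.parameters():
    p.requires_grad = False  # the emulator stays frozen on the data owner's side

def run(x, head, mid, tail):
    """Forward through adapter head -> middle (emulator or full) -> adapter tail."""
    for layer in list(head) + list(mid) + list(tail):
        x = layer(x)
    return x

# Data owner: fine-tune only the adapter on private downstream data, with the
# emulator approximating the missing middle of the full model.
opt = torch.optim.AdamW(
    list(adapter_head.parameters()) + list(adapter_tail.parameters()), lr=1e-4
)
x, target = torch.randn(8, 16, 64), torch.randn(8, 16, 64)  # toy batch and targets
loss = nn.functional.mse_loss(run(x, adapter_head, emulator, adapter_tail), target)
loss.backward()
opt.step()

# Model owner: plug the returned adapter back around the original middle layers
# to obtain the adapted full model; the emulator is only needed during tuning.
with torch.no_grad():
    adapted_output = run(x, adapter_head, middle, adapter_tail)
```

The key property this sketch tries to show: the data owner only ever sees the adapter plus the compressed emulator, the model owner never sees the downstream data, and only the small adapter travels between the two parties.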
Abstract:
Transfer learning is important for foundation models to adapt to downstream tasks. However, many foundation models are proprietary, so users must share their data with model owners to fine-tune the models, which is costly and raises privacy concerns. Moreover, fine-tuning large foundation models is computation-intensive and impractical for most downstream users. In this paper, we propose Offsite-Tuning, a privacy-preserving and efficient transfer learning framework that can adapt billion-parameter foundation models to downstream data without access to the full model. In Offsite-Tuning, the model owner sends a lightweight adapter and a lossy compressed emulator to the data owner, who then fine-tunes the adapter on the downstream data with the emulator's assistance. The fine-tuned adapter is then returned to the model owner, who plugs it into the full model to create an adapted foundation model. Offsite-Tuning preserves both parties' privacy and is computationally more efficient than existing fine-tuning methods that require access to the full model weights. We demonstrate the effectiveness of Offsite-Tuning on various large language and vision foundation models. Offsite-Tuning can achieve accuracy comparable to full model fine-tuning while being privacy-preserving and efficient, achieving a 6.5x speedup and 5.6x memory reduction. Code is available at this https URL.
Paper link: https://arxiv.org/abs/2302.04870