Controlling Language and Diffusion Models by Transporting Activations

Large generative models are increasingly powerful and widely deployed in production applications, yet precise control over their outputs remains challenging. Fine-grained control is crucial for meeting user expectations, ensuring reliability, and mitigating the risk of misuse. Apple's machine learning researchers have introduced Activation Transport (AcT), a modality-agnostic technique that offers fine-grained control over model behavior with minimal computational overhead and negligible impact on model capabilities. AcT uses optimal transport theory to steer activations and generalizes many prior activation-steering methods, so that generative model outputs can be aligned with user expectations while preserving safety and performance. The research will be featured as a Spotlight at ICLR 2025, and the code is publicly available.
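To make the idea of steering activations with optimal transport more concrete, here is a minimal sketch. It is not the released AcT code. It assumes each unit is treated independently, that the map is restricted to an affine form, and that equal numbers of source and target activation samples are available. The names `fit_affine_transport`, `steer`, and the `strength` parameter are hypothetical. The sketch relies on a standard fact: in one dimension, the optimal transport map pairs samples by rank, so sorting both sets and fitting a line through the pairs approximates that map.

```python
import numpy as np

def fit_affine_transport(source, target):
    """Fit one affine map a -> w * a + b per unit (illustrative assumption).

    source, target: arrays of shape (n_samples, n_units) with equal n_samples.
    In 1D, optimal transport matches samples by rank, so we sort each unit's
    samples and fit a least-squares line through the sorted pairs.
    """
    src = np.sort(source, axis=0)
    tgt = np.sort(target, axis=0)
    src_mean = src.mean(axis=0)
    tgt_mean = tgt.mean(axis=0)
    src_c = src - src_mean
    tgt_c = tgt - tgt_mean
    # Closed-form least-squares slope and intercept, computed per unit.
    var = (src_c ** 2).sum(axis=0)
    w = (src_c * tgt_c).sum(axis=0) / np.maximum(var, 1e-12)
    b = tgt_mean - w * src_mean
    return w, b

def steer(activations, w, b, strength=1.0):
    """Interpolate between the original and the transported activations.

    strength is a hypothetical knob: 0 leaves activations unchanged,
    1 applies the full fitted map.
    """
    transported = w * activations + b
    return (1.0 - strength) * activations + strength * transported

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source = rng.normal(size=(500, 4))
    # Synthetic target: a scaled and shifted copy of the source, in a different order.
    target = 2.0 * source[::-1] + 3.0
    w, b = fit_affine_transport(source, target)
    print(w, b)
```

In this sketch, intermediate values of `strength` move activations only part of the way toward the target distribution, which is one simple way to expose a control knob over the intervention.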
