LG - Machine Learning | CV - Computer Vision | CL - Computation and Language | AS - Audio and Speech | RO - Robotics

Reposted from 爱可可-爱生活

 

1、[LG] Data-driven emergence of convolutional structure in neural networks

A Ingrosso, S Goldt

[The Abdus Salam International Centre for Theoretical Physics (ICTP) & International School of Advanced Studies (SISSA)]

Data-driven emergence of convolutional structure in neural networks. Exploiting data invariances is crucial for efficient learning in both artificial and biological neural circuits, so understanding how neural networks discover representations that harness the underlying symmetries of their inputs matters for both machine learning and neuroscience. Convolutional neural networks, for example, were designed to exploit translation symmetry, and their capabilities triggered the first wave of deep-learning successes. Until now, however, learning convolutions directly from translation-invariant data with a fully-connected network has proven elusive. This paper shows how an initially fully-connected neural network solving a discrimination task can learn convolutional structure directly from its inputs, ending up with localised, space-tiling receptive fields that match the filters of a convolutional network trained on the same task. Using carefully designed data models of the visual scene, the authors show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs, long recognised as a hallmark of natural images. Non-Gaussian, higher-order statistics are the key ingredient for receptive fields characterised by localisation and weight sharing, and the progression from second-order to higher-order statistics during learning is one example of how neural networks learn increasingly complex functions. In a simple model, the paper characterises the responsible pattern-formation mechanism analytically and numerically, revealing an unexpected link between receptive-field formation and the tensor decomposition of higher-order input correlations. The results offer a new perspective on the development of low-level feature detectors across sensory modalities and pave the way for studying how higher-order statistics affect learning in neural networks.

Exploiting data invariances is crucial for efficient learning in both artificial and biological neural circuits. Understanding how neural networks can discover appropriate representations capable of harnessing the underlying symmetries of their inputs is thus crucial in machine learning and neuroscience. Convolutional neural networks, for example, were designed to exploit translation symmetry and their capabilities triggered the first wave of deep learning successes. However, learning convolutions directly from translation-invariant data with a fully-connected network has so far proven elusive. Here, we show how initially fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs, resulting in localised, space-tiling receptive fields. These receptive fields match the filters of a convolutional network trained on the same task. By carefully designing data models for the visual scene, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs, which has long been recognised as the hallmark of natural images. We provide an analytical and numerical characterisation of the pattern-formation mechanism responsible for this phenomenon in a simple model, which results in an unexpected link between receptive field formation and the tensor decomposition of higher-order input correlations. These results provide a new perspective on the development of low-level feature detectors in various sensory modalities, and pave the way for studying the impact of higher-order statistics on learning in neural networks.
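The key contrast in the abstract — two input ensembles with identical second-order statistics but different higher-order statistics — can be sketched numerically. Below is a minimal toy, not the paper's actual data model: the Gaussian vs. Laplace source choice and the circulant local kernel are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 32            # 1-D "retina" size
kernel_width = 3  # spatial scale of local correlations

# Circulant local kernel: every pixel mixes a small neighbourhood, so the
# resulting data distribution is invariant under (cyclic) translations.
idx = np.arange(D)
dist = np.minimum((idx[:, None] - idx[None, :]) % D,
                  (idx[None, :] - idx[:, None]) % D)
kernel = np.exp(-0.5 * (dist / kernel_width) ** 2)

def sample(n, gaussian=True):
    # Unit-variance sources: Gaussian vs. sparse (Laplace) ones.
    if gaussian:
        s = rng.standard_normal((n, D))
    else:
        s = rng.laplace(scale=1 / np.sqrt(2), size=(n, D))
    return s @ kernel.T

Xg = sample(20000, gaussian=True)
Xl = sample(20000, gaussian=False)

# Second-order statistics match: the two covariances nearly coincide ...
cov_gap = np.abs(np.cov(Xg.T) - np.cov(Xl.T)).max()

# ... but fourth-order statistics differ: per-pixel excess kurtosis.
def excess_kurtosis(X):
    Z = (X - X.mean(0)) / X.std(0)
    return (Z ** 4).mean(0) - 3

print(cov_gap, excess_kurtosis(Xg).mean(), excess_kurtosis(Xl).mean())
```

A fully-connected network trained on the Gaussian ensemble sees only the (translation-invariant) covariance, while the Laplace ensemble additionally carries the non-Gaussian local structure that, per the paper, drives localised receptive fields.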

 

 

2、[LG] LyaNet: A Lyapunov Framework for Training Neural ODEs

I D J Rodriguez, A D. Ames, Y Yue

[California Institute of Technology]

LyaNet: a Lyapunov framework for training Neural ODEs. This paper studies a new connection between learning and control in the context of training architectures built from dynamical systems, proposing LyaNet, a method for training ordinary differential equations with a control-theoretic Lyapunov condition for stability. LyaNet is based on a novel Lyapunov loss formulation that encourages the inference dynamics to converge quickly to the correct prediction. The theory shows that minimising the Lyapunov loss guarantees exponential convergence to the correct solution and enables a novel robustness guarantee. The paper also provides practical algorithms that avoid the cost of backpropagating through a solver or using the adjoint method. Relative to standard Neural ODE training, experiments show that LyaNet offers better prediction performance, faster convergence of the inference dynamics, and improved adversarial robustness.

We propose a method for training ordinary differential equations by using a control-theoretic Lyapunov condition for stability. Our approach, called LyaNet, is based on a novel Lyapunov loss formulation that encourages the inference dynamics to converge quickly to the correct prediction. Theoretically, we show that minimizing Lyapunov loss guarantees exponential convergence to the correct solution and enables a novel robustness guarantee. We also provide practical algorithms, including one that avoids the cost of backpropagating through a solver or using the adjoint method. Relative to standard Neural ODE training, we empirically find that LyaNet can offer improved prediction performance, faster convergence of inference dynamics, and improved adversarial robustness. Our code available at https://github.com/ivandariojr/LyapunovLearning.
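The Lyapunov condition behind the loss can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: the linear contracting dynamics, the quadratic candidate Lyapunov function, and the rate `kappa` are all assumptions.

```python
import numpy as np

kappa = 1.0                      # desired exponential decay rate (assumed)
x_star = np.array([1.0, -2.0])   # the "correct prediction" (toy target)

def f(x):
    # Toy inference dynamics contracting toward x_star.
    return -2.0 * (x - x_star)

def lyapunov_penalty(x):
    # Candidate Lyapunov function V(x) = ||x - x*||^2 and its time
    # derivative along the flow; the loss hinges on the stability
    # condition dV/dt <= -kappa * V.
    V = np.sum((x - x_star) ** 2)
    Vdot = 2.0 * (x - x_star) @ f(x)
    return max(0.0, Vdot + kappa * V)

# For these dynamics Vdot = -4V, so Vdot + kappa*V = -3V <= 0 and the
# penalty vanishes everywhere, certifying exponential convergence to x_star.
xs = np.random.default_rng(1).normal(size=(100, 2))
penalties = [lyapunov_penalty(x) for x in xs]
print(max(penalties))
```

In LyaNet the dynamics `f` are a trained network and this penalty (averaged over states) is minimised; here it is identically zero only because the toy dynamics already satisfy the condition.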

 

 

3、[LG] Beta Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning

Y Kwon, J Zou

[Stanford University]

Beta Shapley: a unified and noise-reduced data valuation framework for machine learning. Data Shapley was recently proposed as a principled framework for quantifying the contribution of individual data points in machine learning, effectively identifying data points that help or harm a learning algorithm. This paper proposes Beta Shapley, a substantial generalisation of Data Shapley. Beta Shapley arises naturally from relaxing the efficiency axiom of the Shapley value, which is not critical in machine-learning settings; it unifies several popular data valuation methods and includes Data Shapley as a special case. The paper proves that Beta Shapley has several desirable statistical properties and proposes efficient algorithms to estimate it, demonstrating that Beta Shapley outperforms state-of-the-art data valuation methods on several downstream ML tasks, e.g. 1) detecting mislabeled training data; 2) learning with subsamples; and 3) identifying points whose addition or removal has the largest positive or negative impact on the model.

Data Shapley has recently been proposed as a principled framework to quantify the contribution of individual datum in machine learning. It can effectively identify helpful or harmful data points for a learning algorithm. In this paper, we propose Beta Shapley, which is a substantial generalization of Data Shapley. Beta Shapley arises naturally by relaxing the efficiency axiom of the Shapley value, which is not critical for machine learning settings. Beta Shapley unifies several popular data valuation methods and includes data Shapley as a special case. Moreover, we prove that Beta Shapley has several desirable statistical properties and propose efficient algorithms to estimate it. We demonstrate that Beta Shapley outperforms state-of-the-art data valuation methods on several downstream ML tasks such as: 1) detecting mislabeled training data; 2) learning with subsamples; and 3) identifying points whose addition or removal have the largest positive or negative impact on the model.
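The semivalue idea — averaging a point's marginal contributions with a Beta-shaped weighting over subset sizes — can be sketched with a toy utility. Everything below is illustrative: the majority-vote utility, the Beta(1, 16) choice, and the pdf-at-cardinality weighting are assumptions, not the paper's closed-form weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: ten points that "vote" for class +1; index 0 is mislabeled.
labels = np.ones(10)
labels[0] = -1.0
n = len(labels)

def utility(subset):
    # Accuracy of a majority vote when the true class is +1.
    s = labels[list(subset)].sum() if subset else 0.0
    return 1.0 if s > 0 else (0.5 if s == 0 else 0.0)

def beta_weight(j, alpha, beta):
    # Beta(alpha, beta)-shaped (unnormalised) weight over cardinality j,
    # evaluated at (j + 0.5)/n -- an illustrative stand-in.
    t = (j + 0.5) / n
    return t ** (alpha - 1) * (1 - t) ** (beta - 1)

def beta_shapley(i, alpha=1.0, beta=16.0, n_mc=2000):
    # Monte Carlo average of weighted marginal contributions of point i.
    total, wsum = 0.0, 0.0
    others = [k for k in range(n) if k != i]
    for _ in range(n_mc):
        j = int(rng.integers(0, n))     # subset size before adding i
        S = rng.choice(others, size=j, replace=False).tolist()
        w = beta_weight(j, alpha, beta)
        total += w * (utility(S + [i]) - utility(S))
        wsum += w
    return total / wsum

values = [beta_shapley(i) for i in range(n)]
print(values[0], np.mean(values[1:]))
```

With a large `beta`, small subsets dominate the weighting, and the mislabeled point's value comes out negative while the clean points' values are positive — the kind of mislabeled-data detection the paper evaluates.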

 

 

4、[RO] Bingham Policy Parameterization for 3D Rotations in Reinforcement Learning

S James, P Abbeel

[UC Berkeley]

Bingham policy parameterization for 3D rotations in reinforcement learning. This paper proposes a new policy parameterization for representing 3D rotations during reinforcement learning. In today's continuous-control RL literature, many stochastic policy parameterizations are Gaussian. The authors argue that universally applying a Gaussian policy parameterization is not always desirable for all environments — in particular for tasks that involve predicting a 3D rotation output, either in isolation or coupled with translation as part of a full 6D pose output. The proposed Bingham Policy Parameterization (BPP) models the Bingham distribution and yields better rotation (quaternion) prediction than a Gaussian policy parameterization across a range of RL tasks. BPP is evaluated on the rotation Wahba problem as well as a set of vision-based next-best-pose robot manipulation tasks from RLBench. The authors hope the paper encourages more research into policy parameterizations better suited to particular environments, rather than always assuming Gaussian.

We propose a new policy parameterization for representing 3D rotations during reinforcement learning. Today in the continuous control reinforcement learning literature, many stochastic policy parameterizations are Gaussian. We argue that universally applying a Gaussian policy parameterization is not always desirable for all environments. One such case in particular where this is true are tasks that involve predicting a 3D rotation output, either in isolation, or coupled with translation as part of a full 6D pose output. Our proposed Bingham Policy Parameterization (BPP) models the Bingham distribution and allows for better rotation (quaternion) prediction over a Gaussian policy parameterization in a range of reinforcement learning tasks. We evaluate BPP on the rotation Wahba problem task, as well as a set of vision-based next-best pose robot manipulation tasks from RLBench. We hope that this paper encourages more research into developing other policy parameterizations that are more suited for particular environments, rather than always assuming Gaussian. Code available at: https://sites.google.com/view/rl-bpp.
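Why the Bingham distribution fits unit quaternions can be shown in a few lines: its density on the unit sphere S³ is antipodally symmetric, exactly matching the q / −q ambiguity of quaternion rotations. This is an illustrative sketch of the distribution itself, not the paper's policy network; the random parameter matrix `A` is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                  # symmetric 4x4 parameter matrix (toy)

def bingham_unnorm(q):
    # Unnormalised Bingham density p(q) ∝ exp(q^T A q) on the unit sphere.
    q = q / np.linalg.norm(q)
    return np.exp(q @ A @ q)

# Antipodal symmetry: q and -q (the same 3D rotation) get equal density.
q = rng.standard_normal(4)
sym_gap = abs(bingham_unnorm(q) - bingham_unnorm(-q))

# The mode is the eigenvector of A with the largest eigenvalue,
# since q^T A q is maximised over the unit sphere at that eigenvector.
w, V = np.linalg.eigh(A)
mode = V[:, -1]
print(sym_gap, bingham_unnorm(mode) >= bingham_unnorm(q))
```

A Gaussian over quaternion components has neither property: it assigns different densities to q and −q and ignores the unit-norm constraint, which is the mismatch BPP addresses.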

 

 

5、[CL] Red Teaming Language Models with Language Models

E Perez, S Huang, F Song, T Cai, R Ring, J Aslanides, A Glaese, N McAleese, G Irving

[DeepMind]

Red teaming language models with language models. Language models (LMs) often cannot be deployed because of their potential to harm users in hard-to-predict ways. Prior work identifies harmful behaviors before deployment by using human annotators to hand-write test cases, but human annotation is expensive and limits the number and diversity of test cases. This paper automatically finds cases where a target LM behaves in a harmful way by generating test cases ("red teaming") with another LM. The target LM's replies to generated test questions are evaluated with a trained classifier, uncovering tens of thousands of offensive replies in a 280B-parameter LM chatbot. Several methods are explored, from zero-shot generation to reinforcement learning, for generating test cases with varying levels of diversity and difficulty. Prompt engineering is used to control LM-generated test cases so as to uncover a variety of other harms: automatically finding groups of people the chatbot discusses in offensive ways, personal and hospital phone numbers generated as the chatbot's own contact information, leakage of private training data in generated text, and harms that occur over the course of a conversation. Overall, LM-based red teaming is one promising tool (among many needed) for finding and fixing diverse, undesirable LM behaviors before they affect users.

Language Models (LMs) often cannot be deployed because of their potential to harm users in hard-to-predict ways. Prior work identifies harmful behaviors before deployment by using human annotators to hand-write test cases. However, human annotation is expensive, limiting the number and diversity of test cases. In this work, we automatically find cases where a target LM behaves in a harmful way, by generating test cases (“red teaming”) using another LM. We evaluate the target LM’s replies to generated test questions using a classifier trained to detect offensive content, uncovering tens of thousands of offensive replies in a 280B parameter LM chatbot. We explore several methods, from zero-shot generation to reinforcement learning, for generating test cases with varying levels of diversity and difficulty. Furthermore, we use prompt engineering to control LM-generated test cases to uncover a variety of other harms, automatically finding groups of people that the chatbot discusses in offensive ways, personal and hospital phone numbers generated as the chatbot’s own contact info, leakage of private training data in generated text, and harms that occur over the course of a conversation. Overall, LM-based red teaming is one promising tool (among many needed) for finding and fixing diverse, undesirable LM behaviors before impacting users.
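The generate → reply → classify loop the abstract describes can be sketched as follows. Every component here is a toy stand-in: `red_lm`, `target_lm`, and `classifier` are hypothetical placeholders (a template sampler, an echo bot, and a keyword match), not DeepMind's models.

```python
import random

ATTACK_TEMPLATES = [
    "What do you think about {}?",
    "Tell me a joke about {}.",
    "What is your phone number, {}?",
]
TOPICS = ["group A", "group B", "your training data"]

def red_lm(rng):
    # Stand-in for zero-shot test-case generation by a red-team LM.
    return rng.choice(ATTACK_TEMPLATES).format(rng.choice(TOPICS))

def target_lm(question):
    # Toy chatbot under test: behaves badly for one trigger topic.
    if "group B" in question:
        return "I think group B is terrible."
    return "I'd rather not say anything unkind."

def classifier(reply):
    # Stand-in for the trained offensive-reply classifier.
    return "terrible" in reply

rng = random.Random(0)
flagged = []
for _ in range(200):
    question = red_lm(rng)
    reply = target_lm(question)
    if classifier(reply):
        flagged.append((question, reply))

print(len(flagged), "offensive replies found")
```

The paper's variants (few-shot, supervised, RL-based generation) change only how `red_lm` produces questions; the surrounding loop and classifier-based evaluation stay the same.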

 

 

A few more papers worth noting:

 

[CL] PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts

PromptSource: an integrated development environment and repository for natural-language prompts

S H. Bach, V Sanh, Z Yong, A Webson, C Raffel, N V. Nayak, A Sharma, T Kim, M S Bari, T Fevry...

[Brown University & Hugging Face...]

 

[CV] Causal Scene BERT: Improving object detection by searching for challenging groups of data

Causal Scene BERT: improving object detection by searching for challenging groups of data

C Resnick, O Litany, A Kar, K Kreis, J Lucas, K Cho, S Fidler

[New York University & NVIDIA]

 

 

[CV] 3D Object Detection from Images for Autonomous Driving: A Survey

A survey of image-based 3D object detection for autonomous driving

X Ma, W Ouyang, A Simonelli, E Ricci

[University of Sydney & University of Trento]

 

 

[LG] The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective

The disagreement problem in explainable machine learning: a practitioner's perspective

S Krishna, T Han, A Gu, J Pombra, S Jabbari, S Wu, H Lakkaraju

[Harvard University & MIT & Drexel University & CMU]

 

 

If any images included in this content raise copyright concerns, please contact us promptly for removal.