爱可可 AI Frontier Picks (12.5)

LG - Machine Learning CV - Computer Vision CL - Computation and Language AS - Audio and Speech RO - Robotics

1、[LG] Quantum advantage in learning from experiments

H Huang, M Broughton...

[Caltech & Google Quantum AI & Harvard Society of Fellows & Black Hole Initiative & UC Berkeley & Microsoft Research AI & Johannes Kepler University Linz]

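Quantum technology could transform how experimental data is acquired and processed to learn about the physical world. A setup that transfers data from a physical system into a stable quantum memory and processes it on a quantum computer can have significant advantages over a conventional experiment, in which the system is measured and the outcomes are processed classically. The paper proves that, across a range of tasks, quantum machines can learn from exponentially fewer experiments than conventional approaches require. The exponential advantage holds for predicting properties of physical systems, performing quantum principal component analysis on noisy states, and learning approximate models of physical dynamics. For some tasks the required quantum processing is modest: processing just two copies of the system is enough to learn about many noncommuting observables simultaneously. Experiments with up to 40 superconducting qubits and 1300 quantum gates show that a substantial quantum advantage can be realized on today's relatively noisy quantum processors. The results highlight how quantum technology can enable powerful new strategies for learning about nature.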

Quantum technology has the potential to revolutionize how we acquire and process experimental data to learn about the physical world. An experimental setup that transduces data from a physical system to a stable quantum memory, and processes that data using a quantum computer, could have significant advantages over conventional experiments in which the physical system is measured and the outcomes are processed using a classical computer. We prove that, in various tasks, quantum machines can learn from exponentially fewer experiments than those required in conventional experiments. The exponential advantage holds in predicting properties of physical systems, performing quantum principal component analysis on noisy states, and learning approximate models of physical dynamics. In some tasks, the quantum processing needed to achieve the exponential advantage can be modest; for example, one can simultaneously learn about many noncommuting observables by processing only two copies of the system. Conducting experiments with up to 40 superconducting qubits and 1300 quantum gates, we demonstrate that a substantial quantum advantage can be realized using today's relatively noisy quantum processors. Our results highlight how quantum technology can enable powerful new strategies to learn about nature.

https://weibo.com/1402400261/L4A7SE0Wa

2、[CL] A General Language Assistant as a Laboratory for Alignment

A Askell, Y Bai, A Chen, D Drain, D Ganguli, T Henighan, A Jones, N Joseph, B Mann, N DasSarma, N Elhage, Z Hatfield-Dodds, D Hernandez, J Kernion, K Ndousse, C Olsson, D Amodei, T Brown, J Clark, S McCandlish, C Olah, J Kaplan

[Anthropic]

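Given the broad capabilities of large language models, it should be possible to work towards a general-purpose, text-based assistant that is aligned with human values, in the sense of being helpful, honest, and harmless. As a first step in this direction, the paper studies simple baseline techniques and evaluations, such as prompting. The benefits of such modest interventions grow with model size, generalize across a variety of alignment evaluations, and do not compromise the performance of large models. The paper then examines scaling trends for several alignment-relevant training objectives, comparing imitation learning, binary discrimination, and ranked preference modeling. Ranked preference modeling performs much better than imitation learning and often scales more favorably with model size, whereas binary discrimination typically performs and scales very similarly to imitation learning. Finally, the paper studies a "preference model pre-training" stage of training, aimed at improving sample efficiency when finetuning on human preferences.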

Given the broad capabilities of large language models, it should be possible to work towards a general-purpose, text-based assistant that is aligned with human values, meaning that it is helpful, honest, and harmless. As an initial foray in this direction we study simple baseline techniques and evaluations, such as prompting. We find that the benefits from modest interventions increase with model size, generalize to a variety of alignment evaluations, and do not compromise the performance of large models. Next we investigate scaling trends for several training objectives relevant to alignment, comparing imitation learning, binary discrimination, and ranked preference modeling. We find that ranked preference modeling performs much better than imitation learning, and often scales more favorably with model size. In contrast, binary discrimination typically performs and scales very similarly to imitation learning. Finally we study a ‘preference model pre-training’ stage of training, with the goal of improving sample efficiency when finetuning on human preferences.
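To make the ranked preference modeling objective more concrete, here is a minimal sketch of a pairwise ranking loss. It assumes the preference model assigns a scalar score to each response and is trained so that the human-preferred response scores higher. The function names and the specific loss form, log(1 + exp(r_rejected - r_preferred)), are illustrative assumptions and are not stated in the abstract.

```python
import math
from typing import Iterable, Tuple


def pairwise_preference_loss(score_preferred: float, score_rejected: float) -> float:
    """Pairwise ranking loss log(1 + exp(rejected - preferred)).

    The loss is small when the preferred response scores well above the
    rejected one, and grows roughly linearly when the ordering is wrong.
    """
    margin = score_preferred - score_rejected
    # Numerically stable softplus(-margin): never exponentiate a large positive number.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def mean_preference_loss(pairs: Iterable[Tuple[float, float]]) -> float:
    """Average the pairwise loss over (score_preferred, score_rejected) pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one comparison pair")
    return sum(pairwise_preference_loss(p, r) for p, r in pairs) / len(pairs)
```

With equal scores the loss is log 2, since the model has no preference either way. A binary discriminator would instead classify each response as good or bad in isolation, whereas this loss depends only on the score difference within a pair.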

https://weibo.com/1402400261/L4AbCgO9P

3、[CV] SEAL: Self-supervised Embodied Active Learning using Exploration and 3D Consistency

D S Chaplot, M Dalal, S Gupta, J Malik, R Salakhutdinov

[Facebook AI Research & CMU & UIUC]

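The paper explores how to build on the data and models of Internet images and adapt them to robot vision without any additional labels. It proposes the Self-supervised Embodied Active Learning (SEAL) framework, which uses perception models trained on Internet images to learn an active exploration policy. The observations collected by this policy are labelled using 3D consistency and used to improve the perception model. 3D semantic maps are built and used to learn both action and perception in a fully self-supervised way: the semantic map is used to compute an intrinsic motivation reward for training the exploration policy, and to label the agent's observations via spatio-temporal 3D consistency and label propagation. The paper shows that SEAL can close the action-perception loop: simply moving around in training environments improves the object detection and instance segmentation performance of a pretrained perception model, and the improved perception model can in turn be used to improve Object Goal Navigation.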

In this paper, we explore how we can build upon the data and models of Internet images and use them to adapt to robot vision without requiring any extra labels. We present a framework called Self-supervised Embodied Active Learning (SEAL). It utilizes perception models trained on internet images to learn an active exploration policy. The observations gathered by this exploration policy are labelled using 3D consistency and used to improve the perception model. We build and utilize 3D semantic maps to learn both action and perception in a completely self-supervised manner. The semantic map is used to compute an intrinsic motivation reward for training the exploration policy and for labelling the agent observations using spatio-temporal 3D consistency and label propagation. We demonstrate that the SEAL framework can be used to close the action-perception loop: it improves object detection and instance segmentation performance of a pretrained perception model by just moving around in training environments and the improved perception model can be used to improve Object Goal Navigation.
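The idea of labelling observations through 3D consistency can be sketched as a voting scheme. The sketch below assumes each observation has already been projected to a cell of a 3D map and carries a predicted label with a confidence; the voxel keys, the confidence-weighted vote, and both function names are hypothetical simplifications, not the procedure described in the paper.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Tuple

# (voxel_id, predicted_label, confidence) for one observed point or region.
Observation = Tuple[Hashable, str, float]


def aggregate_voxel_labels(observations: List[Observation]) -> Dict[Hashable, str]:
    """Pick one label per voxel by confidence-weighted voting across all views."""
    votes: Dict[Hashable, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for voxel_id, label, confidence in observations:
        votes[voxel_id][label] += confidence
    # Ties resolve to the label that was seen first for that voxel.
    return {voxel: max(scores, key=scores.get) for voxel, scores in votes.items()}


def propagate_labels(observations: List[Observation]) -> List[str]:
    """Relabel every observation with the consensus label of its voxel."""
    consensus = aggregate_voxel_labels(observations)
    return [consensus[voxel_id] for voxel_id, _, _ in observations]
```

A view in which the model mistakes a chair for a sofa is overruled when other views of the same voxel agree on "chair", and the corrected labels can then serve as self-supervised training targets.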

https://weibo.com/1402400261/L4AfVn8Ol

4、[LG] Editing a classifier by rewriting its prediction rules

S Santurkar, D Tsipras, M Elango, D Bau, A Torralba, A Madry

[MIT]

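The paper presents a method for modifying a classifier's behavior by directly rewriting its prediction rules, enabling targeted post hoc modifications to vision classifiers. The approach makes it easier for users to encode prior knowledge and preferences while debugging a model, and it fundamentally changes how the model processes a given concept, so an edit can potentially affect behavior beyond the specific classes used to perform it. The method requires virtually no additional data collection and applies to a variety of settings, including adapting a model to new environments and modifying it to ignore spurious features.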

We present a methodology for modifying the behavior of a classifier by directly rewriting its prediction rules. Our approach requires virtually no additional data collection and can be applied to a variety of settings, including adapting a model to new environments, and modifying it to ignore spurious features.
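The abstract does not spell out the editing mechanism, so the following is only a rough illustration of what rewriting a rule inside a model can look like. It treats a single linear layer as a map from a "key" (the representation of a concept) to a "value", and applies the minimum-norm rank-one update that makes a chosen key produce a chosen value. The helper names and the update rule itself are assumptions made for this sketch, not the authors' procedure.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(weights: Matrix, x: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def rank_one_edit(weights: Matrix, key: Vector, target_value: Vector) -> Matrix:
    """Return W' = W + (v - W k) k^T / (k^T k), so that W' k equals v.

    Inputs orthogonal to the key are mapped exactly as before, which is
    what keeps the edit targeted.
    """
    norm_sq = sum(k * k for k in key)
    if norm_sq == 0:
        raise ValueError("key must be a non-zero vector")
    residual = [t - c for t, c in zip(target_value, matvec(weights, key))]
    return [
        [w + r * k / norm_sq for w, k in zip(row, key)]
        for row, r in zip(weights, residual)
    ]
```

No training data is involved in the update: one key and one target value define the edit, which loosely mirrors the claim that such edits need virtually no additional data collection.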

https://weibo.com/1402400261/L4AjlBgJn

5、[SI] Descriptive vs. inferential community detection: pitfalls, myths and half-truths

T P. Peixoto

[Central European University]

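Community detection is one of the most important methodological areas of network science and has attracted considerable attention over the past decades. It concerns the automated division of a network into fundamental building blocks, with the aim of summarizing its large-scale structure. Despite its importance and wide adoption, there is a noticeable gap between what is considered state of the art and the methods actually used in practice across fields. The paper addresses this gap by dividing existing methods according to whether their goal is "descriptive" or "inferential". Descriptive methods look for patterns in networks based on intuitive notions of community structure, whereas inferential methods articulate a precise generative model and attempt to fit it to data. Inferential methods can therefore offer insight into the mechanisms of network formation and separate structure from randomness in a way that is backed by statistical evidence. The paper reviews how using descriptive methods for inferential purposes is riddled with pitfalls and misleading answers, and should in general be avoided. Inferential methods are more typically aligned with clearer scientific questions and yield more robust results, and should in general be preferred. The paper also sets out to dispel some myths and half-truths that are often believed when community detection is applied in practice, in order to improve both the use of these methods and the interpretation of their results.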

Community detection is one of the most important methodological fields of network science, and one which has attracted a significant amount of attention over the past decades. This area deals with the automated division of a network into fundamental building blocks, with the objective of providing a summary of its large-scale structure. Despite its importance and widespread adoption, there is a noticeable gap between what is considered the state-of-the-art and the methods that are actually used in practice in a variety of fields. Here we attempt to address this discrepancy by dividing existing methods according to whether they have a “descriptive” or an “inferential” goal. While descriptive methods find patterns in networks based on intuitive notions of community structure, inferential methods articulate a precise generative model, and attempt to fit it to data. In this way, they are able to provide insights into the mechanisms of network formation, and separate structure from randomness in a manner supported by statistical evidence. We review how employing descriptive methods with inferential aims is riddled with pitfalls and misleading answers, and thus should be in general avoided. We argue that inferential methods are more typically aligned with clearer scientific questions, yield more robust results, and should be in general preferred. We attempt to dispel some myths and half-truths often believed when community detection is employed in practice, in an effort to improve both the use of such methods as well as the interpretation of their results.
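As a concrete picture of a descriptive objective, the sketch below computes the modularity score of a given partition. Modularity is chosen here as an assumed example of a descriptive method, since the abstract does not name one, and the function and argument names are hypothetical. The graph is assumed to be undirected, with each edge listed once.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Tuple


def modularity(edges: List[Tuple[Hashable, Hashable]],
               community: Dict[Hashable, Hashable]) -> float:
    """Modularity Q = sum_c [ e_c / m - (d_c / (2 m))^2 ].

    e_c is the number of edges inside community c, d_c is the total degree
    of its nodes, and m is the number of edges in the graph.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph must contain at least one edge")
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # total degree of the community's nodes
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)
```

The score only says how well a partition matches one intuitive notion of community. By itself it does not test whether the structure could have arisen by chance, which is the kind of question the inferential, generative-model approach is designed to answer.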

https://weibo.com/1402400261/L4Ao6vnni