爱可可 AI Frontier Picks (10.17)

LG - Machine Learning CV - Computer Vision CL - Computation and Language AS - Audio and Speech RO - Robotics

Reposted from: 爱可可爱生活

1. [LG] Evaluating Predictive Distributions: Does Bayesian Deep Learning Work?

I Osband, Z Wen, S M Asghari, V Dwaracherla, B Hao, M Ibrahimi, D Lawson, X Lu, B O'Donoghue, B V Roy

[DeepMind]

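The posterior predictive distribution quantifies uncertainty that point estimates ignore. This paper introduces the Neural Testbed, a set of tools for systematically evaluating agents that produce such predictions. The tools assess not only the quality of the marginal prediction for each input but also the joint prediction across many inputs. Joint distributions are often the key to useful uncertainty quantification, yet the Bayesian deep learning community has largely overlooked them. The paper benchmarks several uncertainty estimation methods on a neural-network-based data generating process, and the results show why evaluation must go beyond marginal predictions. They also reconcile several sources of confusion in the field: why Bayesian deep learning methods with accurate marginal predictions perform poorly in sequential decision tasks, how incorporating priors can help, and what roles epistemic and aleatoric uncertainty play in evaluating performance. Experiments on real-world challenge datasets correlate highly with the testbed results and show that the importance of evaluating joint predictive distributions carries over to real data.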

Posterior predictive distributions quantify uncertainties ignored by point estimates. This paper introduces The Neural Testbed, which provides tools for the systematic evaluation of agents that generate such predictions. Crucially, these tools assess not only the quality of marginal predictions per input, but also joint predictions given many inputs. Joint distributions are often critical for useful uncertainty quantification, but they have been largely overlooked by the Bayesian deep learning community. We benchmark several approaches to uncertainty estimation using a neural-network-based data generating process. Our results reveal the importance of evaluation beyond marginal predictions. Further, they reconcile sources of confusion in the field, such as why Bayesian deep learning approaches that generate accurate marginal predictions perform poorly in sequential decision tasks, how incorporating priors can be helpful, and what roles epistemic versus aleatoric uncertainty play when evaluating performance. We also present experiments on real-world challenge datasets, which show a high correlation with testbed results, and that the importance of evaluating joint predictive distributions carries over to real data. As part of this effort, we open-source The Neural Testbed, including all implementations from this paper.
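To make the marginal-versus-joint distinction concrete, here is a toy sketch. It is not the Neural Testbed's API or its metric. It assumes a hypothetical two-member ensemble whose members disagree with each other but are each internally consistent across two binary inputs. Scoring each input separately hides that correlation, while scoring the label vector as a whole rewards it.

```python
import math


def marginal_log_likelihood(member_probs, labels):
    """Sum over inputs of the log ensemble-averaged probability of each label."""
    num_members = len(member_probs)
    total = 0.0
    for i, y in enumerate(labels):
        avg = sum(probs[i][y] for probs in member_probs) / num_members
        total += math.log(avg)
    return total


def joint_log_likelihood(member_probs, labels):
    """Log ensemble-averaged probability of the whole label vector at once."""
    num_members = len(member_probs)
    per_member = []
    for probs in member_probs:
        p = 1.0
        for i, y in enumerate(labels):
            p *= probs[i][y]  # inputs are independent only within one member
        per_member.append(p)
    return math.log(sum(per_member) / num_members)


# Hypothetical ensemble: member_probs[m][i][c] is member m's probability
# that input i has class c. Member A favours class 1 on both inputs,
# member B favours class 0 on both inputs.
member_a = [[0.1, 0.9], [0.1, 0.9]]
member_b = [[0.9, 0.1], [0.9, 0.1]]
ensemble = [member_a, member_b]
labels = [1, 1]

marginal_ll = marginal_log_likelihood(ensemble, labels)  # log(0.5 * 0.5)
joint_ll = joint_log_likelihood(ensemble, labels)        # log(0.41)
print(marginal_ll, joint_ll)
```

Each marginal is an uninformative 0.5, yet the joint prediction assigns probability 0.41 to the consistent label pair instead of 0.25, which is the kind of difference a marginal-only evaluation cannot see.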

https://weibo.com/1402400261/KD5uXFNt6

2. [LG] Nonnegative spatial factorization

F W Townes, B E Engelhardt

[Princeton University & Gladstone Institutes]

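Gaussian processes are widely used for spatial data analysis because of their nonparametric flexibility and their ability to quantify uncertainty, and recently developed scalable approximations have made them applicable to massive datasets. For multivariate outcomes, linear models of coregionalization combine dimension reduction with spatial correlation. Their real-valued latent factors and loadings are hard to interpret, however, because unlike nonnegative models they do not recover a parts-based representation. This paper proposes nonnegative spatial factorization (NSF), a probabilistic model based on Gaussian processes (GPs) that performs spatially aware dimension reduction on count observations and naturally encourages sparsity. NSF is compared with real-valued spatial factorizations such as MEFISTO and with nonspatial dimension reduction methods on simulations and high-dimensional spatial transcriptomics data, and it identifies generalizable spatial patterns of gene expression. Because not all gene expression patterns are spatial, the paper also proposes a hybrid extension of NSF that combines spatial and nonspatial components, which makes it possible to quantify spatial importance for both observations and features.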

Gaussian processes are widely used for the analysis of spatial data due to their nonparametric flexibility and ability to quantify uncertainty, and recently developed scalable approximations have facilitated application to massive datasets. For multivariate outcomes, linear models of coregionalization combine dimension reduction with spatial correlation. However, their real-valued latent factors and loadings are difficult to interpret because, unlike nonnegative models, they do not recover a parts-based representation. We present nonnegative spatial factorization (NSF), a spatially-aware probabilistic dimension reduction model that naturally encourages sparsity. We compare NSF to real-valued spatial factorizations such as MEFISTO (Velten et al., 2020) and nonspatial dimension reduction methods using simulations and high-dimensional spatial transcriptomics data. NSF identifies generalizable spatial patterns of gene expression. Since not all patterns of gene expression are spatial, we also propose a hybrid extension of NSF that combines spatial and nonspatial components, enabling quantification of spatial importance for both observations and features.
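As a rough illustration of what a nonnegative, parts-based factorization of count data looks like, the sketch below runs plain nonspatial NMF with the classic multiplicative updates for the generalized KL (Poisson) objective. This is not NSF: there is no Gaussian process prior and no spatial coordinates, and the count matrix, the rank, and all function names are assumptions made for the example.

```python
import math
import random


def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def generalized_kl(v, wh, eps=1e-12):
    """Generalized KL divergence D(V || WH), the Poisson NLL up to a constant."""
    total = 0.0
    for row_v, row_m in zip(v, wh):
        for x, m in zip(row_v, row_m):
            if x > 0:
                total += x * math.log(x / (m + eps))
            total += m - x
    return total


def nmf_kl(v, rank, steps=200, seed=0, eps=1e-12):
    """Factor V ~ W H with nonnegative W, H via multiplicative KL updates."""
    rng = random.Random(seed)
    n, d = len(v), len(v[0])
    w = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n)]
    h = [[rng.uniform(0.1, 1.0) for _ in range(d)] for _ in range(rank)]
    history = [generalized_kl(v, matmul(w, h))]
    for _ in range(steps):
        # Update H with W fixed.
        wh = matmul(w, h)
        ratio = [[v[i][j] / (wh[i][j] + eps) for j in range(d)] for i in range(n)]
        for k in range(rank):
            col_sum = sum(w[i][k] for i in range(n)) + eps
            for j in range(d):
                h[k][j] *= sum(w[i][k] * ratio[i][j] for i in range(n)) / col_sum
        # Update W with the new H fixed.
        wh = matmul(w, h)
        ratio = [[v[i][j] / (wh[i][j] + eps) for j in range(d)] for i in range(n)]
        for i in range(n):
            for k in range(rank):
                row_sum = sum(h[k][j] for j in range(d)) + eps
                w[i][k] *= sum(h[k][j] * ratio[i][j] for j in range(d)) / row_sum
        history.append(generalized_kl(v, matmul(w, h)))
    return w, h, history


# Hypothetical count matrix built from two non-overlapping "parts".
counts = [[4, 2, 0, 0],
          [8, 4, 0, 0],
          [0, 0, 3, 6],
          [0, 0, 1, 2]]
w, h, history = nmf_kl(counts, rank=2)
print(history[0], history[-1])
```

Because the updates only ever multiply by nonnegative ratios, the factors stay nonnegative without any explicit constraint, and entries that do not contribute to a part shrink toward zero. That sparsity is what makes such factors easier to read than real-valued loadings.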

https://weibo.com/1402400261/KD5AD1oqk

3. [RO] Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World

L Smith, J C Kew, X B Peng, S Ha, J Tan, S Levine

[UC Berkeley & Google Research]

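Legged robots are physically capable of traversing a wide range of challenging environments, but designing controllers robust enough to handle that diversity is a long-standing challenge in robotics. Reinforcement learning offers an appealing way to automate controller design and can produce remarkably robust controllers when trained in a suitable range of environments. It is difficult, however, to anticipate every situation the robot may encounter during deployment and enumerate it at training time. What if, instead of training a controller robust enough for any eventuality, the robot kept learning in whatever environment it finds itself in? Real-world reinforcement learning of this kind raises challenges of efficiency, safety, and autonomy. To address them, the paper proposes a practical robot reinforcement learning system for fine-tuning locomotion policies in the real world. A modest amount of real-world training substantially improves performance during deployment, which lets a real A1 quadrupedal robot autonomously fine-tune multiple locomotion skills in a range of environments, including an outdoor lawn and a variety of indoor terrains.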

Legged robots are physically capable of traversing a wide range of challenging environments, but designing controllers that are sufficiently robust to handle this diversity has been a long-standing challenge in robotics. Reinforcement learning presents an appealing approach for automating the controller design process and has been able to produce remarkably robust controllers when trained in a suitable range of environments. However, it is difficult to predict all likely conditions the robot will encounter during deployment and enumerate them at training-time. What if instead of training controllers that are robust enough to handle any eventuality, we enable the robot to continually learn in any setting it finds itself in? This kind of real-world reinforcement learning poses a number of challenges, including efficiency, safety, and autonomy. To address these challenges, we propose a practical robot reinforcement learning system for fine-tuning locomotion policies in the real world. We demonstrate that a modest amount of real-world training can substantially improve performance during deployment, and this enables a real A1 quadrupedal robot to autonomously fine-tune multiple locomotion skills in a range of environments, including an outdoor lawn and a variety of indoor terrains.

https://weibo.com/1402400261/KD5GhbUgS

4. [LG] Embedded-model flows: Combining the inductive biases of model-free deep learning and explicit probabilistic modeling

G Silvestri, E Fertig, D Moore, L Ambrogioni

[OnePlanet Research Center & Google Research & Donders Institute for Brain]

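Normalizing flows have been very successful as general-purpose density estimators. Many real-world applications, however, require domain-specific knowledge that normalizing flows cannot easily incorporate. The paper proposes embedded-model flows (EMF), which alternate general-purpose transformations with structured layers that embed domain-specific inductive biases. These layers are constructed automatically by converting a user-specified differentiable probabilistic model into an equivalent bijective transformation. Gated structured layers are also introduced; they allow the flow to bypass the parts of the model that fail to capture the statistics of the data. The paper shows that EMFs can induce desirable properties such as multimodality, hierarchical coupling, and continuity, and that they enable high-performance variational inference in which the structure of the prior model is embedded in the variational architecture. In experiments, the approach outperforms state-of-the-art methods on common structured inference problems.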

Normalizing flows have shown great success as general-purpose density estimators. However, many real-world applications require the use of domain-specific knowledge, which normalizing flows cannot readily incorporate. We propose embedded-model flows (EMF), which alternate general-purpose transformations with structured layers that embed domain-specific inductive biases. These layers are automatically constructed by converting user-specified differentiable probabilistic models into equivalent bijective transformations. We also introduce gated structured layers, which allow bypassing the parts of the models that fail to capture the statistics of the data. We demonstrate that EMFs can be used to induce desirable properties such as multimodality, hierarchical coupling and continuity. Furthermore, we show that EMFs enable a high-performance form of variational inference where the structure of the prior model is embedded in the variational architecture. In our experiments, we show that this approach outperforms state-of-the-art methods in common structured inference problems.
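The idea of turning a probabilistic model into a bijection can be seen on a very small case. The sketch below hand-writes the bijection for a Gaussian random walk, x_t = x_{t-1} + scale * eps_t, and recovers the model's log-density through the change-of-variables formula. It is only an illustration under stated assumptions: the paper constructs such layers automatically and adds gating, neither of which is shown, and the function names, the fixed `scale`, and the starting point `x0` are hypothetical.

```python
import math


def random_walk_forward(eps, scale, x0=0.0):
    """Map standard-normal noise to a path: x_t = x_{t-1} + scale * eps_t."""
    path = []
    prev = x0
    for e in eps:
        prev = prev + scale * e
        path.append(prev)
    # The Jacobian is lower triangular with `scale` on the diagonal.
    log_det = len(eps) * math.log(scale)
    return path, log_det


def random_walk_inverse(path, scale, x0=0.0):
    """Recover the noise that generated a path (inverse of the forward map)."""
    eps = []
    prev = x0
    for x in path:
        eps.append((x - prev) / scale)
        prev = x
    return eps


def standard_normal_log_prob(eps):
    """Log-density of independent standard-normal noise."""
    return sum(-0.5 * e * e - 0.5 * math.log(2.0 * math.pi) for e in eps)


def path_log_density(path, scale, x0=0.0):
    """Change of variables: log p(x) = log N(eps; 0, I) - log|det J|."""
    eps = random_walk_inverse(path, scale, x0)
    return standard_normal_log_prob(eps) - len(path) * math.log(scale)


noise = [0.5, -1.0, 0.25]
path, log_det = random_walk_forward(noise, scale=2.0)
recovered = random_walk_inverse(path, scale=2.0)
log_p = path_log_density(path, scale=2.0)
print(path, recovered, log_p)
```

Since the map is invertible with a cheap Jacobian, a layer like this can sit between generic flow layers, so the random-walk structure enters the flow as an inductive bias instead of being learned from scratch.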

https://weibo.com/1402400261/KD5JeqjYE

5. [LG] The Neural MMO Platform for Massively Multiagent Research

J Suarez, Y Du, C Zhu, I Mordatch, P Isola

[MIT & Stanford University & Google Brain]

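Neural MMO is a computationally accessible research platform that combines large agent populations, long time horizons, open-ended tasks, and modular game systems. Existing environments have subsets of these properties, but Neural MMO is the first to combine them all. It is released as free and open source software with active support, ongoing development, documentation, and additional training, logging, and visualization tools to help users adapt to the new setting. Initial baselines on the platform show that agents trained in large populations explore more and learn a progression of skills. The paper poses harder problems, such as many-team cooperation, as open research questions that Neural MMO is well suited to answer. Finally, it discusses the platform's current limitations, potential mitigations, and plans for continued development.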

Neural MMO is a computationally accessible research platform that combines large agent populations, long time horizons, open-ended tasks, and modular game systems. Existing environments feature subsets of these properties, but Neural MMO is the first to combine them all. We present Neural MMO as free and open source software with active support, ongoing development, documentation, and additional training, logging, and visualization tools to help users adapt to this new setting. Initial baselines on the platform demonstrate that agents trained in large populations explore more and learn a progression of skills. We raise other more difficult problems such as many-team cooperation as open research questions which Neural MMO is well-suited to answer. Finally, we discuss current limitations of the platform, potential mitigations, and plans for continued development.

https://weibo.com/1402400261/KD5Ne3Pmr