LG - Machine Learning  CV - Computer Vision  CL - Computation and Language  AS - Audio and Speech  RO - Robotics
Reposted from 爱可可爱生活
1、[LG] Partition and Code: learning how to compress graphs
G Bouritsas, A Loukas, N Karalias, M M. Bronstein
[Imperial College London & EPFL]
Partition and Code: learning graph compression. Can machine learning be used to compress graph data? The lack of ordering in graphs poses a major challenge to conventional compression algorithms, limiting their attainable gains and their ability to discover relevant patterns; on the other hand, most graph compression methods rely on domain-specific handcrafted representations and cannot adapt to different underlying graph distributions. This work sets out the principles a lossless graph compression method should follow to approach the entropy storage lower bound. Rather than making rigid assumptions about the graph distribution, the compressor is formulated as a probabilistic model that can be learned from data and generalizes to unseen instances. The proposed "Partition and Code" (PnC) framework has three steps: first, a partitioning algorithm decomposes the graph into subgraphs; these are then mapped to the elements of a small dictionary on which a probability distribution is learned; finally, an entropy encoder translates the representation into bits. All components (partitioning, dictionary, and distribution) are parametric and trainable with gradient descent. The paper theoretically compares the compression quality of several graph encodings and proves that, under mild conditions, PnC achieves compression gains that grow linearly or quadratically with the number of vertices. Empirically, PnC yields clear compression improvements on diverse real-world networks.
Can we use machine learning to compress graph data? The absence of ordering in graphs poses a significant challenge to conventional compression algorithms, limiting their attainable gains as well as their ability to discover relevant patterns. On the other hand, most graph compression approaches rely on domain-dependent handcrafted representations and cannot adapt to different underlying graph distributions. This work aims to establish the necessary principles a lossless graph compression method should follow to approach the entropy storage lower bound. Instead of making rigid assumptions about the graph distribution, we formulate the compressor as a probabilistic model that can be learned from data and generalise to unseen instances. Our “Partition and Code” framework entails three steps: first, a partitioning algorithm decomposes the graph into subgraphs, then these are mapped to the elements of a small dictionary on which we learn a probability distribution, and finally, an entropy encoder translates the representation into bits. All the components (partitioning, dictionary and distribution) are parametric and can be trained with gradient descent. We theoretically compare the compression quality of several graph encodings and prove, under mild conditions, that PnC achieves compression gains that grow either linearly or quadratically with the number of vertices. Empirically, PnC yields significant compression improvements on diverse real-world networks.
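The three-step pipeline (partition → dictionary → entropy code) can be illustrated with a toy sketch. Everything here is a hypothetical stand-in for the learned components: `partition_graph` uses a trivial round-robin blocking instead of a learned partitioner, and `canonical_form` keys subgraphs by edge count instead of a learned dictionary mapping; only the ideal entropy-coder length formula is standard.

```python
import math
from collections import Counter

def partition_graph(edges, num_parts):
    """Toy partitioner: assign vertices to blocks round-robin
    (a stand-in for the learned partitioning in the paper)."""
    nodes = sorted({v for e in edges for v in e})
    block = {v: i % num_parts for i, v in enumerate(nodes)}
    subgraphs = {}
    for u, v in edges:
        if block[u] == block[v]:
            subgraphs.setdefault(block[u], []).append((u, v))
    return list(subgraphs.values())

def canonical_form(subgraph):
    """Map a subgraph to a dictionary key; here simply its edge count."""
    return len(subgraph)

def entropy_code_length(keys):
    """Bits an ideal entropy coder needs under the empirical
    distribution over dictionary elements: -sum c_k log2(c_k / n)."""
    counts = Counter(keys)
    total = sum(counts.values())
    return sum(-c * math.log2(c / total) for c in counts.values())

edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (0, 3)]
parts = partition_graph(edges, num_parts=2)
keys = [canonical_form(sg) for sg in parts]
bits = entropy_code_length(keys)
```

In PnC each of these stages is parametric and trained end-to-end with gradient descent; the sketch only mirrors the data flow, not the learning.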
https://weibo.com/1402400261/L0VCE1GrM
2、[CV] Data Augmentation Can Improve Robustness
S Rebuffi, S Gowal, D A. Calian, F Stimberg, O Wiles, T Mann
[DeepMind]
Improving robustness with data augmentation. Adversarial training suffers from robust overfitting, a phenomenon where robust test accuracy starts to decrease during training. This paper focuses on reducing robust overfitting with common data augmentation schemes. Contrary to previous findings, when combined with model weight averaging, data augmentation can significantly boost robust accuracy; comparing various augmentation techniques shows that spatial composition works best for adversarial training. Evaluated on CIFAR-10, the method delivers large absolute improvements in robust accuracy over the previous state of the art, reaching 60.07% robust accuracy without using any external data.
Adversarial training suffers from robust overfitting, a phenomenon where the robust test accuracy starts to decrease during training. In this paper, we focus on reducing robust overfitting by using common data augmentation schemes. We demonstrate that, contrary to previous findings, when combined with model weight averaging, data augmentation can significantly boost robust accuracy. Furthermore, we compare various data augmentation techniques and observe that spatial composition techniques work best for adversarial training. Finally, we evaluate our approach on CIFAR-10 against ℓ∞ and ℓ2 norm-bounded perturbations of size ε = 8/255 and ε = 128/255, respectively. We show large absolute improvements of +2.93% and +2.16% in robust accuracy compared to previous state-of-the-art methods. In particular, against ℓ∞ norm-bounded perturbations of size ε = 8/255, our model reaches 60.07% robust accuracy without using any external data. We also achieve a significant performance boost with this approach while using other architectures and datasets such as CIFAR-100, SVHN and TINYIMAGENET.
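The model weight averaging that the paper combines with augmentation is commonly realized as an exponential moving average (EMA) of parameters. A minimal framework-agnostic sketch, with parameters as plain dicts of name → list of floats (the class name and representation are illustrative, not the paper's code):

```python
import copy

class WeightAverager:
    """Exponential moving average of model parameters."""

    def __init__(self, params, decay=0.999):
        self.decay = decay
        self.avg = copy.deepcopy(params)  # averaged copy, updated each step

    def update(self, params):
        """avg <- decay * avg + (1 - decay) * current, per parameter."""
        for name, values in params.items():
            avg = self.avg[name]
            for i, v in enumerate(values):
                avg[i] = self.decay * avg[i] + (1.0 - self.decay) * v

params = {"w": [0.0, 0.0]}
ema = WeightAverager(params, decay=0.5)
for step in range(3):
    params["w"] = [1.0, 2.0]  # pretend a training step updated the weights
    ema.update(params)
```

With a decay close to 1, the averaged weights track a slowly moving version of the trained weights, which is the stabilizing effect the paper pairs with data augmentation; at evaluation time one would use `ema.avg` instead of the raw weights.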
https://weibo.com/1402400261/L0VJItS4O
3、[LG] Structure-aware generation of drug-like molecules
P Drotár, A R Jamasb, B Day, C Cangea, P Liò
[University of Cambridge]
Structure-aware generation of drug-like molecules. Structure-based drug design involves finding ligand molecules that exhibit structural and chemical complementarity to protein pockets. Deep generative methods show promise for proposing novel molecules from scratch (de novo design), avoiding exhaustive virtual screening of chemical space, but most generative de novo models fail to incorporate detailed ligand-protein interactions and 3D pocket structures. This paper proposes a novel supervised model that generates molecular graphs jointly with their 3D pose in a discretised molecular space: molecules are built atom by atom inside the pocket, guided by structural information from crystallographic data. Evaluated on a docking benchmark, guided generation improves predicted binding affinity by 8% and drug-likeness scores by 10% over the baseline, and the model proposes molecules whose binding scores exceed those of some known ligands, which could be useful for future wet-lab studies.
Structure-based drug design involves finding ligand molecules that exhibit structural and chemical complementarity to protein pockets. Deep generative methods have shown promise in proposing novel molecules from scratch (de-novo design), avoiding exhaustive virtual screening of chemical space. Most generative de-novo models fail to incorporate detailed ligand-protein interactions and 3D pocket structures. We propose a novel supervised model that generates molecular graphs jointly with 3D pose in a discretised molecular space. Molecules are built atom-by-atom inside pockets, guided by structural information from crystallographic data. We evaluate our model using a docking benchmark and find that guided generation improves predicted binding affinities by 8% and drug-likeness scores by 10% over the baseline. Furthermore, our model proposes molecules with binding scores exceeding some known ligands, which could be useful in future wet-lab studies.
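The atom-by-atom, pocket-guided generation loop can be caricatured as greedy placement on a discretised grid. This is a toy sketch only: `score_candidate` is a hypothetical geometric stand-in for the learned model (which scores atom type and position jointly from crystallographic context), not the paper's method.

```python
def score_candidate(position, pocket_atoms):
    """Hypothetical score: reject steric clashes with the pocket,
    prefer an ideal contact distance of ~2 units."""
    d = min(((position[0] - p[0]) ** 2 +
             (position[1] - p[1]) ** 2 +
             (position[2] - p[2]) ** 2) ** 0.5
            for p in pocket_atoms)
    if d < 1.0:                # too close: steric clash
        return float("-inf")
    return -abs(d - 2.0)       # best score at distance 2.0

def generate_molecule(pocket_atoms, grid, atom_types, n_atoms):
    """Greedy atom-by-atom placement inside a pocket on a discrete grid.
    (A real model would sample type and position from a learned policy.)"""
    molecule = []
    for _ in range(n_atoms):
        used = {pos for _, pos in molecule}
        candidates = [(t, pos) for t in atom_types for pos in grid
                      if pos not in used]
        best = max(candidates, key=lambda c: score_candidate(c[1], pocket_atoms))
        molecule.append(best)
    return molecule

pocket = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
grid = [(float(x), 0.0, 0.0) for x in range(-3, 8)]
mol = generate_molecule(pocket, grid, atom_types=["C", "N", "O"], n_atoms=2)
```

The point of the sketch is the control flow the paper describes: each new atom is chosen conditioned on both the partially built molecule and the fixed pocket structure.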
https://weibo.com/1402400261/L0VNLnom1
4、[LG] Dual Parameterization of Sparse Variational Gaussian Processes
V Adam, P E. Chang, M E Khan, A Solin
[Aalto University & RIKEN Center for AI Project]
Dual parameterization of sparse variational Gaussian processes. Sparse variational Gaussian process (SVGP) methods are a common choice for non-conjugate Gaussian process inference because of their computational benefits. This paper improves their computational efficiency with a dual parameterization in which each data example is assigned dual parameters, similar to the site parameters used in expectation propagation. The dual parameterization speeds up inference with natural gradient descent and provides a tighter evidence lower bound for hyperparameter learning. The approach has the same memory cost as current SVGP methods but is faster and more accurate.
Sparse variational Gaussian process (SVGP) methods are a common choice for non-conjugate Gaussian process inference because of their computational benefits. In this paper, we improve their computational efficiency by using a dual parameterization where each data example is assigned dual parameters, similarly to site parameters used in expectation propagation. Our dual parameterization speeds-up inference using natural gradient descent, and provides a tighter evidence lower bound for hyperparameter learning. The approach has the same memory cost as the current SVGP methods, but it is faster and more accurate.
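For context, the standard SVGP objective that the dual parameterization re-expresses is the evidence lower bound over inducing variables $u$:

```latex
% Standard SVGP evidence lower bound (ELBO) with inducing variables u:
\mathcal{L}(q) \;=\; \sum_{i=1}^{n} \mathbb{E}_{q(f_i)}\!\left[\log p(y_i \mid f_i)\right]
\;-\; \mathrm{KL}\!\left(q(u) \,\|\, p(u)\right)
```

Since each likelihood term involves exactly one data example, the paper's dual parameterization ties a pair of per-example ("site-like") parameters to each term, analogous to expectation propagation, rather than parameterizing $q(u)$ directly by its mean and covariance.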
https://weibo.com/1402400261/L0VSI5F75
5、[CV] Recognizing Vector Graphics without Rasterization
X Jiang, L Liu, C Shan, Y Shen, X Dong, D Li
[Microsoft Research Asia & University of Technology Sydney]
Recognizing vector graphics without rasterization. This paper considers a different data format for images: vector graphics. In contrast to the raster graphics widely used in image recognition, vector graphics can be scaled up or down to any resolution without aliasing or information loss, thanks to the analytic representation of the primitives in the document; they also provide extra structural information on how low-level elements group together to form high-level shapes or structures. These merits have not been fully leveraged by existing methods. To explore this data format, the paper targets the fundamental recognition tasks of object localization and classification, proposing an efficient CNN-free pipeline called YOLaT (You Only Look at Text) that takes the textual document of the vector graphic as input rather than rendering it into pixels (rasterization). YOLaT builds multi-graphs to model the structural and spatial information in vector graphics, and a dual-stream graph neural network detects objects from the graph. Experiments show that by operating directly on vector graphics, YOLaT outperforms raster-graphics-based object detection baselines in both average precision and efficiency.
In this paper, we consider a different data format for images: vector graphics. In contrast to raster graphics which are widely used in image recognition, vector graphics can be scaled up or down into any resolution without aliasing or information loss, due to the analytic representation of the primitives in the document. Furthermore, vector graphics are able to give extra structural information on how low-level elements group together to form high level shapes or structures. These merits of graphic vectors have not been fully leveraged in existing methods. To explore this data format, we target on the fundamental recognition tasks: object localization and classification. We propose an efficient CNN-free pipeline that does not render the graphic into pixels (i.e. rasterization), and takes textual document of the vector graphics as input, called YOLaT (You Only Look at Text). YOLaT builds multi-graphs to model the structural and spatial information in vector graphics, and a dual-stream graph neural network is proposed to detect objects from the graph. Our experiments show that by directly operating on vector graphics, YOLaT outperforms raster-graphic based object detection baselines in terms of both average precision and efficiency.
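The key idea of consuming the vector document as text can be sketched by parsing SVG primitives straight into a graph, with no rasterization. This is a minimal illustration restricted to `<line>` elements; YOLaT's actual multi-graph construction over Bézier curves and its dual-stream GNN are not reproduced here.

```python
import xml.etree.ElementTree as ET

def svg_lines_to_graph(svg_text):
    """Parse <line> primitives from an SVG string and build a graph:
    nodes are distinct endpoints, edges are the drawn segments."""
    root = ET.fromstring(svg_text)
    nodes, edges = {}, []

    def node_id(pt):
        # Deduplicate endpoints so shared corners become a single node.
        if pt not in nodes:
            nodes[pt] = len(nodes)
        return nodes[pt]

    for line in root.iter("{http://www.w3.org/2000/svg}line"):
        p1 = (float(line.get("x1")), float(line.get("y1")))
        p2 = (float(line.get("x2")), float(line.get("y2")))
        edges.append((node_id(p1), node_id(p2)))
    return list(nodes), edges

svg = """<svg xmlns="http://www.w3.org/2000/svg">
  <line x1="0" y1="0" x2="10" y2="0"/>
  <line x1="10" y1="0" x2="10" y2="10"/>
</svg>"""
nodes, edges = svg_lines_to_graph(svg)
```

Because the two segments share the endpoint (10, 0), the graph has three nodes and two edges; a graph neural network can then operate on this structure directly, at any nominal resolution.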
https://weibo.com/1402400261/L0VUZ2Fsv
A few more papers worth noting:
[CV] TermiNeRF: Ray Termination Prediction for Efficient Neural Rendering
M Piala, R Clark
[Imperial College London]
https://weibo.com/1402400261/L0VY9haZD
[LG] Generalization in quantum machine learning from few training data
M C. Caro, H Huang, M. Cerezo, K Sharma, A Sornborger, L Cincio, P J. Coles
[Technical University of Munich & Institute for Quantum Information and Matter...]
https://weibo.com/1402400261/L0W06aOER
[RO] Learning Perceptual Concepts by Bootstrapping from Human Queries
A Bobu, C Paxton, W Yang, B Sundaralingam, Y Chao, M Cakmak, D Fox
[UC Berkeley & NVIDIA Robotics]
https://weibo.com/1402400261/L0W1HDUks
[CL] Speaker Generation
D Stanton, M Shannon, S Mariooryad, R Skerry-Ryan, E Battenberg, T Bagby, D Kao
[Google Research]
https://weibo.com/1402400261/L0W3fbbRs