LG - Machine Learning   CV - Computer Vision   CL - Computation and Language   AS - Audio and Speech   RO - Robotics




1、[LG] Learning to Learn with Generative Models of Neural Network Checkpoints

W Peebles, I Radosavovic, T Brooks, A A. Efros, J Malik
[UC Berkeley]

We explore a data-driven approach for learning to optimize neural networks. We construct a dataset of neural network checkpoints and train a generative model on the parameters. In particular, our model is a conditional diffusion transformer that, given an initial input parameter vector and a prompted loss, error, or return, predicts the distribution over parameter updates that achieve the desired metric. At test time, it can optimize neural networks with unseen parameters for downstream tasks in just one update. We find that our approach successfully generates parameters for a wide range of loss prompts. Moreover, it can sample multimodal parameter solutions and has favorable scaling properties. We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.



2、[CL] News Summarization and Evaluation in the Era of GPT-3

T Goyal, J J Li, G Durrett
[The University of Texas at Austin]

The recent success of zero- and few-shot prompting with models like GPT-3 has led to a paradigm shift in NLP research. In this paper, we study its impact on text summarization, focusing on the classic benchmark domain of news summarization. First, we investigate how zero-shot GPT-3 compares against finetuned models trained on large summarization datasets. We show that not only do humans overwhelmingly prefer GPT-3 summaries, but these also do not suffer from common dataset-specific issues such as poor factuality. Next, we study what this means for evaluation, particularly the role of gold standard test sets. Our experiments show that both reference-based and reference-free automatic metrics, e.g. recently proposed QA- or entailment-based factuality approaches, cannot reliably evaluate zero-shot summaries. Finally, we discuss future research challenges beyond generic summarization, specifically, keyword- and aspect-based summarization, showing how dominant fine-tuning approaches compare to zero-shot prompting. To support further research, we release: (a) a corpus of 10K generated summaries from fine-tuned and zero-shot models across 4 standard summarization benchmarks, (b) 1K human preference judgments and rationales comparing different systems for generic and keyword-based summarization.



3、[LG] Pre-training via Denoising for Molecular Property Prediction

S Zaidi, M Schaarschmidt, J Martens, H Kim, Y W Teh, A Sanchez-Gonzalez, P Battaglia, R Pascanu, J Godwin
[University of Oxford & DeepMind]

Many important problems involving molecular property prediction from 3D structures have limited data, posing a generalization challenge for neural networks. In this paper, we describe a pre-training technique that utilizes large datasets of 3D molecular structures at equilibrium to learn meaningful representations for downstream tasks. Inspired by recent advances in noise regularization, our pre-training objective is based on denoising. Relying on the well-known link between denoising autoencoders and score-matching, we also show that the objective corresponds to learning a molecular force field – arising from approximating the physical state distribution with a mixture of Gaussians – directly from equilibrium structures. Our experiments demonstrate that using this pre-training objective significantly improves performance on multiple benchmarks, achieving a new state-of-the-art on the majority of targets in the widely used QM9 dataset. Our analysis then provides practical insights into the effects of different factors – dataset sizes, model size and architecture, and the choice of upstream and downstream datasets – on pre-training.
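The denoising objective described in this abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the noise scale `sigma`, and the `predict_noise` callable (a stand-in for a graph neural network) are assumptions made for illustration. The sketch perturbs equilibrium atom positions with Gaussian noise, scores a model on how well it recovers that noise, and shows the score of the Gaussian perturbation kernel, which is the quantity the denoising/score-matching link refers to.

```python
import numpy as np


def add_noise(positions, sigma, rng):
    """Perturb atom positions with isotropic Gaussian noise of scale sigma."""
    eps = rng.standard_normal(positions.shape)
    return positions + sigma * eps, eps


def denoising_loss(predict_noise, positions, sigma, rng):
    """Mean squared error between predicted and true (unit-variance) noise.

    predict_noise is a hypothetical stand-in for the network being
    pre-trained; it maps noisy positions to a noise estimate.
    """
    noisy, eps = add_noise(positions, sigma, rng)
    pred = predict_noise(noisy)
    return float(np.mean((pred - eps) ** 2))


def gaussian_score(noisy, clean, sigma):
    """Score of the Gaussian kernel N(noisy; clean, sigma^2 I) w.r.t. noisy.

    This equals -eps / sigma, so predicting the noise is equivalent, up to
    a scale factor, to predicting this score.
    """
    return -(noisy - clean) / sigma**2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.standard_normal((5, 3))  # 5 atoms, 3D coordinates (toy data)
    # An untrained "model" that always predicts zero noise.
    print(denoising_loss(lambda x: np.zeros_like(x), clean, 0.1, rng))
```

In this toy setup, a model that recovers the noise exactly drives the loss to zero, while the zero predictor incurs a loss equal to the mean squared noise.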



4、[CL] Entailment Semantics Can Be Extracted from an Ideal Language Model

W Merrill, A Warstadt, T Linzen
[New York University & ETH Zürich]

Language models are often trained on text alone, without additional grounding. There is debate as to how much of natural language semantics can be inferred from such a procedure. We prove that entailment judgments between sentences can be extracted from an ideal language model that has perfectly learned its target distribution, assuming the training sentences are generated by Gricean agents, i.e., agents who follow fundamental principles of communication from the linguistic theory of pragmatics. We also show entailment judgments can be decoded from the predictions of a language model trained on such Gricean data. Our results reveal a pathway for understanding the semantic information encoded in unlabeled linguistic data and a potential framework for extracting semantics from language models.



5、[RO] Advanced Skills by Learning Locomotion and Local Navigation End-to-End

N Rudin, D Hoeller, M Bjelonic, M Hutter
[ETH Zurich]

The common approach for local navigation on challenging environments with legged robots requires path planning, path following and locomotion, which usually requires a locomotion control policy that accurately tracks a commanded velocity. However, by breaking down the navigation problem into these sub-tasks, we limit the robot’s capabilities since the individual tasks do not consider the full solution space. In this work, we propose to solve the complete problem by training an end-to-end policy with deep reinforcement learning. Instead of continuously tracking a precomputed path, the robot needs to reach a target position within a provided time. The task’s success is only evaluated at the end of an episode, meaning that the policy does not need to reach the target as fast as possible. It is free to select its path and the locomotion gait. Training a policy in this way opens up a larger set of possible solutions, which allows the robot to learn more complex behaviors. We compare our approach to velocity tracking and additionally show that the time dependence of the task reward is critical to successfully learn these new behaviors. Finally, we demonstrate the successful deployment of policies on a real quadrupedal robot. The robot is able to cross challenging terrains, which were not possible previously, while using a more energy-efficient gait and achieving a higher success rate. Supplementary videos can be found on the project website: https://sites.google.com/leggedrobotics.com/end-to-end-loco-navigation
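The idea of a time-dependent task reward that is only evaluated near the end of an episode can be sketched as follows. This is an illustrative assumption, not the reward reported by the authors: the function name, the `reward_window` parameter, and the exact distance shaping are hypothetical. The reward is zero for most of the episode, so the policy is free to choose its path and gait, and becomes positive only in the final window, where it grows as the robot gets closer to the target.

```python
import numpy as np


def end_of_episode_reward(position, target, t, episode_length, reward_window):
    """Return a position reward only during the last reward_window seconds.

    All names and the shaping term are assumptions for illustration.
    Before the window opens the reward is 0, so there is no incentive to
    reach the target as fast as possible.
    """
    if t <= episode_length - reward_window:
        return 0.0
    diff = np.asarray(position, dtype=float) - np.asarray(target, dtype=float)
    dist_sq = float(np.sum(diff**2))
    # Normalized by the window length so the maximum summed reward does not
    # depend on how long the window is.
    return 1.0 / (reward_window * (1.0 + dist_sq))


if __name__ == "__main__":
    target = [1.0, 0.0]
    for t in (1.0, 4.5, 5.5):
        print(t, end_of_episode_reward([0.5, 0.0], target, t, 6.0, 2.0))
```

By contrast, a velocity-tracking reward would be paid at every step, which ties the policy to a commanded velocity throughout the episode.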





[LG] Unsupervised Model-based Pre-training for Data-efficient Control from Pixels

S Rajeswar, P Mazzaglia, T Verbelen, A Piché, B Dhoedt, A Courville, A Lacoste
[Mila & Ghent University & ServiceNow Research] https://arxiv.org/abs/2209.12016


[CV] FastStamp: Accelerating Neural Steganography and Digital Watermarking of Images on FPGAs

S Hussain, N Sheybani, P Neekhara, X Zhang, J Duarte, F Koushanfar
[UC San Diego]


[LG] In-context Learning and Induction Heads

C Olsson, N Elhage, N Nanda, N Joseph…


[LG] Emergence in artificial life

C Gershenson
[Universidad Nacional Autonoma de Mexico] https://arxiv.org/abs/2105.03216


