LG - Machine Learning | CV - Computer Vision | CL - Computation and Language | AS - Audio and Speech | RO - Robotics

Reposted from 爱可可爱生活


1、[LG] GenéLive! Generating Rhythm Actions in Love Live!

A Takada, D Yamazaki, L Liu, Y Yoshida, N Ganbat, T Shimotomai, T Yamamoto, D Sakurai, N Hamada

[KLab Inc & Kyushu University]

Chart generation for the Love Live! rhythm-action games. In a rhythm action game, the player must issue commands at the right timings during a music session; those timings are rendered in a chart of visual symbols, called notes, flying across the screen. KLab Inc., a Japan-based video game developer, had been producing charts for its "Love Live!" titles, a hit across Asia and beyond, manually, at considerable cost. This paper describes how KLab applied a deep generative model to synthesize charts, improving the chart production process and cutting the business cost by half. Because existing generative models produce poor-quality charts for easier difficulty modes, the authors introduce a multi-scale model dedicated to rhythm actions that takes beats, among other factors, into account. The model, named GenéLive!, is evaluated on KLab's production datasets as well as open datasets, and closes the quality gap between easier and harder play modes.

A rhythm action game is a music-based video game in which the player is challenged to issue commands at the right timings during a music session. The timings are rendered in the chart, which consists of visual symbols, called notes, flying through the screen. KLab Inc., a Japan-based video game developer, has operated rhythm action games including a title for the “Love Live!” franchise, which became a hit across Asia and beyond. Before this work, the company generated the charts manually, which resulted in a costly business operation. This paper presents how KLab applied a deep generative model for synthesizing charts, and shows how it has improved the chart production process, reducing the business cost by half. Existing generative models generated poor quality charts for easier difficulty modes [3]. We report how we overcame this challenge through a multi-scaling model dedicated to rhythm actions, by considering beats among other things. Our model, named GenéLive!, is evaluated using production datasets at KLab as well as open datasets.
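One factor the paper highlights is respecting beats when placing notes. As a toy illustration (this is not the GenéLive! model; the BPM, timings, and helper functions below are invented for the sketch), predicted note onsets can be snapped to a beat grid derived from the song's tempo:

```python
# Toy sketch of beat-aware note placement for a rhythm-action chart.
# Not the paper's multi-scale model; it only illustrates the idea of
# aligning note onsets to a beat grid, one factor GenéLive! considers.

def beat_grid(bpm, duration_s, subdivisions=2):
    """Times (in seconds) of every beat subdivision up to duration_s."""
    step = 60.0 / bpm / subdivisions
    times, t = [], 0.0
    while t <= duration_s:
        times.append(round(t, 6))
        t += step
    return times

def snap_onsets(onset_times, bpm, duration_s, subdivisions=2):
    """Snap raw predicted onset times to the nearest beat-grid point."""
    grid = beat_grid(bpm, duration_s, subdivisions)
    return [min(grid, key=lambda g: abs(g - t)) for t in onset_times]

raw = [0.48, 1.03, 1.51]  # hypothetical model outputs, in seconds
# At 120 BPM with half-beat subdivisions, grid points are 0.25 s apart.
print(snap_onsets(raw, bpm=120, duration_s=2.0))
```

The actual model predicts notes at multiple temporal scales rather than post-hoc snapping; this sketch only conveys why beat structure matters for chart quality.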


2、[CL] Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?

S Min, X Lyu, A Holtzman, M Artetxe, M Lewis, H Hajishirzi, L Zettlemoyer

[University of Washington & Meta AI]

How do demonstrations work in in-context learning? Large language models (LMs) can learn in context: they perform a new task via inference alone, conditioning on a few input-label pairs (demonstrations) and making predictions for new inputs. Yet little is understood about how the model learns and which aspects of the demonstrations matter for end-task performance. This paper shows that ground-truth demonstrations are in fact not required: randomly replacing the labels in the demonstrations barely hurts performance, consistently across 12 different models including GPT-3. Instead, other aspects of the demonstrations are the key drivers of end-task performance, namely that they provide a few examples of (1) the label space, (2) the distribution of the input text, and (3) the overall format of the sequence. Together, the analysis offers a new way to understand how and why in-context learning works, while opening new questions about how much can be learned from large language models through inference alone.

Large language models (LMs) are able to in-context learn—perform a new task via inference alone by conditioning on a few input-label pairs (demonstrations) and making predictions for new inputs. However, there has been little understanding of how the model learns and which aspects of the demonstrations contribute to end task performance. In this paper, we show that ground truth demonstrations are in fact not required—randomly replacing labels in the demonstrations barely hurts performance, consistently over 12 different models including GPT-3. Instead, we find that other aspects of the demonstrations are the key drivers of end task performance, including the fact that they provide a few examples of (1) the label space, (2) the distribution of the input text, and (3) the overall format of the sequence. Together, our analysis provides a new way of understanding how and why in-context learning works, while opening up new questions about how much can be learned from large language models through inference alone.
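The core manipulation the paper studies is easy to picture: take the same demonstrations and replace their gold labels with random ones before building the prompt. A minimal sketch of that prompt construction (the task, example texts, and label set here are invented for illustration, not the paper's data):

```python
import random

# Sketch of the paper's prompt manipulation: identical demonstrations,
# with gold labels kept or randomly replaced. Task and examples are invented.

LABELS = ["positive", "negative"]
demos = [
    ("a delightful, warm film", "positive"),
    ("tedious and overlong", "negative"),
    ("an instant classic", "positive"),
    ("flat characters, no payoff", "negative"),
]

def build_prompt(demos, query, randomize_labels=False, seed=0):
    """Concatenate demonstrations and the query in a fixed text format."""
    rng = random.Random(seed)
    lines = []
    for text, label in demos:
        if randomize_labels:  # replace the gold label with a random one
            label = rng.choice(LABELS)
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

print(build_prompt(demos, "a moving story", randomize_labels=True))
```

The paper's finding is that an LM conditioned on the randomized version performs nearly as well as on the gold version: the label space, input distribution, and format shown above carry most of the signal.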


3、[CV] Common Limitations of Image Processing Metrics: A Picture Story

A Reinke, M Eisenmann, M D. Tizabi...

[German Cancer Research Center (DKFZ) & King’s College London & McGill University & University of Pennsylvania]

Common limitations of image-processing metrics. While the importance of automatic image analysis is growing at an enormous pace, recent meta-research has revealed major flaws in algorithm validation. Specifically, performance metrics are key to objective, transparent, and comparable performance assessment, yet relatively little attention has been paid to the practical pitfalls of using a particular metric for a given image analysis task. A common mission of several international initiatives is therefore to provide researchers with guidelines and tools for choosing performance metrics in a problem-aware manner. This (dynamically updated) paper illustrates important limitations of performance metrics commonly applied in the field of image analysis, aiming to raise awareness of their common pitfalls and to encourage researchers to reconsider standard workflows.

While the importance of automatic image analysis is increasing at an enormous pace, recent meta-research revealed major flaws with respect to algorithm validation. Specifically, performance metrics are key for objective, transparent and comparative performance assessment, but relatively little attention has been given to the practical pitfalls when using specific metrics for a given image analysis task. A common mission of several international initiatives is therefore to provide researchers with guidelines and tools to choose the performance metrics in a problem-aware manner. This dynamically updated document has the purpose to illustrate important limitations of performance metrics commonly applied in the field of image analysis. The current version is based on a Delphi process on metrics conducted by an international consortium of image analysis experts.
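One widely cited pitfall of this kind is that overlap metrics such as the Dice score penalize the same absolute boundary error far more heavily on small structures than on large ones. A self-contained numeric sketch (invented here, not taken from the paper) makes the effect concrete:

```python
# Illustration of a well-known metric pitfall: the Dice score's sensitivity
# to structure size. The same 1-pixel shift of a predicted mask causes a
# large Dice drop for a small structure and almost none for a large one.

def dice(a, b):
    """Dice coefficient between two sets of (x, y) pixel coordinates."""
    a, b = set(a), set(b)
    return 2 * len(a & b) / (len(a) + len(b))

def square(x0, y0, side):
    """Axis-aligned square mask of the given side length."""
    return {(x, y) for x in range(x0, x0 + side) for y in range(y0, y0 + side)}

# Identical 1-pixel horizontal shift of the prediction:
small_gt, small_pred = square(0, 0, 3), square(1, 0, 3)     # 3x3 structure
large_gt, large_pred = square(0, 0, 30), square(1, 0, 30)   # 30x30 structure

print(f"small structure: {dice(small_gt, small_pred):.3f}")  # large penalty
print(f"large structure: {dice(large_gt, large_pred):.3f}")  # barely affected
```

This is exactly the kind of task-dependent behavior the guidelines target: whether such size sensitivity is acceptable depends on the clinical or scientific question, not on the metric alone.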


4、[LG] Exploring Classic Quantitative Strategies

J Lu

Exploring classic quantitative strategies. The goal of the paper is to debunk and dispel the magic behind black-box quantitative strategies and to build a solid foundation for how and why these techniques work, crystallizing that knowledge by deriving the mathematics behind the strategies from simple intuitions. The tutorial does not shy away from both the formal and informal aspects of quantitative strategies, aiming to give readers a deeper understanding of the techniques as well as of when, how, and why to apply them. The strategies are presented on S&P500 and SH510300 data sets; the test results are only examples of how the methods work, and no claim is made about real market positions.

The goal of this paper is to debunk and dispel the magic behind the black-box quantitative strategies. It aims to build a solid foundation on how and why the techniques work. This manuscript crystallizes this knowledge by deriving from simple intuitions, the mathematics behind the strategies. This tutorial doesn’t shy away from addressing both the formal and informal aspects of quantitative strategies. By doing so, it hopes to provide readers with a deeper understanding of these techniques as well as the when, the how and the why of applying these techniques. The strategies are presented in terms of both S&P500 and SH510300 data sets. However, the results from the tests are just examples of how the methods work; no claim is made on the suggestion of real market positions.
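A representative "classic" strategy of the kind such tutorials derive is the moving-average crossover: go long when a fast moving average crosses above a slow one, and exit (or short) on the reverse cross. A minimal sketch on synthetic prices (the window lengths and price series below are invented for illustration; this is not investment code):

```python
# Toy moving-average crossover, a classic quantitative strategy.
# Synthetic prices; for illustration only, not a real trading system.

def sma(prices, window):
    """Simple moving average; None until the window is full."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out

def crossover_signals(prices, fast=3, slow=5):
    """+1 when the fast SMA crosses above the slow SMA, -1 below, else 0."""
    f, s = sma(prices, fast), sma(prices, slow)
    signals = [0] * len(prices)
    for i in range(1, len(prices)):
        if None in (f[i - 1], s[i - 1]):
            continue  # not enough history yet
        if f[i - 1] <= s[i - 1] and f[i] > s[i]:
            signals[i] = +1   # bullish crossover
        elif f[i - 1] >= s[i - 1] and f[i] < s[i]:
            signals[i] = -1   # bearish crossover
    return signals

prices = [10, 10, 10, 10, 10, 11, 12, 13, 12, 11, 10, 9]
print(crossover_signals(prices))  # buy on the rally, sell on the decline
```

The tutorial's point is that once the rule is written down this plainly, its assumptions (trend persistence, lag of the averages) can be examined mathematically rather than treated as a black box.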


5、[LG] Overcoming a Theoretical Limitation of Self-Attention

D Chiang, P Cholak

[University of Notre Dame]

Overcoming a theoretical limitation of self-attention. Although transformers are remarkably effective for many tasks, there are some surprisingly easy-looking regular languages that they struggle with. Hahn's lemma shows that for languages where acceptance depends on a single input symbol, a transformer's classification decisions become less and less confident as input strings grow longer (that is, cross-entropy approaches 1 bit per string). The paper examines this limitation with two languages: PARITY, the language of bit strings with an odd number of 1s, and FIRST, the language of bit strings starting with a 1, and demonstrates three ways of overcoming it. First, it settles an open question by constructing a transformer that recognizes PARITY with perfect accuracy, and similarly for FIRST. Second, layer normalization brings the cross-entropy of both models arbitrarily close to zero. Third, when a transformer must focus on a single position, as for FIRST, it can fail to generalize to longer strings; a simple remedy fixes this and also improves length generalization in machine translation.

Although transformers are remarkably effective for many tasks, there are some surprisingly easy-looking regular languages that they struggle with. Hahn shows that for languages where acceptance depends on a single input symbol, a transformer’s classification decisions become less and less confident (that is, with cross-entropy approaching 1 bit per string) as input strings get longer and longer. We examine this limitation using two languages: PARITY, the language of bit strings with an odd number of 1s, and FIRST, the language of bit strings starting with a 1. We demonstrate three ways of overcoming the limitation suggested by Hahn’s lemma. First, we settle an open question by constructing a transformer that recognizes PARITY with perfect accuracy, and similarly for FIRST. Second, we use layer normalization to bring the cross-entropy of both models arbitrarily close to zero. Third, when transformers need to focus on a single position, as for FIRST, we find that they can fail to generalize to longer strings; we offer a simple remedy to this problem that also improves length generalization in machine translation.
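The effect behind Hahn's lemma can be seen with a one-line softmax computation (a numeric sketch invented here, not the paper's construction): with bounded attention scores, the softmax weight on the single relevant position, such as the first token for FIRST, is diluted as the input grows, so the model's confidence must decay:

```python
import math

# Numeric sketch of attention dilution for FIRST: position 0 scores a fixed
# bounded amount above all other positions, yet its softmax weight still
# vanishes as the sequence length n grows. Score gap chosen arbitrarily.

def attention_weight_on_first(n, score_gap=5.0):
    """Softmax weight on position 0 when it scores `score_gap` above the
    other n-1 positions (which all score 0)."""
    return math.exp(score_gap) / (math.exp(score_gap) + (n - 1))

for n in (10, 100, 10_000, 1_000_000):
    print(n, round(attention_weight_on_first(n), 4))  # weight shrinks with n
```

The paper's remedies counter exactly this decay: an explicit construction with perfect accuracy, layer normalization to restore confidence, and a fix for the length-generalization failure that the dilution causes in practice.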


A few more papers worth noting:


[LG] Do Feature Attribution Methods Correctly Attribute Features?


Y Zhou, S Booth, M T Ribeiro, J Shah

[MIT & Microsoft Research]


[CL] A New Generation of Perspective API: Efficient Multilingual Character-level Transformers


A Lees, V Q. Tran, Y Tay, J Sorensen, J Gupta, D Metzler, L Vasserman

[Jigsaw & Google Research]


[IR] Matching Papers and Reviewers at Large Conferences


K Leyton-Brown, Mausam, Y Nandwani, H Zarkoob, C Cameron, N Newman, D Raghu

[University of British Columbia & Indian Institute of Technology Delhi]


[CL] Morphology Without Borders: Clause-Level Morphological Annotation


O Goldman, R Tsarfaty

[Bar Ilan University]


If any images in this content raise copyright concerns, please contact us promptly for removal.