From today's 爱可可AI frontier paper recommendations

[CL] Learned Incremental Representations for Parsing

N Kitaev, T Lu, D Klein
[UC Berkeley]


Key points:

  1. Proposes an incremental syntactic representation that assigns a single discrete label to each word in a sentence;
  2. The learned representations reach high F1 on the Penn Treebank with as few as 5 bits per word;
  3. Analysis of the system deepens understanding of incremental parsing and sequential decision-making.

Abstract:

We present an incremental syntactic representation that consists of assigning a single discrete label to each word in a sentence, where the label is predicted using strictly incremental processing of a prefix of the sentence, and the sequence of labels for a sentence fully determines a parse tree. Our goal is to induce a syntactic representation that commits to syntactic choices only as they are incrementally revealed by the input, in contrast with standard representations that must make output choices such as attachments speculatively and later throw out conflicting analyses. Our learned representations achieve 93.72 F1 on the Penn Treebank with as few as 5 bits per word, and at 8 bits per word they achieve 94.97 F1, which is comparable with other state of the art parsing models when using the same pre-trained embeddings. We also provide an analysis of the representations learned by our system, investigating properties such as the interpretable syntactic features captured by the system and mechanisms for deferred resolution of syntactic ambiguities.
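The abstract describes the method's interface more than its internals: each word receives one discrete label drawn from a small inventory (5 bits per word allows at most 2^5 = 32 labels; 8 bits allows 256), the label at position t is predicted from the sentence prefix up to t only, and the finished label sequence is decoded into a parse tree. The sketch below is a minimal illustration of that interface only; the hash-based predictor and the flat-bracketing decoder are hypothetical stand-ins, not the authors' learned model.

from typing import List

NUM_BITS = 5
NUM_LABELS = 2 ** NUM_BITS  # 5 bits per word -> at most 32 distinct labels

def predict_label(prefix: List[str]) -> int:
    # Stand-in for the learned incremental labeler: the real system conditions
    # a neural model on the sentence prefix; a hash keeps this sketch runnable.
    return hash(tuple(prefix)) % NUM_LABELS

def label_incrementally(words: List[str]) -> List[int]:
    # Strictly incremental: the label at position t sees only words[:t+1].
    return [predict_label(words[: t + 1]) for t in range(len(words))]

def decode_tree(words: List[str], labels: List[int]) -> str:
    # Placeholder read-out. In the paper, the label sequence fully determines
    # the parse tree via a learned decoder; this stub emits a flat bracketing
    # so the example runs end to end.
    leaves = " ".join(f"(X{lab} {w})" for w, lab in zip(words, labels))
    return f"(S {leaves})"

if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()
    labels = label_incrementally(sentence)
    print(labels)                            # one 5-bit label per word
    print(decode_tree(sentence, labels))

The point of the sketch is only the information flow: prediction never looks to the right of the current word, so any ambiguity must be carried inside the label itself until later words resolve it, which is the deferred-resolution behavior the paper analyzes.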

Paper link: https://aclanthology.org/2022.acl-long.220/
