Mixup is a simple yet effective data augmentation method. Since it was proposed in 2018 by researchers at MIT and Facebook, it has gained a firm foothold in both industry and academia and has become a standard tool. Starting from the original paper, this post gives a brief walkthrough of how mixup is used in NLP.

The papers covered are:

1. Mixup

Title: mixup: Beyond Empirical Risk Minimization -- ICLR 2018

Link: https://arxiv.org/pdf/1710.09412.pdf
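The core idea of mixup is to build virtual training examples as convex combinations of pairs of inputs and their one-hot labels, with the mixing ratio drawn from a Beta(α, α) distribution. A minimal NumPy sketch (the function name and the default α = 0.2 are illustrative choices, not taken from the paper's code):

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Vanilla mixup: interpolate two inputs and their one-hot labels
    with a ratio lam ~ Beta(alpha, alpha)."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1 - lam) * x2
    y = lam * y1 + (1 - lam) * y2
    return x, y, lam
```

The mixed label remains a valid probability distribution (it still sums to 1), so the model is trained with the usual cross-entropy loss on the soft target.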

2. wordMixup and senMixup

Title: Augmenting Data with Mixup for Sentence Classification: An Empirical Study -- arXiv 2019
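The two variants differ only in where the interpolation happens: wordMixup interpolates the (padded) word-embedding sequences before the sentence encoder, while senMixup interpolates the encoded sentence vectors. A toy sketch with a stand-in mean-pooling encoder (the paper uses CNN/LSTM encoders; all names here are illustrative):

```python
import numpy as np

def encode(emb_seq):
    # Stand-in sentence encoder (mean pooling over tokens);
    # the paper uses CNN/LSTM encoders instead.
    return emb_seq.mean(axis=0)

def word_mixup(seq_a, seq_b, lam):
    # Interpolate the two padded word-embedding sequences token by token,
    # then encode the mixed sequence.
    return encode(lam * seq_a + (1 - lam) * seq_b)

def sen_mixup(seq_a, seq_b, lam):
    # Encode each sentence first, then interpolate the sentence vectors.
    return lam * encode(seq_a) + (1 - lam) * encode(seq_b)
```

Note that with a linear stand-in like mean pooling the two variants coincide; with a real nonlinear encoder they produce different training signals.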

3. Manifold Mixup

Title: Manifold Mixup: Better Representations by Interpolating Hidden States -- ICML 2019

Link: https://arxiv.org/pdf/1806.05236.pdf

Code: https://github.com/vikasverma1077/manifold_mixup
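Manifold Mixup generalizes mixup from the input to a randomly chosen hidden layer: both examples are run forward to layer k, the hidden states are interpolated there, and the forward pass continues on the mixed state (labels are mixed with the same λ). A minimal sketch, assuming the network is simply a list of layer callables (names are illustrative):

```python
import numpy as np

def manifold_mixup_forward(layers, x1, x2, lam, k):
    """Run both inputs through the first k layers, interpolate the hidden
    states, then continue through the remaining layers. In practice k is
    sampled uniformly per batch, and labels are mixed with the same lam."""
    h1, h2 = x1, x2
    for layer in layers[:k]:
        h1, h2 = layer(h1), layer(h2)
    h = lam * h1 + (1 - lam) * h2   # mix at the chosen hidden layer
    for layer in layers[k:]:
        h = layer(h)
    return h
```

With k = 0 this reduces to vanilla input-space mixup.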

4. Mixup-Transformer

Title: Mixup-Transformer: Dynamic Data Augmentation for NLP Tasks -- COLING 2020

Link: https://arxiv.org/pdf/2010.02394.pdf

5. TMix

Title: MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification -- ACL 2020

Link: https://arxiv.org/pdf/2004.12239.pdf

Code: https://github.com/GT-SALT/MixText

6. SeqMix

Title: SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup -- EMNLP 2020

Link: https://arxiv.org/pdf/2010.02322.pdf

Code: https://github.com/rz-zhang/SeqMix

7. SSMix

Title: SSMix: Saliency-Based Span Mixup for Text Classification -- Findings of ACL 2021

Link: https://arxiv.org/pdf/2106.08062.pdf

Code: https://github.com/clovaai/ssmix
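Unlike the interpolation-based variants above, SSMix edits the discrete input: it replaces the least-salient span of one sentence with the most-salient span of another, and sets the label mixing ratio to the fraction of tokens taken from the second sentence. A toy sketch assuming precomputed per-token saliency scores (in the paper, saliency comes from input gradients; all names here are illustrative):

```python
def ssmix(tokens_a, sal_a, tokens_b, sal_b, span_len):
    """Toy SSMix sketch: swap the least-salient span of A for the most
    salient span of B. Returns the mixed token list and the share of the
    label assigned to B's class."""
    # Least-salient window of length span_len in A.
    best_a = min(range(len(tokens_a) - span_len + 1),
                 key=lambda i: sum(sal_a[i:i + span_len]))
    # Most-salient window of length span_len in B.
    best_b = max(range(len(tokens_b) - span_len + 1),
                 key=lambda i: sum(sal_b[i:i + span_len]))
    mixed = (tokens_a[:best_a]
             + tokens_b[best_b:best_b + span_len]
             + tokens_a[best_a + span_len:])
    lam_b = span_len / len(mixed)  # fraction of tokens that came from B
    return mixed, lam_b
```

Because the mixing happens in token space, the augmented example stays a readable sentence rather than an averaged embedding.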
