Medical Spoken Named Entity Recognition

June 19, 2024
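  • Introduction
    This paper introduces VietMed-NER, the first spoken named entity recognition (NER) dataset in the medical domain. The task is to extract named entities from speech and classify them into types such as person, location, and organization. To the authors' knowledge, this real-world dataset is the largest spoken NER dataset in the world in terms of the number of entity types, with 18 distinct types. The authors also report baseline results using a range of state-of-the-art pre-trained models, both encoder-only and sequence-to-sequence. They find that the pre-trained multilingual model XLM-R outperforms all monolingual models on both reference text and automatic speech recognition (ASR) output, and that, overall, encoders perform better on the NER task. By simply translating the transcripts, the dataset can be applied not only to Vietnamese but to other languages as well. All code, data, and models are publicly available at https://github.com/leduckhai/MultiMed.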
  • Problem addressed
    VietMed-NER is the first spoken NER dataset in the medical domain. The paper aims to extract named entities from speech and categorize them into types like person, location, organization, etc.
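To make the extract-and-categorize step concrete, here is a minimal sketch of how per-token predictions from a tagger can be collapsed into typed entities. It assumes the common BIO tagging scheme; the summary does not say which scheme the paper uses, and the function name `bio_to_spans`, the example sentence, and the `PERSON`/`LOCATION` labels are illustrative, not taken from the dataset.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse per-token BIO tags into (entity_type, entity_text) pairs."""
    spans: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Emit the entity collected so far, if there is one.
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != current_type
        )
        if starts_new:
            # A B- tag, or a stray I- tag of a different type, opens a new entity.
            flush()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-"):
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O" (outside) closes any open entity.
            flush()
            current_type, current_tokens = None, []
    flush()
    return spans


# Hypothetical transcript tokens and tags, for illustration only.
tokens = ["Doctor", "Nam", "works", "in", "Ha", "Noi"]
tags = ["O", "B-PERSON", "O", "O", "B-LOCATION", "I-LOCATION"]
print(bio_to_spans(tokens, tags))
```

In a spoken NER setting the tokens would come from a reference transcript or from ASR output rather than from a hand-written list.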
  • Key idea
    The paper presents baseline results using various state-of-the-art pre-trained models, both encoder-only and sequence-to-sequence. The authors found that the pre-trained multilingual model XLM-R outperformed all monolingual models on both reference text and ASR output. Overall, encoders performed better than sequence-to-sequence models on the NER task.
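Comparing a model on reference text versus ASR output needs a score that does not depend on token positions, since ASR errors shift or change words. The sketch below shows one common way to do this: an F1 score over multisets of (entity type, entity text) pairs. This is an assumption for illustration, not necessarily the metric the paper reports, and the name `entity_f1` and the example entities are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

Entity = Tuple[str, str]  # (entity_type, entity_text)


def entity_f1(reference: List[Entity], predicted: List[Entity]) -> float:
    """F1 over multisets of (type, text) pairs, ignoring token positions."""
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    # Multiset intersection: each reference entity can be matched at most once.
    matched = sum((ref_counts & pred_counts).values())
    if matched == 0:
        return 0.0
    precision = matched / sum(pred_counts.values())
    recall = matched / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold entities and two hypothetical system outputs.
gold = [("PERSON", "Nam"), ("LOCATION", "Ha Noi")]
from_reference_text = [("PERSON", "Nam"), ("LOCATION", "Ha Noi")]
from_asr_output = [("PERSON", "Nam"), ("LOCATION", "Ha Nam")]  # ASR misheard one word

print(entity_f1(gold, from_reference_text))
print(entity_f1(gold, from_asr_output))
```

Under this scoring, a single misrecognized word inside an entity counts as both a missed entity and a spurious one, which is one reason scores on ASR output tend to fall below scores on reference text.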
  • Other highlights
    VietMed-NER is the largest spoken NER dataset in the world in terms of the number of entity types, with 18 distinct types. By simply translating the transcripts, the dataset can be applied not only to Vietnamese but to other languages as well. All code, data, and models are publicly available on GitHub.
  • Related work
    Recent related research includes 'Spoken Language Understanding for Conversational AI: A Review' by Bing Liu and 'End-to-end Spoken Language Understanding' by Bing Liu, Ian Lane, and Alan W. Black.