Chinese Named Entity Recognition Based on Prompt Learning and Multi-level Feature Fusion
Author: WANG Xin, WEI Chuyuan, ZHANG Lei, WAN Shanshan
Affiliation:

1. School of Mechanical-Electronic and Vehicle Engineering, Beijing University of Civil Engineering and Architecture, Beijing 102616, China; 2. School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 102616, China

Fund Project:

Humanities and Social Sciences Research General Project of the Ministry of Education (22YJAZH110); Beijing Education Science "14th Five-Year Plan" Project (CHAA22061).

Abstract:

Named entity recognition (NER) methods built on the pre-training and fine-tuning paradigm suffer from a gap between the pre-training objective and the fine-tuning task, which makes it difficult to model the relationship between entities and their context. In addition, current Chinese NER methods do not capture sufficient glyph or word-meaning information. To address these problems, this paper proposes an NER method based on prompt learning that fuses multi-level feature information. First, a prompt text is constructed according to the prompt learning mechanism. The character-, word- and entity-level feature information of the input text is then concatenated with the prompt and fed to the pre-trained model, so as to capture the semantic information of the context, narrow the gap between the pre-trained model and the downstream task, and improve the model's sensitivity to named entities. The proposed method makes full use of prior knowledge, improves the quality of model learning, and improves NER performance in the complex and variable semantic environment of Chinese. The F1 values on the People's Daily, MSRA, Weibo, Resume and CMeEE datasets reach 97.09%, 96.68%, 83.44%, 97.48% and 76.05%, respectively. Experimental results show that the proposed method generally outperforms current mainstream Chinese NER methods.
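
The input construction described in the abstract (a prompt text concatenated with character-, word- and entity-level information) might look roughly like the sketch below. This is only an illustration under stated assumptions: the prompt wording, the substring-based lexicon matching, the `[CLS]`/`[SEP]` layout, and the names `PROMPT_TEMPLATE`, `match_words` and `build_input` are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

# Hypothetical prompt wording; the abstract does not give the actual template.
PROMPT_TEMPLATE = "Find the entities of type {types} in the sentence."


def match_words(text: str, lexicon: Sequence[str]) -> List[str]:
    """Return lexicon words found in `text`, ordered by first occurrence.

    Plain substring matching stands in for whatever lexicon matching
    the paper actually uses (an assumption).
    """
    hits = [(text.find(word), word) for word in lexicon if word and word in text]
    return [word for _, word in sorted(hits)]


def build_input(
    text: str,
    lexicon: Sequence[str],
    entity_types: Sequence[str],
    cls: str = "[CLS]",
    sep: str = "[SEP]",
) -> List[str]:
    """Concatenate prompt, character, word and entity-level pieces.

    Assumed layout:
    [CLS] prompt [SEP] characters [SEP] matched words [SEP] entity types [SEP]
    """
    prompt = PROMPT_TEMPLATE.format(types=", ".join(entity_types))
    chars = list(text)                  # character-level features
    words = match_words(text, lexicon)  # word-level features
    types = list(entity_types)          # entity-level features
    return [cls, prompt, sep] + chars + [sep] + words + [sep] + types + [sep]


# Usage with a toy sentence, lexicon and label set (all illustrative).
tokens = build_input(
    "北京建筑大学位于北京",
    ["北京", "大学", "北京建筑大学", "上海"],
    ["LOC", "ORG"],
)
print(tokens)
```

In a real system the resulting sequence would be passed through the pre-trained model's tokenizer; the sketch stops at the token list because the abstract does not specify the encoder or the tagging head.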

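Cite this article

WANG Xin, WEI Chuyuan, ZHANG Lei, WAN Shanshan. Chinese named entity recognition based on prompt learning and multi-level feature fusion[J]. Journal of Data Acquisition and Processing (数据采集与处理), 2024, (4): 1020-1032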

History
  • Received: 2023-03-19
  • Last revised: 2023-08-10
  • Published online: 2024-07-25