Sentence Structure Acquisition Method for Chinese Relation Extraction
Author: Yang Weizhe, Qin Yongbin, Huang Ruizhang, Wang Kai, Cheng Hualing, Tang Ruixue, Cheng Xinyu, Chen Yanping
Affiliation:

1. State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China; 2. College of Computer Science and Technology, Guizhou University, Guiyang 550025, China; 3. Key Laboratory of Intelligent Medical Image Analysis and Precision Diagnosis of Guizhou Province, Guiyang 550025, China; 4. Guizhou Intelligent Human-Computer Interaction Engineering Technology Research Center, Guiyang 550025, China

Fund Project:

National Natural Science Foundation of China, General Joint Fund Key Program (U1836205); National Natural Science Foundation of China, Major Research Plan (91746116); National Natural Science Foundation of China (62066007, 62066008); Major Science and Technology Special Project of Guizhou Province (Qiankehe Major Special Project [2017]3002); Key Project of the Science and Technology Foundation of Guizhou Province (Qiankehe Jichu [2020]1Z055); Youth Science and Technology Talent Growth Project of the Guizhou Provincial Department of Education (Qianjiaohe KY [2017]137); Science and Technology Plan Project of Guizhou Province (Qiankehe Jichu [2018]1082).

    Abstract:

    Neural network models are among the most commonly used techniques in relation extraction. However, existing neural network models seldom consider the structural features between the two entities in a sentence. Based on the characteristics of the relation extraction task, this paper proposes a sentence structure acquisition method built on neural network models. The method adds special markers at the positions of the two entities in a relation instance, so that the neural network model can effectively capture the structural information about the entities in the sentence. To verify the effectiveness of the method, two mainstream neural network models are used in comparative experiments. The results show that the method significantly improves extraction performance on the ACE 2005 Chinese relation extraction dataset, exceeding the compared work by approximately 11 percentage points.
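
The entity-marking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the marker tokens `[E1]`, `[/E1]`, `[E2]`, `[/E2]`, the function name `mark_entities`, the character-level tokenization, and the example sentence are all assumptions made for illustration, and nested or overlapping entities are deliberately not handled.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive

# Hypothetical marker tokens; the symbols used in the paper are not given here.
E1_OPEN, E1_CLOSE = "[E1]", "[/E1]"
E2_OPEN, E2_CLOSE = "[E2]", "[/E2]"


def mark_entities(sentence: str, e1: Span, e2: Span) -> List[str]:
    """Return character tokens with marker tokens wrapped around both entities."""
    for start, end in (e1, e2):
        if not 0 <= start < end <= len(sentence):
            raise ValueError("entity span is empty or out of range")
    if e1[0] < e2[1] and e2[0] < e1[1]:
        raise ValueError("overlapping entity spans are not handled in this sketch")

    # Markers keyed by the character offset they are inserted before.
    opens = {e1[0]: E1_OPEN, e2[0]: E2_OPEN}
    closes = {e1[1]: E1_CLOSE, e2[1]: E2_CLOSE}

    tokens: List[str] = []
    for i, ch in enumerate(sentence):
        if i in closes:          # close the entity that ended just before i
            tokens.append(closes[i])
        if i in opens:           # open the entity that starts at i
            tokens.append(opens[i])
        tokens.append(ch)
    if len(sentence) in closes:  # an entity that ends at the end of the sentence
        tokens.append(closes[len(sentence)])
    return tokens


if __name__ == "__main__":
    # "张三" (a person) and "贵州大学" (an organization) are the two entities.
    print(" ".join(mark_entities("张三在贵州大学工作", (0, 2), (3, 7))))
```

The marked token sequence would then be fed to the encoder (a CNN or BERT in the paper's experiments), with the markers treated as ordinary vocabulary items so that the model can learn where each entity begins and ends.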

    Table 3 Model comparison
    Fig.1 Method of entity marking and separation
    Fig.2 Model overview
    Fig.3 Architecture of entity-marked CNN model
    Fig.4 Architecture of entity-marked BERT model
    Fig.5 Lengths of sentences in ACE 2005 Chinese dataset
    Table 1 Experimental results
    Table 2 Experimental results on main relation types
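Cite this article

Yang Weizhe, Qin Yongbin, Huang Ruizhang, Wang Kai, Cheng Hualing, Tang Ruixue, Cheng Xinyu, Chen Yanping. Sentence Structure Acquisition Method for Chinese Relation Extraction[J]. Journal of Data Acquisition and Processing, 2021, 36(3): 605-620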

History
  • Received: 2020-01-16
  • Last revised: 2020-12-25
  • Published online: 2021-06-16