Chinese Relation Extraction Based on Constituency Representation
Affiliation:

1.School of Computer Science, Fudan University, Shanghai, 201203, China;2.Shanghai Institute of Intelligent Electronics & Systems, Shanghai, 201203, China;3.META SOTA, Shanghai, 200135, China;4.Microsoft Research Asia, Beijing, 100080, China

Fund Project:

National Natural Science Foundation of China (61702107); CERNET Next Generation Internet Technology Innovation Project (NGII20180611).

    Abstract:

    Relation extraction is an important research topic in natural language processing (NLP), and constituency structure is widely regarded as a feature that strongly influences it. In practice, however, constituency features have brought no clear gains on relation extraction. There are two main reasons. First, constituency parsers generalize poorly, so their errors propagate into the relation extraction model and limit the usefulness of the parse. Second, the existing ways of using constituency features are flawed: they either discard the sentence-structure information the parser has learned or amplify the effect of its errors on relation extraction. To address both problems, this paper first improves constituency parsing and then proposes a Chinese relation extraction method based on constituency representation. The method embeds the text representation learned by the constituency parser into the relation extraction model, thereby improving relation extraction performance. The method is validated on a public Chinese relation extraction dataset.

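Cite this article

Liu Nana, Cheng Jing, Min Kerui, Kang Yu, Wang Xin, Zhou Yangfan. Chinese Relation Extraction Based on Constituency Representation[J]. Journal of Data Acquisition and Processing, 2020, 35(3): 449-457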

History
  • Received: 2019-08-20
  • Revised: 2019-12-09
  • Accepted:
  • Published online: 2020-05-25