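Chinese Event Detection with Syntax and Full Text Information Enhancement
Authors: 王红 (Wang Hong), 吴浩正 (Wu Haozheng)
Affiliation:

School of Computer Science & Technology, Civil Aviation University of China, Tianjin 300300, China

Fund Project:

National Natural Science Foundation of China (U1633110); Open Fund of the State Key Laboratory of Air Traffic Management System and Technology (SKLATM201902).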


Abstract:

To address two shortcomings of current Chinese event detection, namely insufficient use of the syntactic dependencies between words and a lack of document-level global semantic information, this paper proposes a Chinese event detection model enhanced with syntactic and full-text information. The model first introduces a graph convolutional network (GCN) that strengthens word feature representations by capturing the dependency relations between words. A bidirectional gated recurrent unit (Bi-GRU) then learns the context within each sentence and across sentences, yielding sentence vectors that carry the global information of the article. Finally, information at three granularities (character, word, and sentence) is fused dynamically through a gate structure, and a conditional random field (CRF) identifies and labels the trigger words in each sentence. Experimental results on the ACE2005 and CEC Chinese datasets show that the proposed method improves Chinese event detection performance.
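Two of the building blocks named in the abstract, graph convolution over a dependency graph and gate-based fusion, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the row-normalised propagation rule, the element-wise gate, the two-input fusion (the paper fuses three granularities), and the names `gcn_layer` and `gate_fuse` are all assumptions chosen for illustration, and the weights are fixed by hand rather than learned.

```python
import math


def gcn_layer(adj, feats, weight):
    """One graph-convolution step, H' = ReLU(D^-1 (A + I) H W).

    adj    : n x n 0/1 matrix of dependency arcs between words (assumed undirected)
    feats  : n x d_in word feature vectors
    weight : d_in x d_out projection matrix (hand-set here, learned in practice)
    """
    n = len(adj)
    d_in = len(feats[0])
    d_out = len(weight[0])
    out = []
    for i in range(n):
        # Neighbours of word i in the dependency graph, plus a self-loop.
        nbrs = [j for j in range(n) if adj[i][j] or i == j]
        # Average the neighbour features (row-normalised adjacency).
        agg = [sum(feats[j][k] for j in nbrs) / len(nbrs) for k in range(d_in)]
        # Linear projection followed by ReLU.
        out.append([
            max(0.0, sum(agg[k] * weight[k][m] for k in range(d_in)))
            for m in range(d_out)
        ])
    return out


def gate_fuse(a, b, w_a, w_b, bias):
    """Element-wise gated fusion of two equal-length vectors.

    g = sigmoid(w_a * a + w_b * b + bias); result = g * a + (1 - g) * b.
    The gate decides, per dimension, how much of each input to keep.
    """
    fused = []
    for x, y, wa, wb, c in zip(a, b, w_a, w_b, bias):
        g = 1.0 / (1.0 + math.exp(-(wa * x + wb * y + c)))
        fused.append(g * x + (1.0 - g) * y)
    return fused


# Toy sentence of three words with dependency arcs 0-1 and 1-2.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

# Syntax-enhanced word vectors after one propagation step.
hidden = gcn_layer(adj, feats, identity)

# Fuse the first word's vector with a hypothetical sentence vector.
sentence_vec = [0.2, 0.8]
fused = gate_fuse(hidden[0], sentence_vec, [0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
```

With all gate parameters set to zero the gate value is 0.5 in every dimension, so the fusion reduces to a plain average of the two inputs. Non-zero parameters would let the mix vary per dimension and per input, which is what makes the fusion dynamic.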

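Cite this article

王红 (Wang Hong), 吴浩正 (Wu Haozheng). Chinese Event Detection with Syntax and Full Text Information Enhancement[J]. 数据采集与处理 (Journal of Data Acquisition and Processing), 2022, 37(5): 1059-1069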

History
  • Received: 2021-08-01
  • Last revised: 2022-01-16
  • Published online: 2022-10-12