Gender Opposition Speech Recognition Method of Fusing Multi-feature and Emoji Sentiment Lexicon

Affiliation:

1. School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232001, China; 2. Institute of Artificial Intelligence Research, Hefei Comprehensive National Science Center, Hefei 232088, China

Fund Project:

National Natural Science Foundation of China, General Program (62076006); University Synergy Innovation Program of Anhui Province (GXXT-2021-008).

    Abstract:

    To identify extreme speech related to gender opposition, this paper proposes a recognition method that fuses multiple features with an emoji sentiment lexicon. First, BERT (Bidirectional Encoder Representations from Transformers) extracts character features from the input text, and Word2Vec extracts features from three further encodings of the text: Wubi, Zhengma, and Pinyin. These four feature sets are then fused and fed into a Bi-GRU (bidirectional gated recurrent unit) network to learn deeper semantic information. Finally, a fully connected layer followed by a SoftMax function computes the sentiment polarity probabilities, which are combined with the emoji sentiment lexicon to decide whether the input text is gender opposition speech. In experiments on a self-collected Chinese gender opposition dataset, the proposed method improves the F1 score by 5.19% over the method without the multiple features and the emoji sentiment lexicon. Further experiments on the public Chinese sentiment analysis dataset Weibo_senti_100k verify that the method generalizes.
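The abstract does not say how the SoftMax output is combined with the emoji sentiment lexicon. As a rough illustration only, the sketch below assumes a simple weighted late fusion: the classifier's probability for the "opposition" class is averaged with a probability derived from the mean lexicon score of the emojis in the text. The lexicon entries, the weight `alpha`, the mapping from sentiment score to probability, and the function names are all hypothetical and are not taken from the paper.

```python
import math

# Hypothetical emoji sentiment lexicon with scores in [-1, 1].
# The entries and values are illustrative, not from the paper.
EMOJI_LEXICON = {
    "\U0001F621": -0.9,  # pouting face
    "\U0001F644": -0.6,  # face with rolling eyes
    "\U0001F60A": 0.8,   # smiling face with smiling eyes
    "\U0001F44D": 0.7,   # thumbs up
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def opposition_probability(logits, text, alpha=0.8):
    """Fuse classifier output with emoji lexicon evidence (assumed scheme).

    logits: [non_opposition, opposition] scores, standing in for the
            output of the fully connected layer.
    alpha:  assumed weight given to the classifier; 1 - alpha goes to
            the lexicon.
    """
    p_model = softmax(logits)[1]
    scores = [EMOJI_LEXICON[ch] for ch in text if ch in EMOJI_LEXICON]
    if not scores:
        # No known emoji in the text: rely on the classifier alone.
        return p_model
    mean_score = sum(scores) / len(scores)
    # Assumption: negative emoji sentiment counts as evidence for
    # opposition, so a score of -1 maps to 1.0 and +1 maps to 0.0.
    p_emoji = (1.0 - mean_score) / 2.0
    return alpha * p_model + (1.0 - alpha) * p_emoji


if __name__ == "__main__":
    p = opposition_probability([0.2, 0.4], "example text \U0001F621")
    print("opposition" if p >= 0.5 else "non-opposition", round(p, 3))
```

With this scheme a text without any lexicon emoji falls back to the classifier probability unchanged, so the lexicon only shifts the decision when emoji evidence is actually present.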

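Cite this article

MA Zichen, ZHANG Shunxiang, LIU Yunduo, ZHU Guangli. Gender Opposition Speech Recognition Method of Fusing Multi-feature and Emoji Sentiment Lexicon[J]. Journal of Data Acquisition and Processing, 2024, (3): 699-709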

History
  • Received: 2023-07-18
  • Last revised: 2023-10-10
  • Published online: 2024-06-14