Deep and Shallow Feature Fusion Based on Graph Convolution for Cross-Corpus Emotion Recognition
Affiliation:

1. School of Physics and Electronic Engineering, Jiangsu Normal University, Xuzhou 221116, China; 2. Kewen College, Jiangsu Normal University, Xuzhou 221116, China; 3. School of Linguistic Sciences and Arts, Jiangsu Normal University, Xuzhou 221116, China

Fund Project:

Natural Science Foundation of the Jiangsu Higher Education Institutions of China (18KJB510013).

    Abstract:

    The training and testing data for speech emotion recognition often come from different corpora, and the marked mismatch between their feature spaces leads to very low recognition rates. To address this problem, we propose a new graph construction method that represents the topological structure between the source and target databases, and we use a graph convolutional network for cross-corpus emotion recognition. In addition, to address the low recognition rate obtained with a single type of emotion feature, we propose a novel feature fusion method. First, shallow acoustic features are extracted with OpenSMILE; deep features are then extracted with the graph convolutional network. As the convolutional layers deepen, each node's feature information is propagated to other nodes, so the deep features contain clearer node feature information and more detailed semantic information. Finally, the shallow and deep features are fused. Two experiments are carried out for validation. In the first, the model is trained on the eNTERFACE corpus and tested on the Berlin corpus, giving a recognition rate of 59.375%. In the second, the model is trained on the Berlin corpus and tested on the eNTERFACE corpus, giving a recognition rate of 36.111%. These results exceed both the baseline system and the best results reported in the literature, which demonstrates the effectiveness of the proposed method.
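    The pipeline in the abstract (shallow features per utterance, graph convolution over a graph of utterances, then fusion) can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the graph construction rule, the layer formula, or the fusion rule, so the code below assumes the widely used symmetric-normalized propagation ReLU(D^-1/2 (A + I) D^-1/2 H W) and assumes fusion by simple concatenation. The toy adjacency matrix, the two-dimensional "shallow" features, and the weight matrix are all hypothetical placeholders, not OpenSMILE output or trained parameters.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 (assumed normalization, with self-loops)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]


def gcn_layer(a_norm, h, w):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    z = matmul(matmul(a_norm, h), w)
    return [[max(0.0, v) for v in row] for row in z]


def fuse(shallow, deep):
    """Assumed fusion rule: concatenate each node's shallow and deep features."""
    return [s + d for s, d in zip(shallow, deep)]


# Hypothetical toy graph: 3 utterances (nodes); nodes 0 and 1 are linked,
# node 2 has no neighbours other than its own self-loop.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 0.0],
       [0.0, 0.0, 0.0]]
shallow = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]  # stand-in for acoustic features
weights = [[1.0, -1.0], [0.5, 0.5]]             # stand-in for learned weights

a_norm = normalize_adjacency(adj)
deep = gcn_layer(a_norm, shallow, weights)  # neighbours' information is mixed in
fused = fuse(shallow, deep)                 # 2 shallow + 2 deep values per node
```

    In this toy graph, nodes 0 and 1 end up with the same deep features because each averages over the same two-node neighbourhood, which is the "information passed between nodes" effect the abstract describes; the shallow half of each fused vector still keeps the two utterances distinguishable.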

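Cite this article

Yang Zixiu, Jin Yun, Ma Yong, Dai Yanyan, Yu Jiajia, Gu Yu. Deep and Shallow Feature Fusion Based on Graph Convolution for Cross-Corpus Emotion Recognition[J]. Journal of Data Acquisition and Processing, 2023, 38(1): 111-120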

History
  • Received: 2022-01-07
  • Last revised: 2022-04-19
  • Published online: 2023-01-25