Speech Emotion Recognition with Multi-task Learning
Affiliation:

1. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China; 2. Research & Development Group, iFLYTEK Co. Ltd., Hefei 230088, China

CLC Number:

TP183

    Abstract:

    In recent speech emotion recognition research, deep learning models are used to identify emotion from speech signals. However, traditional single-task models pay too little attention to the acoustic emotional information in speech, which results in low recognition accuracy. To address this, this paper proposes an end-to-end speech emotion recognition network based on multi-task learning that mines acoustic emotion in speech and improves recognition accuracy. To avoid the information loss caused by frequency-domain features, the model adopts Wav2vec 2.0 as its backbone network to extract acoustic and semantic features from speech, and an attention mechanism fuses the two kinds of features into self-supervised features. To make full use of the acoustic emotional information in speech, emotion-related phoneme recognition serves as an auxiliary task, and the multi-task learning model mines acoustic emotion from the self-supervised features. Experimental results on the public IEMOCAP dataset show that the proposed multi-task learning model achieves a weighted accuracy of 76.0% and an unweighted accuracy of 76.9%, a significant improvement over the traditional single-task learning model. Ablation experiments further verify the effectiveness of the auxiliary task and of the fine-tuning strategy for the self-supervised network.
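
The abstract reports both a weighted and an unweighted accuracy. As a minimal sketch, assuming the definitions commonly used for IEMOCAP evaluations (weighted accuracy as overall utterance-level accuracy, unweighted accuracy as the mean of per-class recall), the two metrics could be computed as below. The function names and the toy labels are hypothetical and are not taken from the paper, whose exact evaluation protocol is not stated in the abstract.

```python
from collections import defaultdict


def weighted_accuracy(y_true, y_pred):
    """Overall accuracy: correctly classified utterances / all utterances.

    Assumed definition; classes with more utterances weigh more.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every emotion class counts equally.

    Assumed definition; only classes present in y_true are averaged.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    total = defaultdict(int)  # utterances per true class
    hit = defaultdict(int)    # correctly recognised utterances per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            hit[t] += 1
    return sum(hit[c] / total[c] for c in total) / len(total)


# Toy, made-up labels for illustration only (not data from the paper).
toy_true = ["neutral"] * 6 + ["angry"] * 2
toy_pred = ["neutral"] * 6 + ["angry", "neutral"]

print(weighted_accuracy(toy_true, toy_pred))    # 7 of 8 utterances correct
print(unweighted_accuracy(toy_true, toy_pred))  # mean of recall 1.0 and 0.5
```

On imbalanced data the two numbers diverge: in the toy example the majority class is recognised perfectly, so the overall accuracy is higher than the class-averaged one. Reporting both, as the abstract does, shows whether a model favours frequent emotion classes.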

Get Citation

LI Yunfeng, YAN Zulong, GAO Tian, FANG Xin, ZOU Liang. Speech Emotion Recognition with Multi-task Learning[J].,2024,39(2):424-432.

History
  • Received: December 11, 2022
  • Revised: April 25, 2023
  • Online: March 25, 2024