School of Software, Dalian Neusoft University of Information, Dalian 116023, China
CLC Number: TP183
Abstract:
Speech emotion recognition (SER) is key to enabling computers to understand human emotion and plays an important role in human-computer interaction. When an emotional speech signal is transmitted through different media, traditional deep learning models achieve insufficient recognition accuracy and transfer poorly. To address this, an acoustic wave equation emotion recognition model is designed: the image saliency gated recurrent acoustic wave equation emotion recognition (ISGR-AWEER) model. It consists of two parts, image saliency extraction and a gated recurrent model. The first part simulates the attention mechanism and extracts the salient regions of speech. The second part is an acoustic wave equation emotion recognition model that simulates a recurrent neural network; it effectively improves cross-media SER accuracy and allows the model to be transferred quickly across media. The model is validated by experiments on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) emotional corpus and a self-built multi-media emotional speech corpus. Compared with a recurrent neural network, emotion recognition accuracy improves by 25%, and the model shows strong cross-media transfer ability.
Citation:
JIA Ning, ZHENG Chunjun. An Acoustic Wave Equation Emotion Recognition Model Based on Image Saliency[J].,2021,36(5):1062-1072.