Abstract: This paper studies the relation between the emotion dimensional space and speech features, and addresses the problem of automatic speech emotion recognition. A dimensional space model is introduced for the basic emotions. Speech emotion features are extracted according to the arousal and valence dimensions, and statistical features are used to reduce the influence of text variation on the emotional features. Anger, happiness, sadness, and the neutral state are studied. A Gaussian mixture model is adopted to model and recognize the four emotion classes, and the number of Gaussian mixture components is optimized experimentally so that the model closely approximates the probability distribution in the feature space. The experimental results show that the features used in this paper are suitable for recognizing the basic emotions. The Gaussian mixture model achieves satisfactory classification results, and the recognition experiments demonstrate the importance of the valence features in the two-dimensional space.
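The classification step summarized above can be illustrated with a minimal sketch: one Gaussian mixture model per emotion class, with a feature vector assigned to the class whose model gives it the highest likelihood. Everything specific here is an assumption made for illustration and is not taken from the paper: the diagonal covariances, the equal class priors, the two-dimensional feature layout (one arousal-related and one valence-related value), and all numeric parameters in the hypothetical MODELS table. In practice the mixture parameters would be estimated from training data rather than set by hand.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log p(x | class) for a GMM given as a list of (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian_diag(x, mean, var) for w, mean, var in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def classify(x, class_models):
    """Return the label whose GMM gives x the highest likelihood (equal priors assumed)."""
    return max(class_models, key=lambda label: gmm_log_likelihood(x, class_models[label]))


# Hypothetical hand-set models over (arousal-related, valence-related) features.
# These numbers are illustrative only, not values from the paper.
MODELS = {
    "anger": [(0.5, [2.0, -1.0], [0.5, 0.5]), (0.5, [2.5, -1.5], [0.5, 0.5])],
    "happiness": [(0.5, [2.0, 1.0], [0.5, 0.5]), (0.5, [2.5, 1.5], [0.5, 0.5])],
    "sadness": [(1.0, [-2.0, -1.0], [0.5, 0.5])],
    "neutral": [(1.0, [0.0, 0.0], [0.5, 0.5])],
}

if __name__ == "__main__":
    print(classify([2.2, -1.2], MODELS))
    print(classify([2.2, 1.2], MODELS))
```

In this toy setup the anger and happiness models share the same arousal-related values and differ only in the valence-related one, so separating them depends entirely on the second feature. This is one way to picture why valence features can matter for recognition.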