Front. Inform. Technol. Electron. Eng.  2013, Vol. 14 Issue (7): 573-582    DOI: 10.1631/jzus.CIDE1310
    
Speaker-independent speech emotion recognition by fusion of functional and accompanying paralanguage features
Qi-rong Mao, Xiao-lei Zhao, Zheng-wei Huang, Yong-zhao Zhan
Department of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China

Abstract  Functional paralanguage carries considerable emotion information and is insensitive to speaker changes. To improve emotion recognition accuracy in the speaker-independent condition, a fusion method that combines functional paralanguage features with accompanying paralanguage features is proposed for speaker-independent speech emotion recognition. In this method, functional paralanguage, such as laughter, crying, and sighing, is used to assist speech emotion recognition. The contributions of our work are threefold. First, an emotional speech database containing six kinds of functional paralanguage and six typical emotions was recorded by our research group. Second, functional paralanguage is combined with the accompanying paralanguage features to recognize speech emotions. Third, a fusion algorithm based on confidences and probabilities is proposed to combine the functional paralanguage features with the accompanying paralanguage features for speech emotion recognition. We evaluate the usefulness of the functional paralanguage features and the fusion algorithm in terms of precision, recall, and F1-measure on the emotional speech database recorded by our research group. Using the functional paralanguage features, the overall recognition accuracy for six emotions exceeds 67% in the speaker-independent condition.
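The abstract describes fusing two feature streams through a decision-level rule driven by confidences and probabilities. As a minimal sketch of that idea, the following assumes each sub-classifier outputs a probability vector over the six emotion classes and weights each vector by a confidence score; the function names, class labels, and the confidence definition (margin between the two most probable classes) are illustrative assumptions, not the paper's exact algorithm.

```python
EMOTIONS = ["happiness", "sadness", "anger", "fear", "surprise", "disgust"]

def confidence(probs):
    """Confidence as the margin between the two most probable classes."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]

def fuse(probs_functional, probs_accompanying):
    """Confidence-weighted average of the two probability vectors,
    returning the emotion label with the highest fused score."""
    c_f = confidence(probs_functional)
    c_a = confidence(probs_accompanying)
    total = (c_f + c_a) or 1.0  # avoid division by zero when both are flat
    fused = [(c_f * pf + c_a * pa) / total
             for pf, pa in zip(probs_functional, probs_accompanying)]
    return EMOTIONS[max(range(len(fused)), key=fused.__getitem__)]
```

With this weighting, a confident functional-paralanguage decision (e.g., detected laughter) can outvote an uncertain accompanying-paralanguage decision, which matches the abstract's claim that functional paralanguage assists recognition when speakers vary.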

Key words: Speech emotion recognition; Speaker-independent; Functional paralanguage; Fusion algorithm; Recognition accuracy
Received: 29 December 2012      Published: 05 July 2013
CLC:  TP391.4  
Cite this article:

Qi-rong Mao, Xiao-lei Zhao, Zheng-wei Huang, Yong-zhao Zhan. Speaker-independent speech emotion recognition by fusion of functional and accompanying paralanguage features. Front. Inform. Technol. Electron. Eng., 2013, 14(7): 573-582.

URL:

http://www.zjujournals.com/xueshu/fitee/10.1631/jzus.CIDE1310     OR     http://www.zjujournals.com/xueshu/fitee/Y2013/V14/I7/573

