Front. Inform. Technol. Electron. Eng.  2013, Vol. 14 Issue (7): 573-582    DOI: 10.1631/jzus.CIDE1310
    
Speaker-independent speech emotion recognition by fusion of functional and accompanying paralanguage features
Qi-rong Mao, Xiao-lei Zhao, Zheng-wei Huang, Yong-zhao Zhan
Department of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China

Abstract  Functional paralanguage carries considerable emotion information and is insensitive to speaker changes. To improve emotion recognition accuracy in the speaker-independent condition, a fusion method that combines functional paralanguage features with accompanying paralanguage features is proposed for speaker-independent speech emotion recognition. In this method, functional paralanguage, such as laughter, crying, and sighing, is used to assist speech emotion recognition. The contributions of our work are threefold. First, an emotional speech database containing six kinds of functional paralanguage and six typical emotions was recorded by our research group. Second, functional paralanguage is combined with the accompanying paralanguage features to recognize speech emotions. Third, a fusion algorithm based on confidences and probabilities is proposed to combine the functional paralanguage features with the accompanying paralanguage features for speech emotion recognition. We evaluate the usefulness of the functional paralanguage features and the fusion algorithm in terms of precision, recall, and F1-measure on the emotional speech database recorded by our research group. Using the functional paralanguage features, the overall recognition accuracy for six emotions exceeds 67% in the speaker-independent condition.
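The abstract describes fusing two feature streams through a decision-level rule driven by confidences and probabilities. As a minimal sketch of that idea, the following assumes each sub-classifier outputs a probability vector over the six emotion classes and weights each vector by a confidence score; the function names, class labels, and the confidence definition (margin between the two most probable classes) are illustrative assumptions, not the paper's exact algorithm.

```python
EMOTIONS = ["happiness", "sadness", "anger", "fear", "surprise", "disgust"]

def confidence(probs):
    """Confidence as the margin between the two most probable classes."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]

def fuse(probs_functional, probs_accompanying):
    """Confidence-weighted average of the two probability vectors,
    returning the emotion label with the highest fused score."""
    c_f = confidence(probs_functional)
    c_a = confidence(probs_accompanying)
    total = (c_f + c_a) or 1.0  # avoid division by zero when both are flat
    fused = [(c_f * pf + c_a * pa) / total
             for pf, pa in zip(probs_functional, probs_accompanying)]
    return EMOTIONS[max(range(len(fused)), key=fused.__getitem__)]
```

With this weighting, a confident functional-paralanguage decision (e.g., detected laughter) can outvote an uncertain accompanying-paralanguage decision, which matches the abstract's claim that functional paralanguage assists recognition when speakers vary.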

Key words: Speech emotion recognition; Speaker-independent; Functional paralanguage; Fusion algorithm; Recognition accuracy
Received: 29 December 2012      Published: 05 July 2013
CLC:  TP391.4  
Cite this article:

Qi-rong Mao, Xiao-lei Zhao, Zheng-wei Huang, Yong-zhao Zhan. Speaker-independent speech emotion recognition by fusion of functional and accompanying paralanguage features. Front. Inform. Technol. Electron. Eng., 2013, 14(7): 573-582.

URL:

http://www.zjujournals.com/xueshu/fitee/10.1631/jzus.CIDE1310     OR     http://www.zjujournals.com/xueshu/fitee/Y2013/V14/I7/573

