Communication technology

Prediction of emotional dimensions PAD for emotional speech recognition
Ying SUN, Yan-xiang HU, Xue-ying ZHANG*, Shu-fei DUAN
College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, China

Abstract Existing emotional features analyze emotion only at the signal level and cannot directly reflect the emotional state. To address this, the continuous emotional dimensions PAD (pleasure, arousal, dominance) were introduced into emotion recognition. Experimental samples covering three emotions (sadness, anger and happiness) were taken from the TYUT2.0 database and the Berlin speech database, and emotional features (prosodic features, formants, MFCC and nonlinear features) were extracted. To obtain objective and accurate PAD values, grey relational analysis (GRA) was used to select the main features that affect P, A and D. Principal component analysis (PCA) was then applied to these main features, and the resulting principal components were used as the input of a least squares support vector machine (LSSVM) to predict P, A and D. The emotional features, the PAD dimensions and their fusion were each used for emotion recognition with a support vector machine. The experimental results show that the prediction method improves the prediction accuracy of P, A and D to a certain extent, and that the predicted values can effectively identify emotions and complement the emotional features in emotion recognition.
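
As a rough illustration of the GRA feature-selection step, the sketch below ranks candidate features by their grey relational grade against one PAD dimension. It is not taken from the paper: the min-max normalization, the distinguishing coefficient `rho = 0.5`, the data layout and the toy values are all assumptions made for the example.

```python
def min_max(values):
    """Scale a sequence to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in values]


def grey_relational_grades(features, target, rho=0.5):
    """Return one grey relational grade per feature sequence.

    features: list of sequences, one per feature, each holding one value
              per utterance (hypothetical layout).
    target:   reference sequence, e.g. annotated P, A or D values.
    rho:      distinguishing coefficient (0.5 is a common default).
    """
    ref = min_max(target)
    # Absolute difference between the reference and each normalized feature.
    deltas = [[abs(r - c) for r, c in zip(ref, min_max(f))] for f in features]
    d_min = min(min(row) for row in deltas)
    d_max = max(max(row) for row in deltas)
    if d_max == 0:  # every feature matches the reference exactly
        return [1.0] * len(features)
    # Grade = mean grey relational coefficient over all samples.
    return [
        sum((d_min + rho * d_max) / (d + rho * d_max) for d in row) / len(row)
        for row in deltas
    ]


# Toy data (made up): feature 0 tracks the target linearly, feature 1 does not.
pleasure = [1.0, 2.0, 3.0, 4.0, 5.0]
features = [[3.0, 5.0, 7.0, 9.0, 11.0], [5.0, 1.0, 4.0, 2.0, 3.0]]
grades = grey_relational_grades(features, pleasure)
# Feature indices ordered from most to least related to the target.
ranking = sorted(range(len(grades)), key=lambda i: grades[i], reverse=True)
```

In a pipeline like the one described, the top-ranked features for each of P, A and D would then be passed on to PCA; how many features to keep is a design choice the abstract does not specify.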
Received: 22 August 2018
Published: 30 September 2019
Corresponding Authors:
Xue-ying ZHANG
E-mail: tyutsy@163.com; tyzhangxy@163.com
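PAD emotional dimension prediction for emotional speech recognition
Existing emotional features analyze emotion only from the perspective of the signal and cannot intuitively reflect the emotional state. To address this problem, the continuous emotional dimensions PAD are introduced into emotion recognition. The experimental samples are three emotions (sadness, anger and happiness) from the TYUT2.0 database and the Berlin speech database, from which emotional features (prosodic features, formants, MFCC and nonlinear features) are extracted. To obtain objective and accurate PAD dimensions, grey relational analysis (GRA) is used to select the main features that affect P, A and D, principal component analysis (PCA) is used to extract the principal components of these main features, and the principal components are used as the input of a least squares support vector machine (LSSVM) to predict P, A and D. The emotional features, the PAD dimensions and their fusion are each used for emotion recognition with a support vector machine. The experimental results show that the prediction method improves the prediction accuracy of P, A and D to a certain extent, that the predicted values can effectively identify emotions, and that they complement the emotional features in emotion recognition.
Keywords:
speech emotion recognition,
PAD dimensions,
least squares support vector machine (LSSVM),
grey relational analysis (GRA),
principal component analysis (PCA)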
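
The LSSVM regression step can be sketched as follows. Training reduces to solving one linear system rather than a quadratic program. This is a minimal illustration only: the RBF kernel, the values of `GAMMA` and `SIGMA`, the helper names and the toy data are assumptions, and the rows of `X` merely stand in for principal-component scores.

```python
import math

GAMMA = 100.0  # regularization parameter (assumed value)
SIGMA = 1.0    # RBF kernel width (assumed value)


def rbf(a, b, sigma=SIGMA):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq / (2.0 * sigma ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def lssvm_fit(X, y, gamma=GAMMA):
    """Fit LSSVM regression and return (bias, alphas)."""
    n = len(X)
    # Block system: [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j]) for j in range(n)]
        row[i + 1] += 1.0 / gamma
        A.append(row)
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]


def lssvm_predict(X_train, bias, alphas, x):
    """Predict f(x) = bias + sum_i alpha_i * K(x_i, x)."""
    return bias + sum(a * rbf(xi, x) for a, xi in zip(alphas, X_train))


# Toy data (made up): one-dimensional inputs with a smooth target.
X = [[0.0], [0.5], [1.0], [1.5], [2.0]]
y = [0.0, 0.25, 1.0, 2.25, 4.0]
bias, alphas = lssvm_fit(X, y)
prediction = lssvm_predict(X, bias, alphas, [1.25])
```

One model of this kind would be trained per dimension (P, A and D). In practice `GAMMA` and `SIGMA` would be tuned, for example by cross-validation, and a numerical library would replace the hand-written solver.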