Front. Inform. Technol. Electron. Eng.  2012, Vol. 13 Issue (2): 139-145    DOI: 10.1631/jzus.C1100092
    
Detection of time varying pitch in tonal languages: an approach based on ensemble empirical mode decomposition
Hong Hong, Xiao-hua Zhu, Wei-min Su, Run-tong Geng, Xin-long Wang
School of Electronic Engineering and Optoelectronic Techniques, Nanjing University of Science and Technology, Nanjing 210094, China; State Key Laboratory of Modern Acoustics, Institute of Acoustics, Nanjing University, Nanjing 210093, China
Abstract: A method based on ensemble empirical mode decomposition (EEMD) is proposed for accurately detecting the time-varying pitch of speech in tonal languages. Unlike frame-, event-, or subspace-based pitch detectors, the method can accurately extract the time-varying pitch information within a short duration, which is crucial for speech processing in tonal languages. The Chinese Linguistic Data Consortium (CLDC) database for Mandarin Chinese was used as standard speech data to evaluate the method's effectiveness. The proposed method is shown to provide more accurate and reliable results, particularly when estimating tones with non-monotonically varying pitch, such as the third tone in Mandarin Chinese. The method is also shown to be strongly resistant to noise.
Key words: Ensemble empirical mode decomposition    Time varying pitch    Tonal language    Noise restraint
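
The abstract names EEMD but does not describe the algorithm, so the following is only a minimal sketch of generic EEMD, not the authors' pitch detector. It decomposes several noise-perturbed copies of a signal with plain EMD and averages the resulting modes. The function names (`emd`, `eemd`, `_envelope`, `_extrema`) and the parameters (`trials`, `noise_std`, `sift_iters`, `max_imfs`) are hypothetical, and several simplifications are assumed: linear rather than cubic-spline envelopes, a fixed number of sifting passes, and crude boundary handling.

```python
import math
import random

def _extrema(x, sign):
    # Indices of local maxima of sign*x (sign=-1 gives local minima of x).
    return [i for i in range(1, len(x) - 1)
            if sign * x[i] > sign * x[i - 1] and sign * x[i] >= sign * x[i + 1]]

def _envelope(x, sign):
    # Piecewise-linear envelope through the extrema (assumption: real
    # implementations normally use cubic splines). None if too few extrema.
    idx = _extrema(x, sign)
    if len(idx) < 2:
        return None
    # Hold the first/last extremum value out to the signal boundaries.
    knots = [0] + idx + [len(x) - 1]
    vals = [x[idx[0]]] + [x[i] for i in idx] + [x[idx[-1]]]
    env, k = [], 0
    for t in range(len(x)):
        while k < len(knots) - 2 and t > knots[k + 1]:
            k += 1
        t0, t1 = knots[k], knots[k + 1]
        w = (t - t0) / (t1 - t0)
        env.append(vals[k] + w * (vals[k + 1] - vals[k]))
    return env

def emd(x, max_imfs=4, sift_iters=10):
    # Plain EMD: repeatedly subtract the mean of the upper and lower
    # envelopes (sifting) to peel off one oscillatory mode at a time.
    residue, imfs = list(x), []
    for _ in range(max_imfs):
        if _envelope(residue, 1) is None or _envelope(residue, -1) is None:
            break  # too few extrema left to extract another mode
        h = list(residue)
        for _ in range(sift_iters):
            upper, lower = _envelope(h, 1), _envelope(h, -1)
            if upper is None or lower is None:
                break
            h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
        imfs.append(h)
        residue = [r - v for r, v in zip(residue, h)]
    return imfs, residue

def eemd(x, trials=20, noise_std=0.2, max_imfs=4, seed=0):
    # EEMD: add white Gaussian noise (scaled by the signal's standard
    # deviation), run EMD on each noisy copy, and average mode by mode.
    rng = random.Random(seed)
    n = len(x)
    mean = sum(x) / n
    scale = noise_std * math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    acc = [[0.0] * n for _ in range(max_imfs)]
    for _ in range(trials):
        noisy = [v + rng.gauss(0.0, scale) for v in x]
        imfs, _res = emd(noisy, max_imfs)
        for k, imf in enumerate(imfs):
            for t in range(n):
                acc[k][t] += imf[t] / trials
    return acc

# Example: a slow tone plus a faster, weaker tone, 200 samples.
signal = [math.sin(2 * math.pi * 5 * t / 200)
          + 0.5 * math.sin(2 * math.pi * 40 * t / 200) for t in range(200)]
modes = eemd(signal, trials=5)
```

Here `modes` holds `max_imfs` ensemble-averaged modes, each as long as the input. Tracking pitch from those modes is a separate step that the abstract does not detail, so it is not shown.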
Received: 2011-04-13    Published: 2012-01-19
CLC:  TN912.3  

Cite this article:

Hong Hong, Xiao-hua Zhu, Wei-min Su, Run-tong Geng, Xin-long Wang. Detection of time varying pitch in tonal languages: an approach based on ensemble empirical mode decomposition. Front. Inform. Technol. Electron. Eng., 2012, 13(2): 139-145.

Link to this article:

http://www.zjujournals.com/xueshu/fitee/CN/10.1631/jzus.C1100092
http://www.zjujournals.com/xueshu/fitee/CN/Y2012/V13/I2/139
