Computer Science and Artificial Intelligence
Improved deep belief network and its application in voice conversion
Wen-hao WANG, Xiao ZHANG, Yong-jing WAN*
School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China