Split vector quantization for sinusoidal amplitude and frequency

doi:10.1631/jzus.C1000020

Front. Inform. Technol. Electron. Eng.

2011, Vol. 12, Issue 2: 140-154

Pejman Mowlaee, Abolghasem Sayadian, Hamid Sheikhzadeh

Department of Electronic Engineering, Amirkabir University of Technology, Tehran 15875-4413, Iran

Abstract: In this paper, we propose applying a tree structure to the sinusoidal parameters. The proposed sinusoidal coder finds the coded sinusoidal parameters by minimizing a likelihood function in a least squares (LS) sense. From a rate-distortion standpoint, we address the problem of how to allocate the available bits among different frequency bands to code the sinusoids at each frame. To further analyze the quantization behavior of the proposed method, we compare its quantization performance with that of two other methods: the short-time Fourier transform (STFT) based coder commonly used for speech enhancement or separation, and the line spectral frequency (LSF) coder used in speech coding. Through extensive simulations, we show that, compared with previous STFT-based methods, the proposed quantizer yields lower spectral distortion as well as higher perceived quality for signals re-synthesized from the coded parameters in a model-based approach. The proposed method lowers the complexity and, owing to its tree structure, offers a rapid search capability. It provides flexibility for use in many speaker-independent applications by finding the most likely frequency vectors selected from a list of frequency candidates. Therefore, the proposed quantizer can be considered an attractive candidate for model-based speech applications in both speaker-dependent and speaker-independent scenarios.

Key words: Short-time Fourier transform; Split vector quantization; Sinusoidal modeling; Spectral distortion
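
For readers unfamiliar with the term, the general idea of split vector quantization can be sketched as follows. This is a generic illustration only, not the coder described in the paper: the function names, the band boundaries in `splits`, and the toy codebooks are all hypothetical, and neither the tree-structured search nor the LS criterion from the abstract is modeled. The vector is cut into sub-vectors (for example, one per frequency band), and each sub-vector is quantized with its own codebook, whose size sets the number of bits spent on that band.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def nearest(codebook: Sequence[Vector], x: Vector) -> int:
    """Return the index of the codeword with the smallest squared error to x."""
    def sq_err(c: Vector) -> float:
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(range(len(codebook)), key=lambda i: sq_err(codebook[i]))


def split_vq_encode(x: Vector,
                    splits: Sequence[Tuple[int, int]],
                    codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Quantize each sub-vector x[start:stop] with its own codebook."""
    return [nearest(cb, x[start:stop])
            for (start, stop), cb in zip(splits, codebooks)]


def split_vq_decode(indices: Sequence[int],
                    codebooks: Sequence[Sequence[Vector]]) -> List[float]:
    """Concatenate the selected codewords to rebuild the full vector."""
    out: List[float] = []
    for i, cb in zip(indices, codebooks):
        out.extend(cb[i])
    return out


# Toy example (all values are made up for illustration).
# Band 1 gets a 2-entry codebook (1 bit); band 2 gets a 4-entry codebook (2 bits).
splits = [(0, 2), (2, 4)]
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [1.0, 1.0]],
]
x = [0.9, 0.8, 0.1, 0.9]

indices = split_vq_encode(x, splits, codebooks)
x_hat = split_vq_decode(indices, codebooks)
print(indices, x_hat)
```

In this toy setup the two sub-codebooks hold 2 + 4 = 6 codewords yet can represent 2 × 4 = 8 distinct full vectors, which is the storage and search saving that splitting is meant to provide; giving the second band a larger codebook is a simple stand-in for allocating more bits to it.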

Received: 2010-01-28 Published: 2011-02-08

CLC:

TN912

Cite this article:

Pejman Mowlaee, Abolghasem Sayadian, Hamid Sheikhzadeh. Split vector quantization for sinusoidal amplitude and frequency. Front. Inform. Technol. Electron. Eng., 2011, 12(2): 140-154.

Link to this article:

http://www.zjujournals.com/xueshu/fitee/CN/10.1631/jzus.C1000020 or http://www.zjujournals.com/xueshu/fitee/CN/Y2011/V12/I2/140
