Electrical & Electronic Engineering
Understanding visual-auditory correlation from heterogeneous features for cross-media retrieval |
Hong ZHANG, Yan-yun WANG, Hong PAN, Fei WU |
College of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430081, China; School of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China; School of Elementary Education, Hangzhou Normal University, Hangzhou 310036, China; School of Information Engineering, Hangzhou Normal University, Hangzhou 310036, China |
|
|
Abstract Cross-media retrieval is an interesting research topic that seeks to remove the barriers among different modalities. To enable cross-media retrieval, it is necessary to find correlation measures between heterogeneous low-level features and to judge semantic similarity. This paper presents a novel approach to learning the cross-media correlation between visual and auditory features for image-audio retrieval. A semi-supervised correlation preserving mapping (SSCPM) method is described to construct an isomorphic SSCPM subspace in which the canonical correlations between the original visual and auditory features are preserved. A subspace optimization algorithm is proposed to improve the quality of local image clusters and audio clusters in an interactive way. A unique relevance feedback strategy is developed to update the learned cross-media correlation from user behaviors, so retrieval performance is enhanced progressively. Experimental results show that our approach is effective.
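The SSCPM subspace described in the abstract builds on canonical correlation analysis (CCA) between visual and auditory feature spaces. The sketch below is not the authors' SSCPM method; it is a minimal, self-contained CCA in NumPy (whitening plus SVD) illustrating how two heterogeneous feature sets can be projected into a shared subspace where their correlation is maximized. All variable names and the toy data are illustrative.

```python
import numpy as np

def cca(X, Y, k=2, reg=1e-6):
    """Plain CCA via whitening + SVD (illustrative, not SSCPM).

    X: (n, p) visual features, Y: (n, q) auditory features.
    Returns projections Wx (p, k), Wy (q, k) and the top-k
    canonical correlations.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Regularized covariance blocks (reg keeps the inverses stable).
    Sxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / n

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric PD matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Sxx_i, Syy_i = inv_sqrt(Sxx), inv_sqrt(Syy)
    # SVD of the whitened cross-covariance; singular values are
    # the canonical correlations.
    U, s, Vt = np.linalg.svd(Sxx_i @ Sxy @ Syy_i)
    Wx = Sxx_i @ U[:, :k]
    Wy = Syy_i @ Vt[:k].T
    return Wx, Wy, s[:k]

# Toy example: "visual" and "auditory" features driven by a
# shared 2-D latent factor, plus noise.
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 2))
X = z @ rng.normal(size=(2, 10)) + 0.1 * rng.normal(size=(200, 10))
Y = z @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(200, 6))

Wx, Wy, corrs = cca(X, Y, k=2)
# Projected features live in a shared (isomorphic) subspace, so
# cross-media similarity can be measured directly between them.
Zx = (X - X.mean(axis=0)) @ Wx
Zy = (Y - Y.mean(axis=0)) @ Wy
```

In an image-audio retrieval setting along these lines, a query image would be mapped by `Wx` and compared (e.g. by Euclidean or cosine distance) against audio clips mapped by `Wy`; the semi-supervised and relevance-feedback components of SSCPM refine such a subspace beyond what plain CCA provides.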
Received: 11 April 2007
Published: 10 January 2008