Sketch based image retrieval based on abstract-level transform and convolutional neural networks

doi:10.3785/j.issn.1008-9497.2016.06.005

Journal of Zhejiang University (Science Edition)

2016, Vol. 43, Issue 6: 657-663

Chinagraph 2016: Digital Image Processing

Sketch based image retrieval based on abstract-level transform and convolutional neural networks

LIU Yujie¹, PANG Yunping¹, LI Zongmin¹, LI Hua²

1. College of Computer & Communication Engineering, China University of Petroleum, Qingdao 266580, Shandong Province, China;
2. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China

Abstract: Traditional methods for sketch-based image retrieval (SBIR) rely mainly on hand-crafted descriptors such as HOG and SIFT. To address the limitations of these descriptors, we propose an approach that combines an abstract-level transform with a convolutional neural network (CNN) to build a joint deep feature descriptor. The method proceeds in four steps: 1) extract boundary probability images from the database images; 2) convert the boundary probability images into images at different abstract levels; 3) feed the abstract-level images into the network and extract the output vectors of different hidden layers; 4) combine the output vectors of the different hidden layers into the final descriptor for retrieval (the joint deep feature descriptor). We evaluate the method on the Flickr15k dataset. The results show that combining the abstract-level transform with the joint deep feature descriptor improves retrieval performance significantly over traditional methods such as HOG and SIFT. The work improves SBIR in two respects, image preprocessing (boundary probability detection followed by the abstract-level transform) and descriptor construction (an improved combination of deep features), and achieves higher accuracy.

Key words: sketch based image retrieval; convolutional neural network; boundary probability detector; abstract-level transform; joint deep features
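
Steps 2 to 4 of the abstract, plus the final retrieval, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the abstract-level transform is computed, so thresholding the boundary probability map at several levels is used here purely as an assumed stand-in. The function names, the threshold values, and cosine similarity as the ranking measure are likewise assumptions, and the hidden-layer outputs are taken as plain lists of numbers rather than produced by a real CNN.

```python
import math


def abstract_level_images(edge_prob, thresholds=(0.2, 0.4, 0.6)):
    """Binarize a boundary probability map at several thresholds.

    ASSUMPTION: thresholding stands in for the abstract-level transform;
    a higher threshold keeps only the strongest contours.
    """
    return [[[1.0 if p >= t else 0.0 for p in row] for row in edge_prob]
            for t in thresholds]


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to (approximately) unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]


def joint_descriptor(layer_outputs):
    """Concatenate L2-normalized hidden-layer outputs into one vector."""
    joint = []
    for vec in layer_outputs:
        joint.extend(l2_normalize(vec))
    return joint


def rank_by_cosine(query, database):
    """Return database indices ordered from most to least similar."""
    q = l2_normalize(query)
    scores = [sum(a * b for a, b in zip(q, l2_normalize(d)))
              for d in database]
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return order, scores
```

In use, each database image would be converted to its abstract-level images, each of those passed through the network, and the resulting hidden-layer vectors handed to `joint_descriptor`; a sketch query processed the same way is then ranked against the database with `rank_by_cosine`. Normalizing each layer before concatenation keeps a layer with large activations from dominating the joint vector, which is one common choice rather than something the abstract specifies.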

Received: 2016-07-20; Published: 2017-03-07

CLC: TP391.41

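Funding: Supported by the National Natural Science Foundation of China (61379106); the Natural Science Foundation of Shandong Province (ZR2013FM036, ZR2015FM011); and the Open Fund of the Key Laboratory of CAD&CG, Zhejiang University (A1315).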

Corresponding author: LI Zongmin, ORCID: http://orcid.org/0000-0001-7006-055X, E-mail: lizongmin@upc.edu.cn

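About the author: LIU Yujie (1971-), ORCID: http://orcid.org/0000-0002-1838-874X, male, associate professor, Ph.D. His research covers computer graphics and image processing, multimedia data analysis, and multimedia databases.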

Cite this article:

LIU Yujie, PANG Yunping, LI Zongmin, LI Hua. Sketch based image retrieval based on abstract-level transform and convolutional neural networks[J]. Journal of Zhejiang University (Science Edition), 2016, 43(6): 657-663.

Link to this article:

https://www.zjujournals.com/sci/CN/10.3785/j.issn.1008-9497.2016.06.005 or https://www.zjujournals.com/sci/CN/Y2016/V43/I6/657

