Journal of ZheJiang University (Engineering Science)  2019, Vol. 53 Issue (3): 533-540    DOI: 10.3785/j.issn.1008-973X.2019.03.014
Computer Technology     
Multi-scale convolution target detection algorithm with feature pyramid
Zhi-jie LIN1,2, Zhuang LUO2, Lei ZHAO2,*, Dong-ming LU2
1. School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, China
2. School of Computer Science, Zhejiang University, Hangzhou 310027, China

Abstract  

A multi-scale feature pyramid network structure was constructed on the basis of the region proposal network and combined with the full convolution operation to detect small targets and class-independent targets in images. To improve the detection accuracy for small targets, a three-layer pyramid network based on lateral connection fusion was constructed, which makes full use of the image convolution features at low semantic levels. To improve the robustness of class-independent target detection, a specific non-maximum suppression algorithm was proposed, which eliminates redundant target windows when filtering overlapping targets and refines the locations of the remaining target windows. Experimental results on the PASCAL VOC 2007, PASCAL VOC 2012 and ancient painting datasets show that the proposed algorithm achieves higher detection accuracy than existing algorithms for small targets, multi-scale targets and class-independent targets.
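The abstract does not spell out the specific non-maximum suppression algorithm, so the following is only a minimal sketch of the general idea: greedy suppression of overlapping windows followed by a location refinement step. The refinement used here (a score-weighted average of each kept window and the windows it suppresses) is an assumption chosen for illustration, not the authors' method; the function names and the default threshold of 0.5 are likewise hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_with_refinement(
    boxes: List[Box], scores: List[float], iou_threshold: float = 0.5
) -> List[Tuple[Box, float]]:
    """Greedy NMS with an assumed refinement step.

    Each kept window is replaced by the score-weighted mean of itself and
    the windows it suppresses (illustrative choice, not from the paper).
    """
    # Process windows from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[Tuple[Box, float]] = []
    while order:
        best = order[0]
        # The best window plus every remaining window that overlaps it enough.
        cluster = [best] + [
            i for i in order[1:] if iou(boxes[best], boxes[i]) >= iou_threshold
        ]
        total = sum(scores[i] for i in cluster)
        if total > 0:
            refined = tuple(
                sum(scores[i] * boxes[i][k] for i in cluster) / total
                for k in range(4)
            )
        else:
            refined = boxes[best]  # fall back to the unrefined window
        kept.append((refined, scores[best]))
        # Drop the whole cluster; the redundant windows are eliminated.
        members = set(cluster)
        order = [i for i in order if i not in members]
    return kept
```

Calling `nms_with_refinement` on two heavily overlapping windows and one distant window returns two windows: the overlapping pair collapses into a single window whose coordinates lie between the two originals, and the distant window is kept unchanged.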



Key words: image target detection, image feature pyramid, multi-scale full convolution, small target detection, category-independent target detection
Received: 27 April 2018      Published: 04 March 2019
CLC:  TP 391  
Corresponding Authors: Lei ZHAO     E-mail: bytelin@qq.com;cszhl@zju.edu.cn
Cite this article:

Zhi-jie LIN,Zhuang LUO,Lei ZHAO,Dong-ming LU. Multi-scale convolution target detection algorithm with feature pyramid. Journal of ZheJiang University (Engineering Science), 2019, 53(3): 533-540.

URL:

http://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2019.03.014     OR     http://www.zjujournals.com/eng/Y2019/V53/I3/533


Fig.1 Structure chart of multi-scale and full convolution target detection network
Fig.2 Flowchart of non-maximum suppression algorithm
Fig.3 Bounding box location refinement module
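Network model                          Dataset                   P_ave / %
Multi-scale full convolution network   PASCAL VOC 2007           71.2
Faster-RCNN                            PASCAL VOC 2007           68.8
Mask-RCNN                              PASCAL VOC 2007           69.36
Multi-scale full convolution network   PASCAL VOC 2012           67.1
Faster-RCNN                            PASCAL VOC 2012           66.5
Mask-RCNN                              PASCAL VOC 2012           66.9
Multi-scale full convolution network   PASCAL VOC 2007 + 2012    74.8
Faster-RCNN                            PASCAL VOC 2007 + 2012    72.5
Mask-RCNN                              PASCAL VOC 2007 + 2012    73.26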
Tab.1 Comparisons of mean precision values of different network models on standard datasets
Fig.4 Variation of recall rate with intersection-over-union (IoU) for Faster-RCNN and multi-scale full convolution network on standard datasets
Fig.5 Results comparison between multi-scale full convolution network model and Faster-RCNN method in micro-target detection
Fig.6 Variation of recall rate with IoU on ancient painting image dataset
Fig.7 Results comparison between multi-scale full convolution network model and Fast-RCNN method in target detection on ancient painting image datasets
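The recall-versus-IoU curves named in the captions of Fig.4 and Fig.6 can be illustrated with a short sketch. The evaluation protocol is not given on this page, so the version below assumes the simplest definition: a ground-truth box counts as recalled when at least one detection overlaps it with IoU at or above the threshold, with no one-to-one matching. The function names and the sample thresholds are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_iou(
    ground_truth: Sequence[Box], detections: Sequence[Box], threshold: float
) -> float:
    """Fraction of ground-truth boxes covered by some detection at this IoU.

    Assumed simplification: a detection may cover several ground-truth boxes.
    """
    if not ground_truth:
        return 0.0
    hits = sum(
        1
        for g in ground_truth
        if any(iou(g, d) >= threshold for d in detections)
    )
    return hits / len(ground_truth)


def recall_curve(
    ground_truth: Sequence[Box],
    detections: Sequence[Box],
    thresholds: Sequence[float] = (0.5, 0.6, 0.7, 0.8, 0.9),
) -> List[Tuple[float, float]]:
    """Recall evaluated at each IoU threshold, as (threshold, recall) pairs."""
    return [(t, recall_at_iou(ground_truth, detections, t)) for t in thresholds]
```

Recall computed this way can only stay level or fall as the threshold rises, because a stricter overlap requirement never adds matches.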