Journal of ZheJiang University (Engineering Science)  2019, Vol. 53 Issue (6): 1218-1224    DOI: 10.3785/j.issn.1008-973X.2019.06.022
Computer and Automation Technology
Efficient multi-scale pedestrian detection algorithm with feature map aggregation
Yun CHEN, Xiao-dong CAI*, Xiao-xi LIANG, Meng WANG
School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China

Abstract  

An efficient multi-scale pedestrian detection algorithm based on convolutional neural network feature map aggregation was proposed to address the low accuracy and efficiency of pedestrian detection algorithms trained on hand-crafted features. An aggregation network was designed to combine high-level and low-level feature maps into a feature map with both good spatial resolution and semantic ability. An extension network was constructed to provide feature maps for multi-scale detection. In addition, the candidate regions were redesigned and a multi-scale detection network was built to improve localization accuracy. The feature map aggregation network, the extension network and the multi-scale pedestrian detection network were combined for end-to-end training. Experimental results show that, compared with algorithms based on hand-crafted features, the proposed algorithm effectively improves the accuracy of pedestrian detection and localization. On common hardware, the proposed approach provides real-time detection.
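The aggregation idea in the abstract can be illustrated with a minimal sketch. This page does not say how the aggregation network combines the maps, so the nearest-neighbour upsampling and channel-wise concatenation below are assumptions chosen for illustration, and the helper names `upsample_nearest` and `aggregate` are hypothetical. Feature maps are plain nested lists indexed as [channel][row][column].

```python
def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a feature map stored as [C][H][W] lists."""
    return [
        [
            [row[x // factor] for x in range(len(row) * factor)]
            for row in channel
            for _ in range(factor)  # repeat each widened row `factor` times
        ]
        for channel in fmap
    ]


def aggregate(low, high):
    """Upsample the high-level map to the low-level size, then stack channels.

    Assumption: the low-level height is an integer multiple of the
    high-level height, and the same ratio applies to the width.
    """
    factor = len(low[0]) // len(high[0])
    if factor < 1:
        raise ValueError("high-level map must not be larger than low-level map")
    up = upsample_nearest(high, factor)
    if (len(up[0]), len(up[0][0])) != (len(low[0]), len(low[0][0])):
        raise ValueError("spatial sizes do not match after upsampling")
    return low + up  # channel-wise concatenation


low = [[[0] * 4 for _ in range(4)]]           # 1 channel, 4x4 (fine spatial detail)
high = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]   # 2 channels, 2x2 (coarse, semantic)
fused = aggregate(low, high)
print(len(fused), len(fused[0]), len(fused[0][0]))  # 3 4 4
```

The fused map keeps the spatial size of the low-level input while carrying the channels of both inputs, which is the property the abstract attributes to the aggregated feature map.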



Key words: feature map aggregation, pedestrian detection, multi-scale, spatial resolution, semantic ability
Received: 09 May 2018      Published: 22 May 2019
CLC:  TP 391  
Corresponding Authors: Xiao-dong CAI     E-mail: 1655770801@qq.com;caixiaodong@guet.edu.cn
Cite this article:

Yun CHEN, Xiao-dong CAI, Xiao-xi LIANG, Meng WANG. Efficient multi-scale pedestrian detection algorithm with feature map aggregation. Journal of ZheJiang University (Engineering Science), 2019, 53(6): 1218-1224.

URL:

http://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2019.06.022     OR     http://www.zjujournals.com/eng/Y2019/V53/I6/1218


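Efficient multi-scale pedestrian detection algorithm with feature map aggregation

To address the low accuracy and efficiency of pedestrian detection algorithms trained on hand-crafted features, an efficient multi-scale pedestrian detection algorithm using convolutional neural network feature map aggregation is proposed. A feature map aggregation network is designed to aggregate high-level feature maps with low-level feature maps, producing feature maps with good spatial resolution and semantic ability; a feature extension network is constructed to provide feature maps for multi-scale pedestrian detection; the target candidate regions are redesigned and a multi-scale pedestrian detection network is constructed to improve localization accuracy; and the feature map aggregation network, the feature extension network and the multi-scale pedestrian detection network are combined for end-to-end training. Experimental results show that the algorithm effectively improves the accuracy of pedestrian detection and localization, and provides real-time detection on ordinary hardware.


Keywords: feature map aggregation, pedestrian detection, multi-scale, spatial resolution, semantic ability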
Fig.1 Low-level and high-level feature maps aggregation structure based on visual geometry group (VGG) network
Fig.2 Feature extension network structure based on multi-scale convolution
Fig.3 Pedestrian detection network structure based on candidate region classification and border regression of multi-scale feature maps
Fig.4 Example images of different scenes from INRIA pedestrian dataset
Fig.5 Example images of different scenes from ETH pedestrian dataset
Fig.6 Miss rate versus average number of false detections per image on INRIA pedestrian dataset
Algorithm            Mr1/%
LDCF[25]             13.8
SketchTokens[26]     13.3
Roerei[27]           13.5
RandForest[28]       15.4
SAF R-CNN[29]        8.0
YOLOv2[16]           13.0
SSD                  14.9
Proposed algorithm   12.1
Tab.1 Detection accuracy of different methods on INRIA pedestrian dataset
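The two quantities plotted in the miss-rate curves can be computed as in the following sketch. It is an illustration only: the exact definition of Mr1 is not given on this page, and the function name `miss_rate_and_fppi` and the input layout (a list of `(score, is_true_positive)` pairs that have already been matched against ground truth) are assumptions.

```python
def miss_rate_and_fppi(detections, num_ground_truth, num_images, threshold):
    """Return (miss rate, false detections per image) at one score threshold.

    detections: list of (score, is_true_positive) pairs, pooled over all
    images and already matched to ground-truth pedestrians (assumed input).
    """
    kept = [d for d in detections if d[0] >= threshold]
    true_pos = sum(1 for _, is_tp in kept if is_tp)
    false_pos = len(kept) - true_pos
    miss_rate = 1.0 - true_pos / num_ground_truth
    fppi = false_pos / num_images
    return miss_rate, fppi


# Hypothetical toy data: 4 annotated pedestrians spread over 2 images.
dets = [(0.9, True), (0.8, False), (0.7, True), (0.4, True), (0.3, False)]
print(miss_rate_and_fppi(dets, num_ground_truth=4, num_images=2, threshold=0.5))
# (0.5, 0.5)
```

Sweeping `threshold` from high to low traces one curve of the kind shown in Fig.6: the miss rate falls while the number of false detections per image rises.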
Fig.7 Miss rate versus average number of false detections per image on ETH pedestrian dataset
J/%   P/%    Q/%
50    94.9   95.5
60    91.3   92.3
70    78.6   81.1
80    53.6   55.0
90    13.7   14.9
Tab.2 Localization accuracy of proposed method and SSD method on INRIA pedestrian dataset
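Localization accuracy of this kind is commonly measured with the overlap between a predicted box and its ground-truth box. The sketch below assumes that J denotes an intersection-over-union (Jaccard) threshold, which this page does not state, and it does not identify which of P and Q belongs to which method. The function names and the `(x1, y1, x2, y2)` box layout are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def localized_fraction(pairs, threshold):
    """Fraction of (predicted, ground-truth) box pairs with IoU >= threshold."""
    hits = sum(1 for pred, truth in pairs if iou(pred, truth) >= threshold)
    return hits / len(pairs)


# Hypothetical boxes: one perfect match and one offset by half a box.
pairs = [((0, 0, 2, 2), (0, 0, 2, 2)), ((0, 0, 2, 2), (1, 1, 3, 3))]
print(localized_fraction(pairs, 0.5))  # 0.5
```

Raising the threshold makes the criterion stricter, which is consistent with both columns of Tab.2 dropping as J increases.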
Algorithm            Nper-sec
LDCF[25]             1.7
SketchTokens[26]     1.0
Roerei[27]           1.0
RandForest[28]       4.6
SAF R-CNN[29]        1.7
YOLOv2[16]           67
SSD                  35
Proposed algorithm   33
Tab.3 Comparison of number of detected pictures per second in different methods
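A number of pictures per second can be measured with a simple timing loop, sketched below. The measurement protocol behind Tab.3 is not described on this page, so this is only one plausible way to obtain such a figure; `images_per_second` and its `clock` parameter are hypothetical names, and `detect` stands for any detector callable.

```python
import time


def images_per_second(detect, images, clock=time.perf_counter):
    """Run `detect` on every image once and return the average throughput."""
    start = clock()
    for image in images:
        detect(image)
    elapsed = clock() - start
    return len(images) / elapsed if elapsed > 0 else float("inf")


# Deterministic demo with a fake clock: 66 images in 2 simulated seconds.
ticks = iter([0.0, 2.0])
rate = images_per_second(lambda image: None, [None] * 66, clock=lambda: next(ticks))
print(rate)  # 33.0
```

In practice the default `time.perf_counter` clock would be used, and a few warm-up runs before timing help avoid counting one-off start-up costs.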
[1]   XU Yan-wu, CAO Xian-bin, QIAO Hong. Survey on the latest development of pedestrian detection system and its key technologies expectation [J]. Acta Electronica Sinica, 2008, 36(5): 962-968.
doi: 10.3321/j.issn:0372-2112.2008.05.023
[2]   WANG Su-yu, SHEN Lan-sun. Intelligent visual surveillance technology: a survey [J]. Journal of Image and Graphics, 2008, 36(5): 962-968.
[3]   QIAO Chuan-biao, WANG Su-yu, ZHUO Li, et al. Object detection and tracking for intelligent video surveillance [J]. Measurement and Control Technology, 2008, 27(5): 22-24.
doi: 10.3969/j.issn.1000-8829.2008.05.007
[4]   PAPAGEORGIOU C, POGGIO T. A trainable system for object detection [J]. International Journal of Computer Vision, 2000, 38(1): 15-33.
doi: 10.1023/A:1008162616689
[5]   VIOLA P, JONES M J. Robust real-time face detection [J]. International Journal of Computer Vision, 2004, 57(2): 137-154.
doi: 10.1023/B:VISI.0000013087.49260.fb
[6]   VIOLA P, JONES M J, SNOW D. Detecting pedestrians using patterns of motion and appearance [J]. International Journal of Computer Vision, 2005, 63(2): 153-161.
doi: 10.1007/s11263-005-6644-8
[7]   VIOLA P, JONES M J, SNOW D. Detecting pedestrians using patterns of motion and appearance [C] // 9th IEEE International Conference on Computer Vision. Nice: IEEE, 2003: 734-741.
[8]   DALAL N, TRIGGS B. Histograms of oriented gradients for human detection [C] // Computer Vision and Pattern Recognition. San Diego: IEEE, 2005: 886-893.
[9]   WANG X Y, HAN T X, YAN S. An HOG-LBP human detector with partial occlusion handling [C] // 2009 IEEE 12th International Conference on Computer Vision. Kyoto: IEEE, 2009: 32-39.
[10]   FREUND Y, SCHAPIRE R E. Experiments with a new boosting algorithm [C] // International Conference on Machine Learning. Bari: ICML, 1996: 148-156.
[11]   GIRSHICK R B. Fast R-CNN [C] // IEEE International Conference on Computer Vision. Santiago: IEEE, 2015: 1440-1448.
[12]   GIRSHICK R B, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation [C] // IEEE Conference on Computer Vision and Pattern Recognition. Columbus: IEEE, 2014: 580-587.
[13]   UIJLINGS J R R, VAN DE SANDE K E A, GEVERS T, et al. Selective search for object recognition [J]. International Journal of Computer Vision, 2013, 104(2): 154-171.
doi: 10.1007/s11263-013-0620-5
[14]   REN S Q, HE K M, GIRSHICK R B, et al. Faster R-CNN: towards real-time object detection with region proposal networks [C] // Neural Information Processing Systems. Montreal: NIPS, 2015: 91-99.
[15]   LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector [C] // European Conference on Computer Vision. Amsterdam: ECCV, 2016: 21-37.
[16]   REDMON J, FARHADI A. YOLO9000: better, faster, stronger [C] // IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 6517-6525.
[17]   SZEGEDY C, LIU W, JIA Y Q, et al. Going deeper with convolutions [C] // IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 1-9.
[18]   SZEGEDY C, IOFFE S, VANHOUCKE V, et al. Inception-v4, Inception-ResNet and the impact of residual connections on learning [C] // Association for the Advancement of Artificial Intelligence. San Francisco: AAAI, 2017: 4278-4284.
[19]   SZEGEDY C, VANHOUCKE V, IOFFE S, et al. Rethinking the inception architecture for computer vision [C] // IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 2818-2826.
[20]   AHMED E, JONES M, MARKS T K. An improved deep learning architecture for person re-identification [C] // IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 3908-3916.
[21]   KONG T, YAO A B, CHEN Y R, et al. HyperNet: towards accurate region proposal generation and joint object detection [C] // IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 845-853.
[22]   IOFFE S, SZEGEDY C. Batch normalization: accelerating deep network training by reducing internal covariate shift [C] // International Conference on Machine Learning. Lille: ICML, 2015: 448-456.
[23]   LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation [C] // IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 3431-3440.
[24]   ESS A, LEIBE B, VAN GOOL L. Depth and appearance for mobile scene analysis [C] // IEEE International Conference on Computer Vision. Rio de Janeiro: IEEE, 2007: 1-8.
[25]   NAM W, DOLLAR P, HAN J H. Local decorrelation for improved pedestrian detection [C] // Neural Information Processing Systems. Montreal: NIPS, 2014: 424-432.
[26]   LIM J, ZITNICK C, DOLLAR P. Sketch tokens: a learned mid-level representation for contour and object detection [C] // IEEE Conference on Computer Vision and Pattern Recognition. Portland: IEEE, 2013: 3158-3165.
[27]   BENENSON R, MATHIAS M, TUYTELAARS T. Seeking the strongest rigid detector [C] // IEEE Conference on Computer Vision and Pattern Recognition. Portland: IEEE, 2013: 3666-3673.
[28]   MARIN J, VAZQUEZ D, LOPEZ A, et al. Random forests of local experts for pedestrian detection [C] // IEEE International Conference on Computer Vision. Sydney: IEEE, 2013: 2592-2599.