Efficient multi-scale pedestrian detection algorithm with feature map aggregation

doi:10.3785/j.issn.1008-973X.2019.06.022

Journal of ZheJiang University (Engineering Science)

2019, Vol. 53, Issue (6): 1218-1224

Computer and Automation Technology

Yun CHEN, Xiao-dong CAI*, Xiao-xi LIANG, Meng WANG

School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China

Abstract

An efficient multi-scale pedestrian detection algorithm based on convolutional neural network feature map aggregation was proposed to address the low accuracy and efficiency of pedestrian detection algorithms trained on hand-crafted features. An aggregation network was designed to combine high-level and low-level feature maps, producing a feature map with both good spatial resolution and strong semantic information. An extension network was constructed to provide feature maps for multi-scale detection. In addition, the candidate regions were redesigned to build a multi-scale detection network that improves localization accuracy. The feature map aggregation network, the extension network, and the multi-scale pedestrian detection network were combined and trained end to end. The experimental results show that, compared with algorithms based on hand-crafted features, the proposed algorithm effectively improves the accuracy of pedestrian detection and localization, and that it can provide real-time detection on common hardware.
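The aggregation idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the high-level and low-level feature maps are combined, so this example assumes one common choice: nearest-neighbor upsampling of the coarser high-level map followed by channel concatenation. The names `upsample_nearest` and `aggregate` are hypothetical, and the feature maps are toy nested lists rather than real VGG activations.

```python
from typing import List

Channel = List[List[float]]   # one H x W grid of activations
FeatureMap = List[Channel]    # C channels


def upsample_nearest(fmap: FeatureMap, factor: int) -> FeatureMap:
    """Repeat every value `factor` times along height and width."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(factor)]
            rows.extend([list(wide) for _ in range(factor)])
        out.append(rows)
    return out


def aggregate(low: FeatureMap, high: FeatureMap) -> FeatureMap:
    """Assumed aggregation: upsample the high-level map, then stack channels."""
    low_h, low_w = len(low[0]), len(low[0][0])
    high_h, high_w = len(high[0]), len(high[0][0])
    if low_h % high_h or low_w % high_w or low_h // high_h != low_w // high_w:
        raise ValueError("low-level size must be an integer multiple of high-level size")
    factor = low_h // high_h
    # Low-level channels keep fine spatial detail; the upsampled high-level
    # channels contribute coarser, more semantic responses.
    return low + upsample_nearest(high, factor)


# Toy input: 2 low-level channels at 4x4, 3 high-level channels at 2x2.
low = [[[float(r * 4 + c) for c in range(4)] for r in range(4)] for _ in range(2)]
high = [[[1.0, 2.0], [3.0, 4.0]] for _ in range(3)]
fused = aggregate(low, high)
print(len(fused), len(fused[0]), len(fused[0][0]))  # 5 4 4
```

The fused map keeps the 4x4 resolution of the low-level input while carrying all five channels, which is the property the abstract describes as combining spatial resolution with semantic information. A trained network would typically use learned upsampling and normalization instead of plain value repetition.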

Key words: feature map aggregation, pedestrian detection, multi-scale, spatial resolution, semantic ability

Received: 09 May 2018 Published: 22 May 2019

CLC: TP 391

Corresponding Authors: Xiao-dong CAI E-mail: 1655770801@qq.com;caixiaodong@guet.edu.cn

Cite this article:

Yun CHEN, Xiao-dong CAI, Xiao-xi LIANG, Meng WANG. Efficient multi-scale pedestrian detection algorithm with feature map aggregation. Journal of ZheJiang University (Engineering Science), 2019, 53(6): 1218-1224.

URL:

http://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2019.06.022 OR http://www.zjujournals.com/eng/Y2019/V53/I6/1218

特征图聚集多尺度行人检测高效算法

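To address the low accuracy and efficiency of pedestrian detection algorithms trained on hand-crafted features, an efficient multi-scale pedestrian detection algorithm using convolutional neural network feature map aggregation is proposed. A feature map aggregation network is designed that aggregates high-level feature maps with low-level feature maps to construct feature maps with good spatial resolution and semantic capability; a feature extension network is constructed to provide the feature maps used for multi-scale pedestrian detection; the object candidate regions are redesigned and a multi-scale pedestrian detection network is constructed to improve localization accuracy; and the feature map aggregation network, the feature extension network, and the multi-scale pedestrian detection network are combined for end-to-end training. Experimental results show that the algorithm effectively improves the accuracy of pedestrian detection and localization and can provide real-time detection on ordinary hardware.

Key words: feature map aggregation, pedestrian detection, multi-scale, spatial resolution, semantic ability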

Fig.1 Aggregation structure for low-level and high-level feature maps based on the visual geometry group (VGG) network

Fig.2 Feature extension network structure based on multi-scale convolution

Fig.3 Pedestrian detection network structure based on candidate region classification and bounding-box regression on multi-scale feature maps

Fig.4 Example images of different scenes from the INRIA pedestrian dataset

Fig.5 Example images of different scenes from the ETH pedestrian dataset

Fig.6 Miss rate versus average number of false detections per image on the INRIA pedestrian dataset

Tab.1 Detection accuracy of different methods on the INRIA pedestrian dataset

Fig.7 Miss rate versus average number of false detections per image on the ETH pedestrian dataset

Tab.2 Localization accuracy of the proposed method and the SSD method on the INRIA pedestrian dataset

Tab.3 Comparison of the number of images detected per second by different methods
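Fig.6 and Fig.7 plot the miss rate against the average number of false detections per image. As a rough illustration of how one point on such a curve can be computed, the sketch below assumes that each detection has already been matched to the ground truth (for example by an overlap criterion, with each ground-truth pedestrian matched at most once) and is labeled as a true or false positive. The function name `miss_rate_and_fppi` and the sample numbers are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def miss_rate_and_fppi(
    detections: List[Tuple[float, bool]],
    num_ground_truth: int,
    num_images: int,
    threshold: float,
) -> Tuple[float, float]:
    """Return (miss rate, false detections per image) at a score threshold.

    `detections` holds (confidence score, is_true_positive) pairs pooled
    over the whole test set; matching to ground truth is assumed done.
    """
    kept = [is_tp for score, is_tp in detections if score >= threshold]
    true_pos = sum(kept)
    false_pos = len(kept) - true_pos
    miss_rate = 1.0 - true_pos / num_ground_truth
    fppi = false_pos / num_images
    return miss_rate, fppi


# Toy data: 5 detections over 2 images containing 4 pedestrians in total.
detections = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.3, False)]
loose = miss_rate_and_fppi(detections, num_ground_truth=4, num_images=2, threshold=0.5)
strict = miss_rate_and_fppi(detections, num_ground_truth=4, num_images=2, threshold=0.75)
print(loose)   # (0.25, 0.5)
print(strict)  # (0.5, 0.0)
```

Sweeping the threshold from high to low traces the curve: a stricter threshold lowers the false detections per image but raises the miss rate, and a looser one does the opposite.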

