Computer and Automation Technology

Efficient multi-scale pedestrian detection algorithm with feature map aggregation
Yun CHEN, Xiao-dong CAI*, Xiao-xi LIANG, Meng WANG
School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China

Abstract An efficient multi-scale pedestrian detection algorithm based on convolutional neural network feature map aggregation is proposed to address the low accuracy and efficiency of pedestrian detection algorithms trained on hand-crafted features. A feature map aggregation network is designed to combine high-level and low-level feature maps into a single feature map that offers both good spatial resolution and strong semantic capability. A feature extension network is constructed to provide feature maps for multi-scale pedestrian detection. In addition, the candidate regions are redesigned to construct a multi-scale pedestrian detection network that improves localization accuracy. The feature map aggregation network, the feature extension network, and the multi-scale pedestrian detection network are combined and trained end to end. Experimental results show that, compared with algorithms based on hand-crafted features, the proposed algorithm effectively improves the accuracy of pedestrian detection and localization, and that it can provide real-time detection on ordinary hardware.
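The aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which operators the aggregation network uses, so the code below assumes one common choice: the coarse high-level map is enlarged by nearest-neighbour upsampling and then concatenated with the low-level map along the channel axis. The names `FeatureMap`, `upsample_nearest`, and `aggregate` are hypothetical and do not come from the paper.

```python
from typing import List

# A feature map is indexed as [channel][row][col] (illustrative type only).
FeatureMap = List[List[List[float]]]


def upsample_nearest(fmap: FeatureMap, factor: int) -> FeatureMap:
    """Enlarge every channel by an integer factor using nearest-neighbour copying."""
    return [
        [
            [row[c // factor] for c in range(len(row) * factor)]
            for row in channel
            for _ in range(factor)  # repeat each source row `factor` times
        ]
        for channel in fmap
    ]


def aggregate(low_level: FeatureMap, high_level: FeatureMap) -> FeatureMap:
    """Assumed aggregation: upsample the high-level map, then stack channels.

    The low-level map keeps spatial detail; the upsampled high-level map
    contributes semantic channels at the same resolution.
    """
    low_h, low_w = len(low_level[0]), len(low_level[0][0])
    high_h, high_w = len(high_level[0]), len(high_level[0][0])
    if low_h % high_h or low_w % high_w or low_h // high_h != low_w // high_w:
        raise ValueError("resolutions must differ by a single integer factor")
    factor = low_h // high_h
    return low_level + upsample_nearest(high_level, factor)


# One 4x4 low-level channel and two 2x2 high-level channels.
low = [[[float(r * 4 + c) for c in range(4)] for r in range(4)]]
high = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
fused = aggregate(low, high)
print(len(fused), len(fused[0]), len(fused[0][0]))
```

The fused map has three channels at the low-level resolution. In a trained network the concatenation would typically be followed by learned convolutions, which this sketch leaves out.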

Received: 09 May 2018
Published: 22 May 2019

Corresponding Author: Xiao-dong CAI
E-mail: 1655770801@qq.com; caixiaodong@guet.edu.cn

Keywords: feature map aggregation, pedestrian detection, multi-scale, spatial resolution, semantic capability