Efficient multi-scale pedestrian detection algorithm with feature map aggregation

doi:10.3785/j.issn.1008-973X.2019.06.022

Journal of Zhejiang University (Engineering Science)

2019, Vol. 53, Issue 6: 1218-1224

Computer and Automation Technology

Yun CHEN, Xiao-dong CAI*, Xiao-xi LIANG, Meng WANG

School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, Guangxi, China

Abstract:

To address the low accuracy and efficiency of pedestrian detection algorithms trained on hand-crafted features, an efficient multi-scale pedestrian detection algorithm based on convolutional neural network feature map aggregation is proposed. A feature map aggregation network is designed to merge high-level and low-level feature maps into a feature map with both good spatial resolution and good semantic capability. A feature extension network is constructed to provide feature maps for multi-scale pedestrian detection. The candidate regions are redesigned and a multi-scale pedestrian detection network is built to improve localization accuracy. The feature map aggregation network, the feature extension network and the multi-scale pedestrian detection network are combined and trained end to end. Experimental results show that, compared with algorithms based on hand-crafted features, the proposed algorithm effectively improves the accuracy of pedestrian detection and localization, and that it can provide real-time detection on common hardware.

Key words: feature map aggregation; pedestrian detection; multi-scale; spatial resolution; semantic ability
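The aggregation idea in the abstract (combining a high-resolution low-level feature map with a semantically richer high-level one) can be illustrated with a minimal NumPy sketch. The abstract does not say how the two maps are merged, so nearest-neighbour upsampling followed by channel concatenation is an assumption made here purely for illustration, and the function names `upsample_nearest` and `aggregate` are hypothetical rather than taken from the paper.

```python
import numpy as np


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    # Repeat every row, then every column, `factor` times.
    return fmap.repeat(factor, axis=1).repeat(factor, axis=2)


def aggregate(low, high):
    """Concatenate a low-level map with an upsampled high-level map.

    Assumption (not from the paper): the spatial size of `low` is an
    integer multiple of the spatial size of `high`, and the maps are
    merged by channel concatenation.
    """
    factor, rem = divmod(low.shape[1], high.shape[1])
    if rem != 0 or low.shape[2] != high.shape[2] * factor:
        raise ValueError("spatial sizes must differ by an integer factor")
    return np.concatenate([low, upsample_nearest(high, factor)], axis=0)


# Illustrative shapes only: 8 low-level channels at 32x32,
# 16 high-level channels at 8x8.
low = np.random.rand(8, 32, 32)
high = np.random.rand(16, 8, 8)
agg = aggregate(low, high)
print(agg.shape)
```

The aggregated map keeps the spatial size of the low-level input while carrying the channels of both inputs, which is the property the abstract attributes to the aggregated feature map.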

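Received: 2018-05-09 Published: 2019-05-22

CLC: TP 391

Corresponding author: Xiao-dong CAI E-mail: 1655770801@qq.com;caixiaodong@guet.edu.cn

About the author: Yun CHEN (1991-), male, master's student, working on deep learning and image processing. orcid.org/0000-0001-8438-5734. E-mail: 1655770801@qq.com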

Cite this article:

Yun CHEN, Xiao-dong CAI, Xiao-xi LIANG, Meng WANG. Efficient multi-scale pedestrian detection algorithm with feature map aggregation[J]. Journal of Zhejiang University (Engineering Science), 2019, 53(6): 1218-1224.

Link to this article:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2019.06.022 or http://www.zjujournals.com/eng/CN/Y2019/V53/I6/1218

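Fig. 1 Aggregation structure of low-level and high-level feature maps based on the Visual Geometry Group (VGG) network

Fig. 2 Feature extension network structure based on multi-scale convolution

Fig. 3 Pedestrian detection network structure based on candidate region classification and bounding box regression on multi-scale feature maps

Fig. 4 Example images of different scenes in the INRIA pedestrian dataset

Fig. 5 Example images of different scenes in the ETH pedestrian dataset

Fig. 6 Miss rate versus average false positives per image on the INRIA pedestrian dataset

Tab. 1 Comparison of detection accuracy of different methods on the INRIA pedestrian dataset

Fig. 7 Miss rate versus average false positives per image on the ETH pedestrian dataset

Tab. 2 Localization accuracy of the proposed algorithm and the SSD algorithm on the INRIA pedestrian dataset

Tab. 3 Comparison of the number of images detected per second by different algorithms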
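Figs. 6 and 7 plot miss rate against average false positives per image. As a rough sketch of how such a curve is commonly computed, the snippet below sweeps the score threshold over a ranked list of detections. It assumes each detection has already been matched to ground truth and flagged as a true or false positive; the matching rule, the function name `miss_rate_fppi` and the toy numbers are illustrative assumptions, not the paper's evaluation code or data.

```python
import numpy as np


def miss_rate_fppi(scores, is_tp, num_gt, num_images):
    """Miss rate and false positives per image at every score threshold.

    scores     : confidence of each detection (any order)
    is_tp      : True where the detection matched a ground-truth box
                 (matching is assumed to be done beforehand)
    num_gt     : total number of ground-truth pedestrians
    num_images : number of test images
    """
    order = np.argsort(-np.asarray(scores, dtype=float))  # highest score first
    tp = np.asarray(is_tp, dtype=bool)[order]
    tp_cum = np.cumsum(tp)    # true positives kept at each threshold
    fp_cum = np.cumsum(~tp)   # false positives kept at each threshold
    miss_rate = 1.0 - tp_cum / num_gt
    fppi = fp_cum / num_images
    return miss_rate, fppi


# Toy input: 4 detections over 2 images containing 4 pedestrians.
mr, fppi = miss_rate_fppi([0.9, 0.8, 0.7, 0.6], [True, False, True, True],
                          num_gt=4, num_images=2)
print(mr)    # miss rate as the threshold is lowered
print(fppi)  # false positives per image at the same thresholds
```

A lower curve is better: at a given number of false positives per image, fewer pedestrians are missed.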
