Object detection algorithm based on feature enhancement and deep fusion

Journal of Zhejiang University (Engineering Science), 2022, Vol. 56, Issue 12: 2403-2415
DOI: 10.3785/j.issn.1008-973X.2022.12.009

Computer Technology

Yu XIE1, Zi-qun BAO1, Na ZHANG1,*, Biao WU2, Xiao-mei TU1,3, Xiao-an BAO1

1. School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
2. School of Science, Zhejiang Sci-Tech University, Hangzhou 310018, China
3. School of Civil Engineering and Architecture, Zhejiang Guangsha Vocational and Technical University of Construction, Dongyang 322100, China


Abstract:

An object detection algorithm based on feature optimization and deep fusion was proposed to address the large detection errors of the single-stage multi-box detector algorithm (SSD) on small targets. SSD was improved with a spatial and channel feature enhancement (SCFE) module and a deep feature pyramid network (DFPN). The SCFE module optimizes the feature layers through a local spatial feature enhancement mechanism and a global channel feature enhancement mechanism, and focuses on the detail information of each feature layer. DFPN improves the feature pyramid network with a residual spatial and channel enhancement module, so that feature layers of different scales undergo deep feature fusion, which improves the accuracy of object detection. A sample weighted training strategy was added in the training stage, so that the network focuses on well-localized samples and high-confidence samples. The experimental results show that on the PASCAL VOC dataset, the proposed algorithm raises the detection accuracy from 77.2% for SSD to 79.7% while maintaining speed. On the COCO dataset, it raises the detection accuracy from 25.6% for SSD to 30.1%, and the detection accuracy for small targets from 6.8% for SSD to 13.3%.

Key words: object detection; deep feature pyramid network (DFPN); spatial and channel feature enhancement (SCFE); sample weighted training; single-stage multi-box detector algorithm (SSD)

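Received: 2022-01-05 Published: 2023-01-03

CLC: TP 391

Fund: Key Research and Development Program of Zhejiang Province (2020C03094); General Scientific Research Project of the Education Department of Zhejiang Province (Y202147659); Projects of the Education Department of Zhejiang Province (Y202250706, Y202250677); National Natural Science Foundation of China (6207050141); Basic Public Welfare Research Program of Zhejiang Province (QY19E050003)

Corresponding author: Na ZHANG E-mail: 1419352830@qq.com; zhangna@zstu.edu.cn

About the author: Yu XIE (born 1997), male, master's student, working on artificial intelligence and computer vision information processing. orcid.org/0000-0003-1067-3674. E-mail: 1419352830@qq.com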


Cite this article:

Yu XIE, Zi-qun BAO, Na ZHANG, Biao WU, Xiao-mei TU, Xiao-an BAO. Object detection algorithm based on feature enhancement and deep fusion[J]. Journal of Zhejiang University (Engineering Science), 2022, 56(12): 2403-2415.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2022.12.009 or https://www.zjujournals.com/eng/CN/Y2022/V56/I12/2403

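Fig. 1 Structures of three improved feature pyramid networks

Fig. 2 Network structure of the object detection algorithm based on feature optimization and deep fusion

Fig. 3 1 000 feature points randomly sampled from each of three feature layers and their values

Tab. 1 Parameter settings of default boxes in the feature pyramid network

Fig. 4 Structure of the spatial and channel feature enhancement module

Fig. 5 Heat map comparison of original features and enhanced features

Fig. 6 Feature pyramid structure

Tab. 2 Inputs and outputs of the feature pyramid network

Fig. 7 Structures of two residual spatial and channel feature enhancement modules

Fig. 8 Illustration of the effect of DIoU

Fig. 9 DIoU hierarchical local ranking

Tab. 3 Comparison of average detection accuracy on the VOC2007 test set

Tab. 4 Detection accuracy for different object categories on the VOC2007 test set

Fig. 10 Comparison of detection results of two algorithms on the VOC2007 dataset

Tab. 5 Experimental results of different algorithms on the COCO dataset

Fig. 11 Comparison of detection results of two algorithms on the COCO dataset

Fig. 12 Ablation experiment of the spatial and channel attention feature enhancement module

Fig. 13 Ablation experiment of the deep feature pyramid network

Tab. 6 Mean average precision of different modules of the proposed algorithm

Tab. 7 Evaluation results for different ways of combining channel attention and spatial attention

Tab. 8 Mean average precision of different module connection structures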
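
The captions of Fig. 8 and Fig. 9 refer to DIoU (Distance-IoU), a localization measure that subtracts a normalized centre-distance penalty from the ordinary IoU. The page does not reproduce the formula or the ranking procedure, so the following is only a minimal sketch of the standard DIoU definition, IoU minus the squared centre distance divided by the squared diagonal of the smallest enclosing box. The function name `diou`, the `(x1, y1, x2, y2)` box layout, and the sample boxes are illustrative assumptions, not the authors' code.

```python
def diou(box_a, box_b):
    """Distance-IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; the box layout is an assumption.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Ordinary IoU.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centres.
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
        + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
        + (max(ay2, by2) - min(ay1, by1)) ** 2

    return iou - d2 / c2 if c2 > 0 else iou


# Usage: order candidate boxes by how well they localize a ground-truth box.
ground_truth = (0, 0, 2, 2)
candidates = [(2, 2, 3, 3), (1, 0, 3, 2), (0, 0, 2, 2)]
ranked = sorted(candidates, key=lambda b: diou(b, ground_truth), reverse=True)
```

Unlike plain IoU, the value stays informative for non-overlapping boxes: it becomes negative and decreases as the centres move apart, which is what makes it usable for ordering samples by localization quality. How the paper groups and weights the ranked samples is not stated on this page.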

