Object detection algorithm based on feature enhancement and deep fusion

doi:10.3785/j.issn.1008-973X.2022.12.009

Journal of ZheJiang University (Engineering Science)

2022, Vol. 56, Issue (12): 2403-2415

Yu XIE1, Zi-qun BAO1, Na ZHANG1,*, Biao WU2, Xiao-mei TU1,3, Xiao-an BAO1

1. School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
2. School of Science, Zhejiang Sci-Tech University, Hangzhou 310018, China
3. School of Civil Engineering and Architecture, Zhejiang Guangsha Vocational and Technical University of Construction, Dongyang 322100, China

Abstract

An object detection algorithm based on feature optimization and deep fusion was proposed to address the large detection errors of the single-stage multi-box detector (SSD) on small targets. SSD was improved with a spatial and channel feature enhancement (SCFE) module and a deep feature pyramid network (DFPN). The SCFE module optimized the feature layers through local spatial feature enhancement and global channel feature enhancement mechanisms, focusing on the detail information of each feature layer. DFPN improved the feature pyramid network with a residual spatial and channel enhancement module, performing deep fusion of feature layers at different scales and improving detection accuracy. In addition, a sample weighted training strategy was added in the training stage, which made the network focus on well-localized and high-confidence training samples. Experimental results show that on the PASCAL VOC dataset, the proposed algorithm raised the detection accuracy from 77.2% for SSD to 79.7% while maintaining speed. On the COCO dataset, it raised the detection accuracy from 25.6% for SSD to 30.1%, and the detection accuracy for small targets from 6.8% to 13.3%.

Key words: object detection, deep feature pyramid network (DFPN), spatial and channel feature enhancement (SCFE), sample weighted training, single-stage multi-box detector algorithm (SSD)
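
The two enhancement mechanisms named in the abstract can be illustrated with a minimal sketch. The code below is not the paper's SCFE module: it assumes parameter-free gates (a sigmoid of each channel's global average, and a sigmoid of the per-position mean across channels) in place of the learned layers a real module would use, and all function names are hypothetical. It only shows how a channel gate and a spatial gate rescale a feature map, and how a residual connection adds the gated result back to the input.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_gates(feat):
    """One gate per channel, computed from that channel's global average.

    feat is a list of C channels, each an H x W list of lists of floats.
    Assumption: no learned weights, unlike a trained attention module.
    """
    gates = []
    for channel in feat:
        values = [v for row in channel for v in row]
        gates.append(sigmoid(sum(values) / len(values)))
    return gates


def spatial_gates(feat):
    """One gate per spatial position, computed from the mean over channels."""
    num_channels = len(feat)
    height, width = len(feat[0]), len(feat[0][0])
    return [
        [
            sigmoid(sum(feat[c][i][j] for c in range(num_channels)) / num_channels)
            for j in range(width)
        ]
        for i in range(height)
    ]


def enhance(feat, residual=True):
    """Scale every value by its channel gate and its spatial gate.

    With residual=True the gated value is added to the input value,
    so the original feature is preserved alongside the enhancement.
    """
    cg = channel_gates(feat)
    sg = spatial_gates(feat)
    out = []
    for c, channel in enumerate(feat):
        out_channel = []
        for i, row in enumerate(channel):
            out_row = []
            for j, v in enumerate(row):
                gated = v * cg[c] * sg[i][j]
                out_row.append(v + gated if residual else gated)
            out_channel.append(out_row)
        out.append(out_channel)
    return out
```

Calling `enhance(feat)` on a feature map stored as nested lists returns a map of the same shape; passing `residual=False` returns only the gated term.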

Received: 05 January 2022 Published: 03 January 2023

CLC: TP 391

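Fund: Key Research and Development Program of Zhejiang Province (2020C03094); General Scientific Research Project of the Zhejiang Provincial Department of Education (Y202147659); Projects of the Zhejiang Provincial Department of Education (Y202250706, Y202250677); National Natural Science Foundation of China (6207050141); Zhejiang Provincial Basic Public Welfare Research Program (QY19E050003)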

Corresponding Authors: Na ZHANG E-mail: 1419352830@qq.com;zhangna@zstu.edu.cn

Cite this article:

Yu XIE,Zi-qun BAO,Na ZHANG,Biao WU,Xiao-mei TU,Xiao-an BAO. Object detection algorithm based on feature enhancement and deep fusion. Journal of ZheJiang University (Engineering Science), 2022, 56(12): 2403-2415.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2022.12.009 OR https://www.zjujournals.com/eng/Y2022/V56/I12/2403

Fig.1 Structure diagram of three improved feature pyramid networks

Fig.2 Structure diagram of single-stage object detection algorithm based on feature enhancement and deep fusion

Fig.3 Values of one thousand randomly sampled feature points from each of three feature layers

Tab.1 Parameter settings of default box in feature pyramid network

Fig.4 Structure diagram of spatial channel feature enhancement module

Fig.5 Comparison between heat map of original feature and that of enhanced one

Fig.6 Feature pyramid structure

Tab.2 Input and output of feature pyramid network

Fig.7 Structural diagram of two residual spatial and channel attention feature enhancement modules

Fig.8 Schematic diagram of effect of DIoU

Fig.9 DIoU hierarchical local sorting

Tab.3 Comparison of mean average precision on VOC2007 test set

Tab.4 Detection accuracy for different object classes on VOC2007 test set

Fig.10 Comparison of detection results of two algorithms on VOC2007 dataset

Tab.5 Experiment results of different algorithms on COCO dataset

Fig.11 Comparison of detection results of two algorithms on COCO dataset

Fig.12 Ablation experiment of spatial and channel attention feature enhancement module

Fig.13 Ablation experiment of deep feature pyramid network

Tab.6 Mean average precision of different modules of proposed algorithm

Tab.7 Experimental results for different combinations of channel attention and spatial attention

Tab.8 Mean average precision of different module connection structures
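
Fig.8 and Fig.9 refer to DIoU (distance-IoU). As a minimal sketch under the standard definition, DIoU subtracts from the IoU the squared distance between the two box centers divided by the squared diagonal of the smallest box enclosing both. The function name and the (x1, y1, x2, y2) box format are assumptions for illustration; how the paper applies DIoU in its hierarchical local sorting is not shown here.

```python
def diou(box_a, box_b):
    """Distance-IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers.
    dx = (ax1 + ax2) / 2.0 - (bx1 + bx2) / 2.0
    dy = (ay1 + ay2) / 2.0 - (by1 + by2) / 2.0
    center_dist_sq = dx * dx + dy * dy

    # Squared diagonal of the smallest box enclosing both boxes.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    diag_sq = enclose_w * enclose_w + enclose_h * enclose_h
    if diag_sq == 0:
        return iou  # both boxes are the same single point

    return iou - center_dist_sq / diag_sq
```

Unlike plain IoU, which is 0 for every pair of non-overlapping boxes, DIoU becomes more negative as the centers move apart, so it can still rank disjoint boxes by distance.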

