Object detection algorithm based on feature enhancement and deep fusion
Yu XIE1, Zi-qun BAO1, Na ZHANG1,*, Biao WU2, Xiao-mei TU1,3, Xiao-an BAO1
1. School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
2. School of Science, Zhejiang Sci-Tech University, Hangzhou 310018, China
3. School of Civil Engineering and Architecture, Zhejiang Guangsha Vocational and Technical University of Construction, Dongyang 322100, China
Abstract An object detection algorithm based on feature optimization and deep fusion was proposed to address the large detection errors of the single shot multibox detector (SSD) on small targets. SSD was improved with a spatial and channel feature enhancement (SCFE) module and a deep feature pyramid network (DFPN). The SCFE module optimized the feature layers through local spatial feature enhancement and global channel feature enhancement mechanisms, focusing on the detail information of each feature layer. DFPN improved the feature pyramid network with a residual spatial and channel enhancement module, deeply fusing feature layers of different scales to improve detection accuracy. In addition, a sample weighted training strategy was added in the training stage so that the network focused on well-localized, high-confidence training samples. Experimental results show that on the PASCAL VOC dataset, the detection accuracy of the proposed algorithm improves from the 77.2% of SSD to 79.7% while maintaining speed. On the COCO dataset, the detection accuracy increases from the 25.6% of SSD to 30.1%, and the detection accuracy for small targets increases from the 6.8% of SSD to 13.3%.
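The sample weighted training idea can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so everything below is an assumption made for illustration: the function names `sample_weights` and `weighted_loss`, the exponential form, and the parameter `gamma` are hypothetical and are not the authors' method. The sketch only shows the general principle that samples with better localization (higher IoU) and higher confidence contribute more to the training loss.

```python
import math


def sample_weights(ious, confidences, gamma=1.0):
    """Hypothetical weighting rule (not taken from the paper).

    Each sample's raw weight grows with the product of its IoU and its
    classification confidence; weights are then rescaled to have mean 1
    so that the overall loss magnitude stays comparable.
    """
    if len(ious) != len(confidences):
        raise ValueError("ious and confidences must have the same length")
    if not ious:
        return []
    raw = [math.exp(gamma * iou * conf) for iou, conf in zip(ious, confidences)]
    mean_raw = sum(raw) / len(raw)
    return [r / mean_raw for r in raw]


def weighted_loss(losses, weights):
    """Average of per-sample losses, each scaled by its sample weight."""
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    if not losses:
        return 0.0
    return sum(l * w for l, w in zip(losses, weights)) / len(losses)


# Illustrative values only: one well-localized, confident sample and one poor one.
ious = [0.9, 0.5]
confidences = [0.8, 0.3]
weights = sample_weights(ious, confidences)
loss = weighted_loss([0.2, 1.0], weights)
```

Rescaling the weights to mean 1 is one possible design choice; it keeps the weighted loss on the same scale as the unweighted one, so the learning rate would not need retuning under this assumption.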
Received: 05 January 2022
Published: 03 January 2023
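Fund: Key Research and Development Program of Zhejiang Province (2020C03094); General Scientific Research Project of the Zhejiang Provincial Department of Education (Y202147659); Projects of the Zhejiang Provincial Department of Education (Y202250706, Y202250677); National Natural Science Foundation of China (6207050141); Basic Public Welfare Research Program of Zhejiang Province (QY19E050003)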
Corresponding Authors:
Na ZHANG
E-mail: 1419352830@qq.com;zhangna@zstu.edu.cn
Keywords: object detection, deep feature pyramid network (DFPN), spatial and channel feature enhancement (SCFE), sample weighted training, single shot multibox detector (SSD)