Small target detection algorithm in complex background

Journal of ZheJiang University (Engineering Science), 2020, Vol. 54, Issue 9: 1777-1784

DOI: 10.3785/j.issn.1008-973X.2020.09.014

Pu ZHENG1, Hong-yang BAI1,*, Wei LI2, Hong-wei GUO1

1. School of Energy and Power Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
2. 96037 PLA Troops, Baoji 721000, China

Abstract

An improved single-shot-multibox-detector (SSD) algorithm is proposed for detecting small targets in complex backgrounds. Following the idea of feature pyramid networks (FPN), the features of the Conv4-3 layer are fused with those of the Conv7 and Conv3-3 layers, and the number of default boxes at each location of the fused feature map is increased. A squeeze-and-excitation network (SENet) module is added to the network structure to weight the feature channels of each layer, strengthening useful features and suppressing uninformative ones. A series of augmentations is applied to the training data to improve the generalization of the network. Experimental results show that, on the VOC (07+12) dataset, the improved algorithm reaches a mean average precision (mAP) of 80.4%, which is 2.7% higher than that of the original algorithm; on the COCO (2017) dataset, its mAP is 42.5%, which is 2.3% higher than that of the original algorithm. The proposed algorithm can accurately detect targets no smaller than 16×16 pixels.
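The two structural ideas named in the abstract, multi-layer feature fusion and SENet channel weighting, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not state how the layers are resized or merged, so 2×2 max pooling, nearest-neighbour upsampling, and channel concatenation are assumptions made here, and the function names (`fuse`, `se_reweight`), the bias-free fully connected layers, and the hand-picked weights are purely illustrative. Feature maps are nested lists indexed as [channel][row][column].

```python
import math

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a [C][H][W] feature map."""
    out = []
    for ch in fmap:
        rows = []
        for row in ch:
            wide = [v for v in row for _ in range(2)]  # repeat each column
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row
        out.append(rows)
    return out

def maxpool2x(fmap):
    """2x2 max pooling with stride 2; H and W are assumed to be even."""
    return [
        [
            [max(ch[i][j], ch[i][j + 1], ch[i + 1][j], ch[i + 1][j + 1])
             for j in range(0, len(ch[0]), 2)]
            for i in range(0, len(ch), 2)
        ]
        for ch in fmap
    ]

def fuse(shallow, middle, deep):
    """Resize shallow and deep maps to the middle resolution, then
    concatenate along the channel axis (assumed fusion operator)."""
    return maxpool2x(shallow) + middle + upsample2x(deep)

def se_reweight(fmap, w1, w2):
    """Squeeze-and-excitation: scale each channel by a gate in (0, 1)."""
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]
    # Excitation: bottleneck FC + ReLU, then FC + sigmoid (biases omitted).
    hidden = [max(0.0, sum(w * x for w, x in zip(row, z))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch]
              for ch, g in zip(fmap, gates)]
    return scaled, gates

# Toy stand-ins for Conv3-3 (4x4), Conv4-3 (2x2) and Conv7 (1x1), one channel each.
shallow = [[[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0],
            [9.0, 10.0, 11.0, 12.0], [13.0, 14.0, 15.0, 16.0]]]
middle = [[[1.0, 1.0], [1.0, 1.0]]]
deep = [[[2.0]]]

fused = fuse(shallow, middle, deep)   # 3 channels at 2x2 resolution
w1 = [[0.1, 0.0, 0.0]]                # 3 channels -> 1 hidden unit
w2 = [[1.0], [0.0], [-1.0]]           # 1 hidden unit -> 3 channel gates
scaled, gates = se_reweight(fused, w1, w2)
print(len(fused), [round(g, 3) for g in gates])
```

In a trained network the gate weights would be learned, so channels carrying useful information for small targets would receive gates near 1 and uninformative channels gates near 0.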

Key words: deep learning; target detection; single-shot-multibox-detector (SSD) algorithm; feature fusion; feature enhancement

Received: 28 August 2019 Published: 22 September 2020

CLC: TP 391

Corresponding author: Hong-yang BAI. E-mail: 117108022106@njust.edu.cn; hongyang@njust.edu.cn

Cite this article:

Pu ZHENG, Hong-yang BAI, Wei LI, Hong-wei GUO. Small target detection algorithm in complex background. Journal of ZheJiang University (Engineering Science), 2020, 54(9): 1777-1784.

URL:

http://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2020.09.014 OR http://www.zjujournals.com/eng/Y2020/V54/I9/1777

Fig.1 Diagram of single-shot-multibox-detector (SSD) network model

Fig.2 Comparison of feature map outputs of different layers in SSD network

Fig.3 Feature heat maps of different channels in SSD network

Fig.4 Schematic diagram of feature fusion

Fig.5 Diagram of squeeze-and-excitation network (SENet) structure

Fig.6 Improved SSD network structure

Fig.7 Comparison of precision-recall curves of three algorithms on different datasets and different categories (bottle, person, car, bird)

Fig.8 Comparison of detection results of F_SE_SSD and SSD algorithms on VOC dataset

Fig.9 Performance of improved algorithm (F_SE_SSD) under complex background

Fig.10 Performance of improved algorithm (F_SE_SSD) in detecting small targets

