Small target detection algorithm in complex background
Pu ZHENG1, Hong-yang BAI1,*, Wei LI2, Hong-wei GUO1
1. School of Energy and Power Engineering, Nanjing University of Science and Technology, Nanjing 210094, China; 2. 96037 PLA Troops, Baoji 721000, China
An improved single-shot multibox detector (SSD) algorithm is proposed for detecting small targets in complex backgrounds. Following the feature pyramid network (FPN) approach, the features of the Conv4-3 layer were merged with those of the Conv7 and Conv3-3 layers, and the number of default boxes at each location of the merged feature map was increased. A squeeze-and-excitation network (SENet) module was added to the network structure to weight the feature channels of each layer, enhancing useful features and suppressing uninformative ones. A series of augmentations was applied to the training data to improve the generalization performance of the network. Experimental results show that the improved algorithm reaches a mean average precision (mAP) of 80.4% on the VOC (07+12) dataset, which is 2.7% higher than that of the original algorithm, and an mAP of 42.5% on the COCO (2017) dataset, which is 2.3% higher than that of the original algorithm. The proposed algorithm can accurately detect targets as small as 16×16 pixels.
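As a rough illustration of the two structural changes described in the abstract, the following NumPy sketch first brings three feature maps to a common resolution and concatenates them, then reweights the fused channels with a squeeze-and-excitation step. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `fuse_features` and `se_reweight`, the toy tensor sizes, the random weights, and the reduction ratio are all hypothetical, and the use of max pooling and nearest-neighbour upsampling for resizing is an assumption, since the abstract does not say how the Conv3-3 and Conv7 features are resized before merging.

```python
import numpy as np


def fuse_features(shallow, middle, deep):
    """Fuse three feature maps at the resolution of `middle`.

    shallow: (Cs, 2H, 2W), middle: (Cm, H, W), deep: (Cd, H/2, W/2).
    The resizing operators below are assumptions made for illustration.
    """
    cs, hh, ww = shallow.shape
    # Assumption: 2x2 max pooling brings the shallow map down to (H, W).
    pooled = shallow.reshape(cs, hh // 2, 2, ww // 2, 2).max(axis=(2, 4))
    # Assumption: nearest-neighbour upsampling brings the deep map up to (H, W).
    upsampled = deep.repeat(2, axis=1).repeat(2, axis=2)
    # Concatenate along the channel axis.
    return np.concatenate([pooled, middle, upsampled], axis=0)


def se_reweight(features, fc1, fc2):
    """Squeeze-and-excitation channel reweighting of a (C, H, W) feature map.

    fc1: (C // r, C) and fc2: (C, C // r) are hypothetical fully connected weights.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = features.mean(axis=(1, 2))
    # Excitation: FC -> ReLU -> FC -> sigmoid gives one weight in (0, 1) per channel.
    hidden = np.maximum(fc1 @ z, 0.0)
    weights = 1.0 / (1.0 + np.exp(-(fc2 @ hidden)))
    # Scale: multiply every channel by its weight.
    return features * weights[:, None, None], weights


rng = np.random.default_rng(0)
# Toy stand-ins for the Conv3-3, Conv4-3 and Conv7 feature maps (sizes are hypothetical).
shallow = rng.standard_normal((2, 8, 8))
middle = rng.standard_normal((4, 4, 4))
deep = rng.standard_normal((6, 2, 2))

fused = fuse_features(shallow, middle, deep)
channels, ratio = fused.shape[0], 4
fc1 = rng.standard_normal((channels // ratio, channels)) * 0.1
fc2 = rng.standard_normal((channels, channels // ratio)) * 0.1
weighted, weights = se_reweight(fused, fc1, fc2)
print(fused.shape, weights.shape)
```

In a trained network the two fully connected layers would be learned together with the detector, so that informative channels receive weights close to 1 and uninformative ones are suppressed; here the weights are random and only the data flow is meaningful.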
Pu ZHENG, Hong-yang BAI, Wei LI, Hong-wei GUO. Small target detection algorithm in complex background. Journal of Zhejiang University (Engineering Science), 2020, 54(9): 1777-1784.
Fig. 1 Diagram of single-shot multibox detector (SSD) network model
Fig. 2 Comparison of feature map outputs of different layers in SSD network
Fig. 3 Feature heat maps of different channels in SSD network
Fig. 4 Schematic diagram of feature fusion
Fig. 5 Diagram of squeeze-and-excitation network (SENet) structure
Fig. 6 Improved SSD network structure
Fig. 7 Comparison of precision-recall curves of three algorithms on different datasets and different categories (bottle, person, car, bird)
Fig. 8 Comparison of detection results of F_SE_SSD and SSD algorithms on VOC dataset
Fig. 9 Performance of improved algorithm (F_SE_SSD) under complex background
Fig. 10 Performance of improved algorithm (F_SE_SSD) in detecting small targets