Journal of Zhejiang University (Engineering Science)  2021, Vol. 55 Issue (2): 377-385    DOI: 10.3785/j.issn.1008-973X.2021.02.018
Computer and Control Engineering
Detection of small fruit target based on improved DenseNet
Li-feng XU, Hai-fan HUANG, Wei-long DING, Yu-lei FAN
College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China
Abstract:

To address the generally low detection accuracy for small fruit targets in natural environments, an improved fruit detection framework based on DenseNet was proposed. A multi-scale feature extraction module was built around DenseNet: a feature pyramid structure was established across the dense blocks at different levels of DenseNet to strengthen feature reuse among network layers. The high resolution of low-level features and the strong semantics of high-level features were combined to localize small fruits accurately and predict their presence. The soft non-maximum suppression (Soft-NMS) algorithm was introduced to reduce the cases in which detection boxes are mistakenly removed in clustered fruit structures. On the apple, mango and almond datasets, the proposed framework achieved an average detection speed above 40 FPS and F1 scores of 0.920, 0.928 and 0.831, respectively, improving both detection efficiency and accuracy over the commonly used Faster R-CNN network.

Key words: DenseNet    deep learning    small fruit target detection    feature pyramid network (FPN)    soft non-maximum suppression (Soft-NMS)
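Received: 2020-09-02    Published: 2021-03-09
CLC:  TP 399
Funding: National Natural Science Foundation of China (61571400, 61702456); Natural Science Foundation of Zhejiang Province (LY18C130012)
About the author: Li-feng XU (born 1983), male, lecturer, Ph.D., engaged in research on virtual plant modeling and image processing. orcid.org/0000-0003-4957-7559. E-mail: lfxu@zjut.edu.cn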

Cite this article:

Li-feng XU, Hai-fan HUANG, Wei-long DING, Yu-lei FAN. Detection of small fruit target based on improved DenseNet. Journal of Zhejiang University (Engineering Science), 2021, 55(2): 377-385.

Link to this article:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2021.02.018
http://www.zjujournals.com/eng/CN/Y2021/V55/I2/377

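Fig. 1  DenseNet network structure
Fig. 2  Feature pyramid structure
Fig. 3  Overlapping detection boxes in fruit images
Fig. 4  Structure of the improved network
Fig. 5  Improved Dense module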
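| Dense stage | Layer | Channels | Parameters (kernel/stride) | Output size/pixels |
|---|---|---|---|---|
| Stage 1 | Conv | 16 | 3×3/1 | 512×512 |
| | Pool | 16 | 2×2/2 | 256×256 |
| | Conv | 32 | 3×3/1 | 256×256 |
| | Pool | 32 | 2×2/2 | 128×128 |
| | Dense Block | 32 | 1×1/1 Conv | 128×128 |
| | | 16 | 3×3/1 Conv | 128×128 |
| Stage 2 | Conv | 64 | 3×3/1 | 128×128 |
| | Pool | 64 | 2×2/2 | 64×64 |
| | Dense Block | 64 | 1×1/1 Conv | 64×64 |
| | | 32 | 3×3/1 Conv | 64×64 |
| Stage 3 | Conv | 128 | 3×3/1 | 64×64 |
| | Pool | 128 | 2×2/2 | 32×32 |
| | Dense Block | 128 | 1×1/1 Conv | 32×32 |
| | | 64 | 3×3/1 Conv | 32×32 |
| Stage 4 | Conv | 256 | 3×3/1 | 32×32 |
| | Pool | 256 | 2×2/2 | 16×16 |
| | Dense Block | 256 | 1×1/1 Conv | 16×16 |
| | | 128 | 3×3/1 Conv | 16×16 |
| | Conv | 512 | 3×3/1 | 16×16 |
| | Pool | 512 | 2×2/1 | 16×16 |
| | Conv | 1024 | 3×3/1 | 16×16 |
| | Conv | 30 | 1×1/1 | 16×16 |

Table 1  Parameters of the object detection network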
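
The output sizes listed in Table 1 can be checked from the strides alone: each stride-2 pooling layer halves the feature map, and stride-1 layers leave it unchanged. The sketch below replays that arithmetic for a 512×512 input. It assumes "same"-style padding, which the table implies (a stride-1 layer keeps its size) but does not state; `LAYERS` and `trace_sizes` are illustrative names, not taken from the paper.

```python
# (layer name, channels, stride) for each row of Table 1, in order.
LAYERS = [
    # Stage 1
    ("Conv", 16, 1), ("Pool", 16, 2), ("Conv", 32, 1), ("Pool", 32, 2),
    ("Dense 1x1 Conv", 32, 1), ("Dense 3x3 Conv", 16, 1),
    # Stage 2
    ("Conv", 64, 1), ("Pool", 64, 2),
    ("Dense 1x1 Conv", 64, 1), ("Dense 3x3 Conv", 32, 1),
    # Stage 3
    ("Conv", 128, 1), ("Pool", 128, 2),
    ("Dense 1x1 Conv", 128, 1), ("Dense 3x3 Conv", 64, 1),
    # Stage 4
    ("Conv", 256, 1), ("Pool", 256, 2),
    ("Dense 1x1 Conv", 256, 1), ("Dense 3x3 Conv", 128, 1),
    # Remaining rows of the table
    ("Conv", 512, 1), ("Pool", 512, 1), ("Conv", 1024, 1), ("Conv", 30, 1),
]


def trace_sizes(input_size, layers):
    """Return the spatial size after each layer, assuming 'same' padding."""
    sizes = []
    size = input_size
    for _name, _channels, stride in layers:
        size = -(-size // stride)  # ceiling division: ceil(input / stride)
        sizes.append(size)
    return sizes


sizes = trace_sizes(512, LAYERS)
print(sizes[-1])  # 16
```

Four stride-2 pooling layers reduce 512 to 16 in total, which matches the final 16×16 output with 30 channels in the table.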
Fig. 6  Structure of the object detection network
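| Fruit | Resolution/pixels | Training set size | Validation set size/test set size |
|---|---|---|---|
| Apple | 200×308 | 729 | 112/112 |
| Mango | 500×500 | 1154 | 270/270 |
| Almond | 300×300 | 385 | 100/100 |

Table 2  Parameters of the fruit datasets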
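| Fruit | P/% (Faster R-CNN) | P/% (this study) | R/% (this study) | F1 (this study) |
|---|---|---|---|---|
| Apple | 90.3 | 93.19 | 91.14 | 0.920 |
| Mango | 90.8 | 93.59 | 92.03 | 0.928 |
| Almond | 77.5 | 84.31 | 81.95 | 0.831 |

Table 3  Comparison between the proposed method and Faster R-CNN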
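
The F1 column in Table 3 is the harmonic mean of precision P and recall R. As a quick check, the sketch below recomputes it from the P and R values in the second and third data rows of the table; `f1_score` is an illustrative helper name, and the division by 100 only converts percentages to the 0-1 scale used for F1.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# P and R in percent, copied from the second and third data rows of Table 3.
second_row_f1 = f1_score(93.59, 92.03) / 100
third_row_f1 = f1_score(84.31, 81.95) / 100
print(round(second_row_f1, 3), round(third_row_f1, 3))  # 0.928 0.831
```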
Fig. 7  Visualization of fruit detection results
Fig. 8  Detection in occluded scenes of the apple and mango datasets
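| Fruit | O/% (this study) | O/% (Ref. [8]) | C/% (this study) | C/% (Ref. [8]) |
|---|---|---|---|---|
| Apple | 90.37 | 86.48 | 92.46 | 90.75 |
| Mango | 88.14 | 85.91 | 90.83 | 89.10 |

Table 4  Comparison of detection ability between the proposed method and the method of Ref. [8] in occluded scenes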
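| Fruit | F1, VGG16 (Ref. [8]) | F1, VGG16 (this study) | F1, ResNet |
|---|---|---|---|
| Apple | 0.904 | 0.879 | 0.887 |
| Mango | 0.908 | 0.880 | 0.892 |
| Almond | 0.775 | 0.739 | 0.748 |

Table 5  Reproduction experiments on the original datasets and the resulting F1 scores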
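| Configuration | P/% | R/% | F1 |
|---|---|---|---|
| Faster R-CNN | - | - | 0.775 |
| Without FPN structure | 79.77 | 78.40 | 0.791 |
| With FPN structure | 83.53 | 81.94 | 0.828 |

Table 6  Comparison of detection results with and without the FPN structure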
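
The FPN structure compared in Table 6 fuses a high-resolution low-level feature map with an upsampled high-level one. The sketch below is a generic, single-channel illustration of that top-down merge, not the authors' network: it leaves out the lateral 1×1 convolutions, and the helper names `upsample2x` and `merge` as well as the sample values are made up for the example.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of one channel given as a list of rows."""
    out = []
    for row in fmap:
        wide = [value for value in row for _ in range(2)]  # repeat each column
        out.append(wide)
        out.append(list(wide))  # repeat each row
    return out


def merge(low, high):
    """Element-wise sum of a low-level map and the upsampled high-level map."""
    up = upsample2x(high)
    return [[a + b for a, b in zip(row_low, row_up)]
            for row_low, row_up in zip(low, up)]


high = [[1, 2],
        [3, 4]]                            # coarse 2x2 map (high-level features)
low = [[10, 10, 10, 10] for _ in range(4)]  # fine 4x4 map (low-level features)
merged = merge(low, high)
print(merged[0], merged[3])  # [11, 11, 12, 12] [13, 13, 14, 14]
```

The merged map keeps the 4×4 resolution of the low-level input while every cell also carries the value of the coarse cell that covers it.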
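| Fruit | P/% (NMS) | P/% (Soft-NMS) |
|---|---|---|
| Apple | 92.96 | 93.17 |
| Mango | 93.18 | 93.35 |
| Almond | 84.07 | 84.42 |

Table 7  Comparison of detection precision between NMS and Soft-NMS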
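
To illustrate the difference behind Table 7: standard NMS discards any box that overlaps a higher-scoring box beyond an IoU threshold, whereas Soft-NMS only lowers its score, so a neighbouring fruit in a cluster can survive. The following is a minimal sketch of the Gaussian variant described in Ref. [22], not the authors' implementation. The values `sigma=0.5`, `score_threshold=0.001` and `iou_threshold=0.5` are assumptions chosen for the example, and the boxes are invented.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def hard_nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep a box only if it overlaps no kept box too strongly."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS: returns (index, final score) pairs in selection order."""
    remaining = list(enumerate(scores))
    kept = []
    while remaining:
        best = max(range(len(remaining)), key=lambda k: remaining[k][1])
        idx, score = remaining.pop(best)
        kept.append((idx, score))
        decayed = []
        for j, s in remaining:
            # Decay the score by overlap with the selected box instead of deleting it.
            s *= math.exp(-(iou(boxes[idx], boxes[j]) ** 2) / sigma)
            if s >= score_threshold:
                decayed.append((j, s))
        remaining = decayed
    return kept


# Boxes 0 and 1 overlap heavily (two adjacent fruits); box 2 is isolated.
boxes = [(0, 0, 10, 10), (3, 0, 13, 10), (30, 30, 40, 40)]
scores = [0.9, 0.8, 0.7]
print(hard_nms(boxes, scores))                 # [0, 2]
print([i for i, _ in soft_nms(boxes, scores)])  # [0, 2, 1]
```

In this example box 1 has an IoU of about 0.54 with box 0, so hard NMS drops it, while Soft-NMS keeps it with a reduced score of roughly 0.45.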
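| Fruit | Ru/% (Experiment 1) | Ru/% (Experiment 2) | Ru/% (Experiment 3) | Ru/% (Experiment 4) | Ru/% (Experiment 5) |
|---|---|---|---|---|---|
| Apple | 0.18 | 0.09 | 0.2 | 0.14 | 0.14 |
| Mango | 0.37 | 0.32 | 0.41 | 0.34 | 0.35 |
| Almond | 0.63 | 0.55 | 0.65 | 0.61 | 0.60 |

Table 8  Improvement rate of Soft-NMS under different parameters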
1 NUSKE S, ACHAR S, BATES T, et al. Yield estimation in vineyards by visual grape detection [C]// 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems. San Francisco: IEEE, 2011: 2352-2358.
2 PAYNE A B, WALSH K B, SUBEDI P P, et al. Estimation of mango crop yield using image analysis: segmentation method [J]. Computers and Electronics in Agriculture, 2013, 91: 57-64. doi: 10.1016/j.compag.2012.11.009
3 SA I, GE Z, DAYOUB F, et al. Deep fruits: a fruit detection system using deep neural networks [J]. Sensors, 2016, 16(8): 1222. doi: 10.3390/s16081222
4 GIRSHICK R, DONAHUE J, DARRELL T, et al. Region-based convolutional networks for accurate object detection and segmentation [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 38(1): 142-158.
5 EVERINGHAM M, VAN GOOL L, WILLIAMS C K I, et al. The pascal visual object classes (VOC) challenge [J]. International Journal of Computer Vision, 2010, 88(2): 303-338. doi: 10.1007/s11263-009-0275-4
6 UIJLINGS J R R, VAN DE SANDE K E A, GEVERS T, et al. Selective search for object recognition [J]. International Journal of Computer Vision, 2013, 104(2): 154-171. doi: 10.1007/s11263-013-0620-5
7 REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
8 BARGOTI S, UNDERWOOD J. Deep fruit detection in orchards [C]// 2017 IEEE International Conference on Robotics and Automation. Singapore: IEEE, 2017: 3626-3633.
9 REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 779-788.
10 SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition [J/OL]. [2020-08-16]. https://arxiv.org/abs/1409.1556.
11 HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 770-778.
12 ZENG Ping-ping, LI Lin-sheng. Classification and recognition of common fruit images based on convolutional neural network [J]. Machine Design and Research, 2019, 35(1): 23-26.
13 XUE Yue-ju, HUANG Ning, TU Shu-qin, et al. Immature mango detection based on improved YOLOv2 [J]. Transactions of the Chinese Society of Agricultural Engineering, 2018, 34(7): 173-179. doi: 10.11975/j.issn.1002-6819.2018.07.022
14 REDMON J, FARHADI A. YOLO9000: better, faster, stronger [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 7263-7271.
15 WAN S, GOUDOS S. Faster R-CNN for multi-class fruit detection using a robotic vision system [J]. Computer Networks, 2020, 168: 107036. doi: 10.1016/j.comnet.2019.107036
16 MAI X, ZHANG H, JIA X, et al. Faster R-CNN with classifier fusion for automatic detection of small fruits [J]. IEEE Transactions on Automation Science and Engineering, 2020, PP(99): 1-15.
17 SRIVASTAVA R K, GREFF K, SCHMIDHUBER J. Training very deep networks [C]// Advances in Neural Information Processing Systems. Montreal: Curran Associates, 2015: 2377-2385.
18 LARSSON G, MAIRE M, SHAKHNAROVICH G. FractalNet: ultra-deep neural networks without residuals [J/OL]. [2020-08-16]. https://arxiv.org/abs/1605.07648.
19 HUANG G, LIU Z, VAN DER MAATEN L, et al. Densely connected convolutional networks [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 4700-4708.
20 LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 2117-2125.
21 NEUBECK A, VAN GOOL L. Efficient non-maximum suppression [C]// 18th International Conference on Pattern Recognition. Hong Kong: IEEE, 2006, 3: 850-855.
22 BODLA N, SINGH B, CHELLAPPA R, et al. Soft-NMS: improving object detection with one line of code [C]// Proceedings of the IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 5561-5569.
23 MATHIAS M, BENENSON R, TIMOFTE R, et al. Handling occlusions with franken-classifiers [C]// Proceedings of the IEEE International Conference on Computer Vision. Sydney: IEEE, 2013: 1505-1512.
24 NING C, MENGLU L, HAO Y, et al. Survey of pedestrian detection with occlusion [J/OL]. Complex and Intelligent Systems. https://doi.org/10.1007/s40747-020-00206-8