Journal of Zhejiang University (Engineering Science)  2024, Vol. 58, Issue (10): 1992-2000    DOI: 10.3785/j.issn.1008-973X.2024.10.002
Computer and Control Engineering
Wolfberry pest detection based on improved YOLOv5
Dingjian DU1, Zunhai GAO1,*, Zhuo CHEN2
1. School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan 430048, China
2. School of Management, Wuhan Polytechnic University, Wuhan 430048, China
Abstract:

A model based on improved YOLOv5m was proposed for wolfberry pest detection in complex environments. The next-generation vision transformer (Next-ViT) was used as the backbone network to strengthen the model's feature extraction ability and focus it on key target features. A context enhancement module with adaptive fusion was added to the neck to improve the model's ability to understand and exploit contextual information, raising the detection precision for small objects (aphids). The C3 module in the neck network was replaced with the C3_Faster module to reduce the model footprint and further improve detection precision. Experimental results showed that the proposed model achieved a precision of 97.0% and a recall of 92.1%. Its mean average precision (mAP50) was 94.7%, 1.9 percentage points higher than that of YOLOv5m, and the average precision of aphid detection was improved by 9.4 percentage points. Compared on mAP50, the proposed model was 1.6, 1.6, 2.8, 3.5, and 1.0 percentage points higher than the mainstream models YOLOv7, YOLOX, DETR, EfficientDet-D1, and Cascade R-CNN, respectively. The proposed model improves detection performance while keeping the model footprint within a reasonable range.

Key words: wolfberry pest    deep learning    small object detection    YOLOv5    next generation vision transformer (Next-ViT)
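The abstract mentions adaptive fusion in the context enhancement module without spelling out a formula on this page. A minimal pure-Python sketch of one common form — a softmax-weighted sum of same-shape feature branches, where the raw weights would be learned during training — is given below; the function names and the three-branch toy data are illustrative assumptions, not the paper's implementation.

```python
import math

def softmax(ws):
    # Normalize raw fusion weights into a convex combination.
    exps = [math.exp(w) for w in ws]
    s = sum(exps)
    return [e / s for e in exps]

def adaptive_fuse(branches, raw_weights):
    """Fuse same-shape feature maps (flattened to lists) by a
    softmax-weighted sum; raw_weights are learnable scalars."""
    if len(branches) != len(raw_weights):
        raise ValueError("one weight per branch")
    w = softmax(raw_weights)
    fused = [0.0] * len(branches[0])
    for wi, feat in zip(w, branches):
        for i, v in enumerate(feat):
            fused[i] += wi * v
    return fused

# Three hypothetical context branches for one spatial location.
b1, b2, b3 = [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]
fused = adaptive_fuse([b1, b2, b3], [0.0, 0.0, 0.0])
# Equal raw weights reduce to a plain average, approx. [3.0, 4.0].
```

With unequal learned weights, the fusion emphasizes whichever branch carries the most useful context, which is the intuition behind the adaptive variant outperforming fixed weighted fusion in Table 2.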
Received: 2023-09-27 Published: 2024-09-27
CLC: TP 391.4
Foundation item: Supported by the Hubei Provincial Social Science Foundation (21ZD072).
Corresponding author: Zunhai GAO     E-mail: ddj1670687939@163.com; haigao007@whpu.edu.cn
About the author: Dingjian DU (2000—), male, master's student, researching agricultural engineering applications of object detection and deep learning. orcid.org/0009-0006-4470-5020. E-mail: ddj1670687939@163.com

Cite this article:


Dingjian DU, Zunhai GAO, Zhuo CHEN. Wolfberry pest detection based on improved YOLOv5. Journal of Zhejiang University (Engineering Science), 2024, 58(10): 1992-2000.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2024.10.002        https://www.zjujournals.com/eng/CN/Y2024/V58/I10/1992

Fig. 1  Examples of wolfberry pests
Category                  n: Train    Val   Test
Inchworm                       591    151     92
Large green leafhopper         510    138     82
Leaf beetle                    518    130     84
Hairy flea beetle              522    131     81
Aphid                          409    100     61
Total                        2 550    650    400
Table 1  Image categories of the wolfberry pest dataset
Fig. 2  Network structure of YOLOv5m
Fig. 3  Overall architecture of Next-ViT
Fig. 4  Structure of the context enhancement module
Fig. 5  Position of the context enhancement module
Fig. 6  Structure of adaptive fusion
Fig. 7  Structure of the FasterNet module and partial convolution
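The FasterNet module in Fig. 7 is built on partial convolution (PConv) [17], which convolves only a fraction r of the input channels and passes the rest through untouched, so its multiply-accumulate cost scales with r². The back-of-envelope count below uses illustrative sizes, not the paper's actual configuration.

```python
def conv_flops(h, w, c_in, c_out, k):
    # Multiply-accumulate count of a standard k x k convolution
    # over an h x w feature map.
    return h * w * c_in * c_out * k * k

def pconv_flops(h, w, c, k, ratio=0.25):
    # Partial convolution: only c * ratio channels are convolved
    # (to the same c * ratio channels); the rest pass through.
    cp = int(c * ratio)
    return conv_flops(h, w, cp, cp, k)

# Hypothetical 40x40 feature map with 128 channels, 3x3 kernel.
full = conv_flops(40, 40, 128, 128, 3)
part = pconv_flops(40, 40, 128, 3, ratio=0.25)
print(part / full)  # 0.0625, i.e. r^2 = (1/4)^2 = 1/16 of the cost
```

This cost reduction is consistent with the smaller M/MB of NCF-YOLO versus the YOLOv5-NC variants in Table 2, since C3_Faster swaps part of the neck's standard convolutions for PConv-based blocks.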
Fig. 8  Overall structure of the NCF-YOLO wolfberry pest detection model
Fig. 9  Pest feature heatmaps before and after model improvement
Model                          AP/%                                               P/%   R/%   mAP50/%  M/MB
                               inchworm  leafhopper  leaf beetle  aphid  flea beetle
YOLOv5m                        99.0      99.4        87.6         78.6   99.4          95.8  88.8  92.8     42.2
YOLOv5-P                       99.4      98.4        84.8         79.2   99.3          94.9  90.2  92.2     59.09
YOLOv5-E                       99.1      98.1        84.3         72.7   98.4          94.6  86.8  90.5     56.2
YOLOv5-N                       99.5      99.2        87.1         84.1   98.9          94.7  92.0  93.7     65.8
YOLOv5-NC (weighted fusion)    99.5      99.5        85.2         85.9   99.2          96.4  92.4  93.9     66.8
YOLOv5-NC (adaptive fusion)    99.4      99.5        87.3         87.1   99.3          97.0  92.1  94.5     66.8
YOLOv5-NC (cascade fusion)     99.4      99.5        85.4         87.2   98.2          96.1  91.3  94.0     66.9
NCF-YOLO                       99.4      99.5        87.3         88.0   99.3          97.0  92.1  94.7     57.4
Table 2  Detection performance before and after model improvement
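The P and R columns in Table 2 are the standard detection precision and recall, counted at an IoU threshold of 0.5 (hence mAP50). A minimal sketch with toy counts — not the paper's data — makes the definitions concrete.

```python
def precision_recall(tp, fp, fn):
    # Precision: fraction of predicted boxes that are correct.
    # Recall: fraction of ground-truth boxes that are found.
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return p, r

# Toy counts of true positives, false positives, false negatives
# at IoU >= 0.5 for one class.
p, r = precision_recall(tp=97, fp=3, fn=8)
print(round(p, 3), round(r, 3))  # 0.97 0.924
```

AP then summarizes the precision-recall trade-off over all confidence thresholds for one class, and mAP50 averages AP over the five pest classes.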
Fig. 10  Aphid detection results before and after model improvement
Fig. 11  False detections of the leaf beetle
Model                  mAP50/%   M/MB
YOLOv5m                   92.8    42.20
YOLOv3[25]                90.3   123.50
YOLOv7[26]                93.1    74.80
YOLOX[27]                 93.1   130.16
DETR[28]                  91.9   186.20
EfficientDet-D1[29]       91.2    26.80
Cascade R-CNN[30]         93.7   527.20
NCF-YOLO                  94.7    57.40
Table 3  Detection performance of different models
1 CHEN Lei, LIU Libo, WANG Xiaoli. A dataset of image-text cross-modal retrieval of Lycium barbarum pests in Ningxia in 2020 [J]. China Scientific Data, 2022, 7(3): 149−156.
2 WANG Yunlu. Apple disease identification method based on deep transfer learning [D]. Tai'an: Shandong Agricultural University, 2022: 2−14.
3 HU Linlong. Study on pest degree and recognition of Brassica napus based on image processing [D]. Wuhan: Wuhan Polytechnic University, 2020: 10−45.
4 EBRAHIMI M A, KHOSHTAGHAZA M H, MINAEI S, et al. Vision-based pest detection based on SVM classification method [J]. Computers and Electronics in Agriculture, 2017, 137: 52−58. doi: 10.1016/j.compag.2017.03.016
5 WEN C, CHEN H, MA Z, et al. Pest-YOLO: a model for large-scale multi-class dense and tiny pest detection and counting [J]. Frontiers in Plant Science, 2022, 13: 973985. doi: 10.3389/fpls.2022.973985
6 BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection [EB/OL]. (2020−04−23) [2023−06−22]. https://arxiv.org/pdf/2004.10934.
7 WANG Jinxing, MA Bo, WANG Zhen, et al. Pest identification method in apple orchard based on improved Mask R-CNN [J]. Transactions of the Chinese Society of Agricultural Machinery, 2023, 54(6): 253−263. doi: 10.6041/j.issn.1000-1298.2023.06.026
8 HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN [C]// Proceedings of the IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 2961−2969.
9 WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module [C]// Proceedings of the European Conference on Computer Vision. Munich: Springer, 2018: 3−19.
10 XIE S, GIRSHICK R, DOLLÁR P, et al. Aggregated residual transformations for deep neural networks [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 1492−1500.
11 WANG Weixing, LIU Zeqian, GAO Peng, et al. Detection of litchi diseases and insect pests based on improved YOLO v4 model [J]. Transactions of the Chinese Society of Agricultural Machinery, 2023, 54(5): 227−235. doi: 10.6041/j.issn.1000-1298.2023.05.023
12 HAN K, WANG Y, TIAN Q, et al. GhostNet: more features from cheap operations [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 1580−1589.
13 SU Hong. Research and design of recognition algorithm for wolfberry pests and diseases [D]. Yinchuan: Ningxia University, 2019: 1−2.
14 REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 779−788.
15 JOCHER G. YOLOv5: minor version 6.0 [EB/OL]. (2021−10−12) [2023−06−22]. https://github.com/ultralytics/yolov5/releases/tag/v6.0.
16 LI J, XIA X, LI W, et al. Next-ViT: next generation vision transformer for efficient deployment in realistic industrial scenarios [EB/OL]. (2022−08−16) [2023−06−22]. https://arxiv.org/pdf/2207.05501.
17 CHEN J, KAO S H, HE H, et al. Run, don't walk: chasing higher FLOPS for faster neural networks [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 12021−12031.
18 SHEN Shoujuan, ZHENG Guanghao, PENG Yixuan, et al. Crowd detection and statistical methods based on YOLOv3 algorithm in classroom scenes [J]. Software Guide, 2020, 19(9): 78−83.
19 DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: Transformers for image recognition at scale [EB/OL]. (2020−06−03) [2023−06−22]. https://arxiv.org/pdf/2010.11929.
20 YU W, LUO M, ZHOU P, et al. MetaFormer is actually what you need for vision [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 10819−10829.
21 LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 2117−2125.
22 SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization [C]// Proceedings of the IEEE International Conference on Computer Vision. Venice: IEEE, 2017.
23 LI Y, HU J, WEN Y, et al. Rethinking vision transformers for MobileNet size and speed [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Paris: IEEE, 2023: 16889−16900.
24 SHAO Mingyue, ZHANG Jianhua, FENG Quan, et al. Research progress of deep learning in detection and recognition of plant leaf diseases [J]. Smart Agriculture, 2022, 4(1): 29−46.
25 REDMON J, FARHADI A. YOLOv3: an incremental improvement [EB/OL]. (2018−04−08) [2023−06−22]. https://arxiv.org/pdf/1804.02767.
26 WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 7464−7475.
27 GE Z, LIU S, WANG F, et al. YOLOX: exceeding YOLO series in 2021 [EB/OL]. (2021−08−06) [2023−06−22]. https://arxiv.org/pdf/2107.08430.
28 CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers [C]// Proceedings of the European Conference on Computer Vision. Glasgow: Springer, 2020: 213−229.
29 TAN M, PANG R, LE Q V. EfficientDet: scalable and efficient object detection [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 10781−10790.