Journal of ZheJiang University (Engineering Science)  2024, Vol. 58 Issue (10): 1992-2000    DOI: 10.3785/j.issn.1008-973X.2024.10.002
Wolfberry pest detection based on improved YOLOv5
Dingjian DU1(),Zunhai GAO1,*(),Zhuo CHEN2
1. School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan 430048, China
2. School of Management, Wuhan Polytechnic University, Wuhan 430048, China

Abstract  

A model based on improved YOLOv5m was proposed for wolfberry pest detection in complex environments. The next-generation vision transformer (Next-ViT) was adopted as the backbone to strengthen the model's feature extraction and direct more attention to key target features. A context augmentation module with adaptive fusion was added to the neck to improve the model's understanding and processing of contextual information, raising detection precision for small objects (aphids). The C3 modules in the neck were replaced with C3_Faster modules, reducing the model footprint while further improving precision. Experimental results showed that the proposed model achieved a precision of 97.0% and a recall of 92.1%, with a mean average precision (mAP50) of 94.7%, 1.9 percentage points higher than that of YOLOv5m; the average precision of aphid detection improved by 9.4 percentage points. In mAP50 comparisons, the proposed model exceeded the mainstream models YOLOv7, YOLOX, DETR, EfficientDet-D1, and Cascade R-CNN by 1.6, 1.6, 2.8, 3.5, and 1.0 percentage points, respectively. The proposed model improves detection performance while keeping the model footprint within a reasonable range.
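The adaptive fusion described above combines feature maps from parallel context branches using per-pixel weights normalized with a softmax, so that the weights at every spatial position sum to one. A minimal NumPy sketch of that weighting scheme (the branch features and weight logits below are toy inputs; in the model the logits would come from a learned 1×1 convolution):

```python
import numpy as np

def softmax(x, axis=0):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_fuse(branches, logits):
    """Fuse per-branch feature maps with per-pixel softmax weights.

    branches: list of N arrays, each of shape (C, H, W)
    logits:   array of shape (N, H, W), one weight map per branch
    """
    w = softmax(logits, axis=0)                      # weights sum to 1 per pixel
    return sum(w[i][None] * b for i, b in enumerate(branches))

# toy example: three branches with constant features 1, 2, 3;
# logits push almost all weight onto branch 0, so the fusion returns ~1
branches = [np.full((2, 4, 4), v, dtype=float) for v in (1.0, 2.0, 3.0)]
logits = np.stack([np.full((4, 4), 100.0),
                   np.zeros((4, 4)),
                   np.zeros((4, 4))])
fused = adaptive_fuse(branches, logits)
```

This contrasts with the weighted-fusion and cascade-fusion variants compared in Tab.2: here the mixing weights vary per pixel instead of being fixed scalars.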



Key words: wolfberry pest; deep learning; small object detection; YOLOv5; next generation vision transformer (Next-ViT)
Received: 27 September 2023      Published: 27 September 2024
CLC:  TP 391.4  
  S 435.112  
Fund: Hubei Provincial Social Science Foundation (21ZD072).
Corresponding Authors: Zunhai GAO     E-mail: ddj1670687939@163.com;haigao007@whpu.edu.cn
Cite this article:

Dingjian DU,Zunhai GAO,Zhuo CHEN. Wolfberry pest detection based on improved YOLOv5. Journal of ZheJiang University (Engineering Science), 2024, 58(10): 1992-2000.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2024.10.002     OR     https://www.zjujournals.com/eng/Y2024/V58/I10/1992


Fig.1 Example diagram of wolfberry pest
Category | Training set | Validation set | Test set
Inchworm | 591 | 151 | 92
Green leafhopper | 510 | 138 | 82
Leaf beetle | 518 | 130 | 84
Flea beetle | 522 | 131 | 81
Aphid | 409 | 100 | 61
Total | 2550 | 650 | 400
Tab.1 Number of images per pest category in wolfberry pest dataset
Fig.2 Network structure of YOLOv5m
Fig.3 Overall architecture diagram of next generation vision transformer
Fig.4 Structure diagram of context augmentation module
Fig.5 Location map of context augmentation module
Fig.6 Architecture diagram of adaptive fusion
Fig.7 Structure diagram of FasterNet block and partial convolution
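The FasterNet block in Fig.7 is built on partial convolution (PConv), which applies the spatial convolution to only a fraction of the channels (1/4 in FasterNet [17]) and passes the remaining channels through untouched. A back-of-envelope sketch of the resulting FLOPs saving (the map size and channel count are illustrative, not taken from the paper):

```python
def conv_flops(h, w, k, c_in, c_out):
    """Multiply-accumulate count of a k x k convolution on an h x w feature map."""
    return h * w * k * k * c_in * c_out

# illustrative numbers: 80x80 map, 3x3 kernel, 256 channels,
# with PConv convolving only 1/4 of the channels as in FasterNet
h, w, k, c = 80, 80, 3, 256
full = conv_flops(h, w, k, c, c)        # standard convolution
cp = c // 4                             # channels touched by PConv
partial = conv_flops(h, w, k, cp, cp)   # PConv on the channel slice
ratio = full // partial                 # cost falls quadratically with the slice: 16x
```

Because both input and output channel counts shrink to c/4, the cost drops by (1/4)^2, which is why C3_Faster can reduce the model footprint in Tab.2 without hurting precision.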
Fig.8 Overall structure diagram of NCF-YOLO wolfberry pest detection model
Fig.9 Comparison of heat maps of pest characteristics before and after model improvement
Model | AP/% (inchworm) | AP/% (leafhopper) | AP/% (leaf beetle) | AP/% (aphid) | AP/% (flea beetle) | P/% | R/% | mAP50/% | M/MB
YOLOv5m | 99.0 | 99.4 | 87.6 | 78.6 | 99.4 | 95.8 | 88.8 | 92.8 | 42.2
YOLOv5-P | 99.4 | 98.4 | 84.8 | 79.2 | 99.3 | 94.9 | 90.2 | 92.2 | 59.09
YOLOv5-E | 99.1 | 98.1 | 84.3 | 72.7 | 98.4 | 94.6 | 86.8 | 90.5 | 56.2
YOLOv5-N | 99.5 | 99.2 | 87.1 | 84.1 | 98.9 | 94.7 | 92.0 | 93.7 | 65.8
YOLOv5-NC (weighted fusion) | 99.5 | 99.5 | 85.2 | 85.9 | 99.2 | 96.4 | 92.4 | 93.9 | 66.8
YOLOv5-NC (adaptive fusion) | 99.4 | 99.5 | 87.3 | 87.1 | 99.3 | 97.0 | 92.1 | 94.5 | 66.8
YOLOv5-NC (cascade fusion) | 99.4 | 99.5 | 85.4 | 87.2 | 98.2 | 96.1 | 91.3 | 94.0 | 66.9
NCF-YOLO | 99.4 | 99.5 | 87.3 | 88.0 | 99.3 | 97.0 | 92.1 | 94.7 | 57.4
Tab.2 Comparison of detection performance before and after model improvement
Fig.10 Comparison of detection effect of aphids before and after model improvement
Fig.11 False detection of leaf beetle
Model | mAP50/% | M/MB
YOLOv5m | 92.8 | 42.20
YOLOv3 [25] | 90.3 | 123.50
YOLOv7 [26] | 93.1 | 74.80
YOLOX [27] | 93.1 | 130.16
DETR [28] | 91.9 | 186.20
EfficientDet-D1 [29] | 91.2 | 26.80
Cascade R-CNN [30] | 93.7 | 527.20
NCF-YOLO | 94.7 | 57.40
Tab.3 Comparison of detection performance for different models
[1]   陈磊, 刘立波, 王晓丽. 2020年宁夏枸杞虫害图文跨模态检索数据集[J]. 中国科学数据(中英文网络版), 2022, 7(3): 149−156.
CHEN Lei, LIU Libo, WANG Xiaoli. A dataset of image-text cross-modal retrieval of Lycium barbarum pests in Ningxia in 2020 [J]. China Scientific Data, 2022, 7(3): 149−156.
[2]   王云露. 基于深度迁移学习的苹果病害识别方法研究[D]. 泰安: 山东农业大学, 2022: 2−14.
WANG Yunlu. Apple disease identification method based on deep transfer learning [D]. Tai’an: Shandong Agricultural University, 2022: 2−14.
[3]   胡林龙. 基于图像处理的甘蓝型油菜的虫害程度与识别的研究[D]. 武汉: 武汉轻工大学, 2020: 10−45.
HU Linlong. Study on pest degree and recognition of brassica napus based on image processing [D]. Wuhan: Wuhan Polytechnic University, 2020: 10−45.
[4]   EBRAHIMI M A, KHOSHTAGHAZA M H, MINAEI S, et al. Vision-based pest detection based on SVM classification method [J]. Computers and Electronics in Agriculture, 2017, 137: 52−58.
doi: 10.1016/j.compag.2017.03.016
[5]   WEN C, CHEN H, MA Z, et al. Pest-YOLO: a model for large-scale multi-class dense and tiny pest detection and counting [J]. Frontiers in Plant Science, 2022, 13: 973985.
doi: 10.3389/fpls.2022.973985
[6]   BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection [EB/OL]. (2020−04−23)[2023−06−22]. https://arxiv.org/pdf/2004.10934.
[7]   王金星, 马博, 王震, 等 基于改进Mask R-CNN的苹果园害虫识别方法[J]. 农业机械学报, 2023, 54 (6): 253- 263
WANG Jinxing, MA Bo, WANG Zhen, et al Pest identification method in apple orchard based on improved Mask R-CNN[J]. Transactions of the Chinese Society of Agricultural Machinery, 2023, 54 (6): 253- 263
doi: 10.6041/j.issn.1000-1298.2023.06.026
[8]   HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN [C]// Proceedings of the IEEE International Conference on Computer Vision . Venice: IEEE, 2017: 2961−2969.
[9]   WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module [C]// Proceedings of the European Conference on Computer Vision . Munich: Springer, 2018: 3−19.
[10]   XIE S, GIRSHICK R, DOLLÁR P, et al. Aggregated residual transformations for deep neural networks [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition . Honolulu: IEEE, 2017: 1492−1500.
[11]   王卫星, 刘泽乾, 高鹏, 等 基于改进YOLO v4的荔枝病虫害检测模型[J]. 农业机械学报, 2023, 54 (5): 227- 235
WANG Weixing, LIU Zeqian, GAO Peng, et al Detection of litchi diseases and insect pests based on improved YOLO v4 model[J]. Transactions of the Chinese Society of Agricultural Machinery, 2023, 54 (5): 227- 235
doi: 10.6041/j.issn.1000-1298.2023.05.023
[12]   HAN K, WANG Y, TIAN Q, et al. GhostNet: more features from cheap operations [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition . Seattle: IEEE, 2020: 1580−1589.
[13]   苏虹. 枸杞病虫害识别方法的研究与设计[D]. 银川: 宁夏大学, 2019: 1−2.
SU Hong. Research and design of recognition algorithm for wolfberry pests and diseases [D]. Yinchuan: Ningxia University, 2019: 1−2.
[14]   REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition . Las Vegas: IEEE, 2016: 779−788.
[15]   JOCHER G. YOLOv5: minor version 6.0 [EB/OL]. (2021−10−12) [2023−06−22]. https://github.com/ultralytics/yolov5/releases/tag/v6.0.
[16]   LI J, XIA X, LI W, et al. Next-ViT: next generation vision transformer for efficient deployment in realistic industrial scenarios [EB/OL]. (2022−08−16)[2023−06−22]. https://arxiv.org/pdf/2207.05501.
[17]   CHEN J, KAO S H, HE H, et al. Run, don’t walk: chasing higher FLOPS for faster neural networks [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition . Vancouver: IEEE, 2023: 12021−12031.
[18]   沈守娟, 郑广浩, 彭译萱, 等 基于YOLOv3算法的教室学生检测与人数统计方法[J]. 软件导刊, 2020, 19 (9): 78- 83
SHEN Shoujuan, ZHENG Guanghao, PENG Yixuan, et al. Crowd detection and statistical methods based on YOLOv3 algorithm in classroom scenes [J]. Software Guide, 2020, 19(9): 78−83.
[19]   DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: Transformers for image recognition at scale [EB/OL]. (2020−06−03)[2023−06−22]. https://arxiv.org/pdf/2010.11929.
[20]   YU W, LUO M, ZHOU P, et al. MetaFormer is actually what you need for vision [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition . New Orleans: IEEE, 2022: 10819−10829.
[21]   LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition . Honolulu: IEEE, 2017: 2117−2125.
[22]   SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization [C]// Proceedings of the IEEE International Conference on Computer Vision. Venice: IEEE, 2017.
[23]   LI Y, HU J, WEN Y, et al. Rethinking vision transformers for MobileNet size and speed [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision . Paris: IEEE, 2023: 16889−16900.
[24]   邵明月, 张建华, 冯全, 等 深度学习在植物叶部病害检测与识别的研究进展[J]. 智慧农业(中英文), 2022, 4 (1): 29- 46
SHAO Mingyue, ZHANG Jianhua, FENG Quan, et al Research progress of deep learning in detection and recognition of plant leaf diseases[J]. Smart Agriculture, 2022, 4 (1): 29- 46
[25]   REDMON J, FARHADI A. YOLOv3: an incremental improvement [EB/OL]. (2018−04−08)[2023−06−22]. https://arxiv.org/pdf/1804.02767.
[26]   WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 7464−7475.
[27]   GE Z, LIU S, WANG F, et al. YOLOX: exceeding YOLO series in 2021 [EB/OL]. (2021−08−06)[2023−06−22]. https://arxiv.org/pdf/2107.08430
[28]   CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers [C]// Proceedings of European Conference on Computer Vision . Glasgow: Springer, 2020: 213−229.
[29]   TAN M, PANG R, LE Q V. EfficientDet: scalable and efficient object detection [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition . Seattle: IEEE, 2020: 10781−10790.
[30]   CAI Z, VASCONCELOS N. Cascade R-CNN: delving into high quality object detection [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 6154−6162.