Journal of ZheJiang University (Engineering Science)  2024, Vol. 58 Issue (10): 1992-2000    DOI: 10.3785/j.issn.1008-973X.2024.10.002
Wolfberry pest detection based on improved YOLOv5
Dingjian DU1(),Zunhai GAO1,*(),Zhuo CHEN2
1. School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan 430048, China
2. School of Management, Wuhan Polytechnic University, Wuhan 430048, China

Abstract  

A model based on improved YOLOv5m was proposed for wolfberry pest detection in complex environments. The next-generation vision transformer (Next-ViT) was adopted as the backbone to strengthen the model's feature extraction and direct more attention to key target features. A context augmentation module with adaptive fusion was added to the neck to improve the model's understanding and processing of contextual information, raising detection precision for small objects (aphids). The C3 modules in the neck were replaced with C3_Faster modules, reducing the model footprint while further improving precision. Experimental results showed that the proposed model achieved a precision of 97.0% and a recall of 92.1%, with a mean average precision (mAP50) of 94.7%, 1.9 percentage points higher than that of YOLOv5m; the average precision of aphid detection improved by 9.4 percentage points. In mAP50 comparisons, the proposed model exceeded the mainstream models YOLOv7, YOLOX, DETR, EfficientDet-D1, and Cascade R-CNN by 1.6, 1.6, 2.8, 3.5, and 1.0 percentage points, respectively. The proposed model improves detection performance while keeping the model footprint within a reasonable range.
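The adaptive fusion described above combines feature maps from parallel context branches using per-pixel weights normalized with a softmax, so that the weights at every spatial position sum to one. A minimal NumPy sketch of that weighting scheme (the branch features and weight logits below are toy inputs; in the model the logits would come from a learned 1×1 convolution):

```python
import numpy as np

def softmax(x, axis=0):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_fuse(branches, logits):
    """Fuse per-branch feature maps with per-pixel softmax weights.

    branches: list of N arrays, each of shape (C, H, W)
    logits:   array of shape (N, H, W), one weight map per branch
    """
    w = softmax(logits, axis=0)                      # weights sum to 1 per pixel
    return sum(w[i][None] * b for i, b in enumerate(branches))

# toy example: three branches with constant features 1, 2, 3;
# logits push almost all weight onto branch 0, so the fusion returns ~1
branches = [np.full((2, 4, 4), v, dtype=float) for v in (1.0, 2.0, 3.0)]
logits = np.stack([np.full((4, 4), 100.0),
                   np.zeros((4, 4)),
                   np.zeros((4, 4))])
fused = adaptive_fuse(branches, logits)
```

This contrasts with the weighted-fusion and cascade-fusion variants compared in Tab.2: here the mixing weights vary per pixel instead of being fixed scalars.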



Key words: wolfberry pest; deep learning; small object detection; YOLOv5; next generation vision transformer (Next-ViT)
Received: 27 September 2023      Published: 27 September 2024
CLC:  TP 391.4  
  S 435.112  
Fund: Hubei Provincial Social Science Foundation (21ZD072).
Corresponding Authors: Zunhai GAO     E-mail: ddj1670687939@163.com;haigao007@whpu.edu.cn
Cite this article:

Dingjian DU,Zunhai GAO,Zhuo CHEN. Wolfberry pest detection based on improved YOLOv5. Journal of ZheJiang University (Engineering Science), 2024, 58(10): 1992-2000.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2024.10.002     OR     https://www.zjujournals.com/eng/Y2024/V58/I10/1992


Fig.1 Example diagram of wolfberry pest
Category | Training set | Validation set | Test set
Inchworm | 591 | 151 | 92
Green leafhopper | 510 | 138 | 82
Leaf beetle | 518 | 130 | 84
Flea beetle | 522 | 131 | 81
Aphid | 409 | 100 | 61
Total | 2550 | 650 | 400
Tab.1 Number of images per pest category in wolfberry pest dataset
Fig.2 Network structure of YOLOv5m
Fig.3 Overall architecture diagram of next generation vision transformer
Fig.4 Structure diagram of context augmentation module
Fig.5 Location map of context augmentation module
Fig.6 Architecture diagram of adaptive fusion
Fig.7 Structure diagram of FasterNet block and partial convolution
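The FasterNet block in Fig.7 is built on partial convolution (PConv), which applies the spatial convolution to only a fraction of the channels (1/4 in FasterNet [17]) and passes the remaining channels through untouched. A back-of-envelope sketch of the resulting FLOPs saving (the map size and channel count are illustrative, not taken from the paper):

```python
def conv_flops(h, w, k, c_in, c_out):
    """Multiply-accumulate count of a k x k convolution on an h x w feature map."""
    return h * w * k * k * c_in * c_out

# illustrative numbers: 80x80 map, 3x3 kernel, 256 channels,
# with PConv convolving only 1/4 of the channels as in FasterNet
h, w, k, c = 80, 80, 3, 256
full = conv_flops(h, w, k, c, c)        # standard convolution
cp = c // 4                             # channels touched by PConv
partial = conv_flops(h, w, k, cp, cp)   # PConv on the channel slice
ratio = full // partial                 # cost falls quadratically with the slice: 16x
```

Because both input and output channel counts shrink to c/4, the cost drops by (1/4)^2, which is why C3_Faster can reduce the model footprint in Tab.2 without hurting precision.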
Fig.8 Overall structure diagram of NCF-YOLO wolfberry pest detection model
Fig.9 Comparison of heat maps of pest characteristics before and after model improvement
Model | AP/% (inchworm) | AP/% (leafhopper) | AP/% (leaf beetle) | AP/% (aphid) | AP/% (flea beetle) | P/% | R/% | mAP50/% | M/MB
YOLOv5m | 99.0 | 99.4 | 87.6 | 78.6 | 99.4 | 95.8 | 88.8 | 92.8 | 42.2
YOLOv5-P | 99.4 | 98.4 | 84.8 | 79.2 | 99.3 | 94.9 | 90.2 | 92.2 | 59.09
YOLOv5-E | 99.1 | 98.1 | 84.3 | 72.7 | 98.4 | 94.6 | 86.8 | 90.5 | 56.2
YOLOv5-N | 99.5 | 99.2 | 87.1 | 84.1 | 98.9 | 94.7 | 92.0 | 93.7 | 65.8
YOLOv5-NC (weighted fusion) | 99.5 | 99.5 | 85.2 | 85.9 | 99.2 | 96.4 | 92.4 | 93.9 | 66.8
YOLOv5-NC (adaptive fusion) | 99.4 | 99.5 | 87.3 | 87.1 | 99.3 | 97.0 | 92.1 | 94.5 | 66.8
YOLOv5-NC (cascade fusion) | 99.4 | 99.5 | 85.4 | 87.2 | 98.2 | 96.1 | 91.3 | 94.0 | 66.9
NCF-YOLO | 99.4 | 99.5 | 87.3 | 88.0 | 99.3 | 97.0 | 92.1 | 94.7 | 57.4
Tab.2 Comparison of detection performance before and after model improvement
Fig.10 Comparison of detection effect of aphids before and after model improvement
Fig.11 False detection of leaf beetle
Model | mAP50/% | M/MB
YOLOv5m | 92.8 | 42.20
YOLOv3 [25] | 90.3 | 123.50
YOLOv7 [26] | 93.1 | 74.80
YOLOX [27] | 93.1 | 130.16
DETR [28] | 91.9 | 186.20
EfficientDet-D1 [29] | 91.2 | 26.80
Cascade R-CNN [30] | 93.7 | 527.20
NCF-YOLO | 94.7 | 57.40
Tab.3 Comparison of detection performance for different models
[1]   陈磊, 刘立波, 王晓丽. 2020年宁夏枸杞虫害图文跨模态检索数据集[J]. 中国科学数据(中英文网络版), 2022, 7(3): 149−156.
CHEN Lei, LIU Libo, WANG Xiaoli. A dataset of image-text cross-modal retrieval of Lycium barbarum pests in Ningxia in 2020 [J]. China Scientific Data, 2022, 7(3): 149−156.
[2]   王云露. 基于深度迁移学习的苹果病害识别方法研究[D]. 泰安: 山东农业大学, 2022: 2−14.
WANG Yunlu. Apple disease identification method based on deep transfer learning [D]. Tai’an: Shandong Agricultural University, 2022: 2−14.
[3]   胡林龙. 基于图像处理的甘蓝型油菜的虫害程度与识别的研究[D]. 武汉: 武汉轻工大学, 2020: 10−45.
HU Linlong. Study on pest degree and recognition of brassica napus based on image processing [D]. Wuhan: Wuhan Polytechnic University, 2020: 10−45.
[4]   EBRAHIMI M A, KHOSHTAGHAZA M H, MINAEI S, et al. Vision-based pest detection based on SVM classification method [J]. Computers and Electronics in Agriculture, 2017, 137: 52−58.
doi: 10.1016/j.compag.2017.03.016
[5]   WEN C, CHEN H, MA Z, et al. Pest-YOLO: a model for large-scale multi-class dense and tiny pest detection and counting [J]. Frontiers in Plant Science, 2022, 13: 973985.
doi: 10.3389/fpls.2022.973985
[6]   BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection [EB/OL]. (2020−04−23)[2023−06−22]. https://arxiv.org/pdf/2004.10934.
[7]   王金星, 马博, 王震, 等 基于改进Mask R-CNN的苹果园害虫识别方法[J]. 农业机械学报, 2023, 54 (6): 253- 263
WANG Jinxing, MA Bo, WANG Zhen, et al Pest identification method in apple orchard based on improved Mask R-CNN[J]. Transactions of the Chinese Society of Agricultural Machinery, 2023, 54 (6): 253- 263
doi: 10.6041/j.issn.1000-1298.2023.06.026
[8]   HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN [C]// Proceedings of the IEEE International Conference on Computer Vision . Venice: IEEE, 2017: 2961−2969.
[9]   WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module [C]// Proceedings of the European Conference on Computer Vision . Munich: Springer, 2018: 3−19.
[10]   XIE S, GIRSHICK R, DOLLÁR P, et al. Aggregated residual transformations for deep neural networks [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition . Honolulu: IEEE, 2017: 1492−1500.
[11]   王卫星, 刘泽乾, 高鹏, 等 基于改进YOLO v4的荔枝病虫害检测模型[J]. 农业机械学报, 2023, 54 (5): 227- 235
WANG Weixing, LIU Zeqian, GAO Peng, et al Detection of litchi diseases and insect pests based on improved YOLO v4 model[J]. Transactions of the Chinese Society of Agricultural Machinery, 2023, 54 (5): 227- 235
doi: 10.6041/j.issn.1000-1298.2023.05.023
[12]   HAN K, WANG Y, TIAN Q, et al. GhostNet: more features from cheap operations [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition . Seattle: IEEE, 2020: 1580−1589.
[13]   苏虹. 枸杞病虫害识别方法的研究与设计[D]. 银川: 宁夏大学, 2019: 1−2.
SU Hong. Research and design of recognition algorithm for wolfberry pests and diseases [D]. Yinchuan: Ningxia University, 2019: 1−2.
[14]   REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition . Las Vegas: IEEE, 2016: 779−788.
[15]   JOCHER G. YOLOv5: minor version 6.0 [EB/OL]. (2021−10−12) [2023−06−22]. https://github.com/ultralytics/yolov5/releases/tag/v6.0.
[16]   LI J, XIA X, LI W, et al. Next-ViT: next generation vision transformer for efficient deployment in realistic industrial scenarios [EB/OL]. (2022−08−16)[2023−06−22]. https://arxiv.org/pdf/2207.05501.
[17]   CHEN J, KAO S H, HE H, et al. Run, don’t walk: chasing higher FLOPS for faster neural networks [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition . Vancouver: IEEE, 2023: 12021−12031.
[18]   沈守娟, 郑广浩, 彭译萱, 等 基于YOLOv3算法的教室学生检测与人数统计方法[J]. 软件导刊, 2020, 19 (9): 78- 83
SHEN Shoujuan, ZHENG Guanghao, PENG Yixuan, et al. Crowd detection and statistical methods based on YOLOv3 algorithm in classroom scenes [J]. Software Guide, 2020, 19(9): 78−83.
[19]   DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: Transformers for image recognition at scale [EB/OL]. (2020−06−03)[2023−06−22]. https://arxiv.org/pdf/2010.11929.
[20]   YU W, LUO M, ZHOU P, et al. MetaFormer is actually what you need for vision [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition . New Orleans: IEEE, 2022: 10819−10829.
[21]   LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition . Honolulu: IEEE, 2017: 2117−2125.
[22]   SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization [C]// Proceedings of the IEEE International Conference on Computer Vision. Venice: IEEE, 2017.
[23]   LI Y, HU J, WEN Y, et al. Rethinking vision transformers for MobileNet size and speed [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision . Paris: IEEE, 2023: 16889−16900.
[24]   邵明月, 张建华, 冯全, 等 深度学习在植物叶部病害检测与识别的研究进展[J]. 智慧农业(中英文), 2022, 4 (1): 29- 46
SHAO Mingyue, ZHANG Jianhua, FENG Quan, et al Research progress of deep learning in detection and recognition of plant leaf diseases[J]. Smart Agriculture, 2022, 4 (1): 29- 46
[25]   REDMON J, FARHADI A. YOLOv3: an incremental improvement [EB/OL]. (2018−04−08)[2023−06−22]. https://arxiv.org/pdf/1804.02767.
[26]   WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 7464−7475.
[27]   GE Z, LIU S, WANG F, et al. YOLOX: exceeding YOLO series in 2021 [EB/OL]. (2021−08−06)[2023−06−22]. https://arxiv.org/pdf/2107.08430
[28]   CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers [C]// Proceedings of European Conference on Computer Vision . Glasgow: Springer, 2020: 213−229.
[29]   TAN M, PANG R, LE Q V. EfficientDet: scalable and efficient object detection [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition . Seattle: IEEE, 2020: 10781−10790.
[30]   CAI Z, VASCONCELOS N. Cascade R-CNN: delving into high quality object detection [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 6154−6162.