Lightweight improved RT-DETR algorithm for grape leaf disease detection

Hui LIU, Fangxiu WANG*, Yi WANG, Zibo HUANG, Chen SU

School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan 430048, China
Abstract: A lightweight detector, SCGI-DETR, was proposed based on an improved RT-DETR to address three challenges in grape leaf disease detection: complex background interference, missed detection of small targets, and resource-constrained deployment. The efficient StarNet architecture was adopted as the feature-extraction backbone to reduce the parameter count and computational cost, enabling lightweight deployment. A CGSFR-FPN feature pyramid was designed, combining spatial feature reconstruction with multi-scale feature fusion to strengthen global context modeling and improve the localization of multi-scale lesions in cluttered scenes. An Inner-PowerIoU v2 loss was constructed, integrating global convergence acceleration with local region alignment to speed up bounding-box regression and enhance small-object detection. On a grape leaf disease dataset, SCGI-DETR attained 91.6% precision, 89.8% recall and 93.4% mAP@0.5, improvements of 2.6, 2.4 and 2.3 percentage points over the baseline, while reducing parameters and computation by 46.2% and 64%, respectively. The results demonstrate that the improved algorithm achieves a lightweight design while delivering superior detection performance, meeting the deployment requirements of mobile and embedded devices.
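The core idea behind the StarNet backbone cited above is the "star operation": two linear projections of the same feature map are fused by element-wise multiplication, which implicitly maps the input into a high-dimensional nonlinear space at roughly the cost of one linear layer. A minimal NumPy sketch of that operation (the layer widths and plain-matrix form are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def star_block(x, w1, b1, w2, b2):
    """Sketch of a StarNet-style 'star operation': two linear branches
    of the same input combined by element-wise multiplication."""
    a = x @ w1 + b1  # branch 1: linear projection
    b = x @ w2 + b2  # branch 2: linear projection
    return a * b     # element-wise product (the "star")

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))          # 4 tokens, 8 channels
w1 = rng.standard_normal((8, 16)); b1 = np.zeros(16)
w2 = rng.standard_normal((8, 16)); b2 = np.zeros(16)
y = star_block(x, w1, b1, w2, b2)
print(y.shape)  # (4, 16)
```

In the full StarNet the projections are depthwise-separable convolutions inside a residual block; the sketch only shows why the fused feature is nonlinear in the input despite both branches being linear.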
Received: 25 May 2025
Published: 04 February 2026
Fund: Outstanding Young and Middle-aged Science and Technology Innovation Team Program of Hubei Provincial Universities (T2021009); Science and Technology Program of the Hubei Provincial Department of Education (D20211604).
Corresponding author: Fangxiu WANG
E-mail: 2283298757@qq.com; wfx323@whpu.edu.cn
Keywords:
grape leaf disease,
RT-DETR,
StarNet,
feature pyramid,
lightweight network
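The Inner-PowerIoU v2 loss named in the abstract combines two published ideas: Inner-IoU's auxiliary boxes, shrunken around the box centers to sharpen the gradient for small targets, and Powerful-IoU's edge-distance penalty with a non-monotonic focusing factor. A hedged NumPy sketch of one way to combine them (the shrink ratio, the focusing factor and the exact weighting are assumptions; SCGI-DETR's formulation may differ):

```python
import numpy as np

def inner_power_iou(pred, target, ratio=0.8, lam=1.3):
    """Sketch of an Inner-PowerIoU-style loss; boxes are [x1, y1, x2, y2].
    `ratio` shrinks auxiliary boxes around the centers (Inner-IoU idea);
    the edge-distance penalty follows the spirit of Powerful-IoU."""
    def shrink(b):
        cx, cy = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
        w, h = (b[2] - b[0]) * ratio, (b[3] - b[1]) * ratio
        return np.array([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])

    p, t = shrink(pred), shrink(target)
    # IoU of the auxiliary (shrunken) boxes
    ix1, iy1 = max(p[0], t[0]), max(p[1], t[1])
    ix2, iy2 = min(p[2], t[2]), min(p[3], t[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (p[2] - p[0]) * (p[3] - p[1])
    area_t = (t[2] - t[0]) * (t[3] - t[1])
    iou = inter / (area_p + area_t - inter + 1e-9)

    # edge-distance penalty, normalized by the target box size
    wt, ht = target[2] - target[0], target[3] - target[1]
    P = (abs(pred[0] - target[0]) / wt + abs(pred[2] - target[2]) / wt
         + abs(pred[1] - target[1]) / ht + abs(pred[3] - target[3]) / ht) / 4
    base = 1 - iou + (1 - np.exp(-P ** 2))
    # non-monotonic focusing factor: down-weights both very good and
    # very poor boxes so medium-quality boxes dominate the gradient
    q = np.exp(-P)
    return 3 * (lam * q) * np.exp(-(lam * q) ** 2) * base

box = np.array([10.0, 10.0, 50.0, 50.0])
print(inner_power_iou(box, box))            # identical boxes: (near-)zero loss
print(inner_power_iou(box + 5.0, box))      # shifted box: positive loss
```

Because the auxiliary boxes are smaller than the originals, a small misalignment already lowers their IoU noticeably, which is what makes this family of losses attractive for the small-lesion case discussed in the abstract.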
References

[1] YUE Xishen. A study on grape leaf disease identification method based on improved YOLOv5s [D]. Alar: Tarim University, 2024. (in Chinese)
[2] REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. doi: 10.1109/TPAMI.2016.2577031
[3] LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector [C]//14th European Conference on Computer Vision. Cham: Springer, 2016: 21-37.
[4] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 779-788.
[5] JIANG Sheng, CAO Yapeng, LIU Ziyi, et al. Recognition of tea leaf disease based on improved Faster RCNN [J]. Journal of Huazhong Agricultural University, 2024, 43(5): 41-50. (in Chinese)
[6] PAN P, SHAO M, HE P, et al. Lightweight cotton diseases real-time detection model for resource-constrained devices in natural environments [J]. Frontiers in Plant Science, 2024, 15: 1383863. doi: 10.3389/fpls.2024.1383863
[7] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [J]. Advances in Neural Information Processing Systems, 2017, 30: 5998-6008.
[8] ZHAO Y, LV W, XU S, et al. DETRs beat YOLOs on real-time object detection [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2024: 16965-16974.
[9] HUANGFU Y, HUANG Z, YANG X, et al. HHS-RT-DETR: a method for the detection of citrus greening disease [J]. Agronomy, 2024, 14(12): 2900. doi: 10.3390/agronomy14122900
[10] YANG H, DENG X, SHEN H, et al. Disease detection and identification of rice leaf based on improved detection transformer [J]. Agriculture, 2023, 13(7): 1361. doi: 10.3390/agriculture13071361
[11] FU Z, YIN L, CUI C, et al. A lightweight MHDI-DETR model for detecting grape leaf diseases [J]. Frontiers in Plant Science, 2024, 15: 1499911. doi: 10.3389/fpls.2024.1499911
[12] MA X, DAI X, BAI Y, et al. Rewrite the stars [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2024: 5694-5703.
[13] REZATOFIGHI H, TSOI N, GWAK J, et al. Generalized intersection over union: a metric and a loss for bounding box regression [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 658-666.
[14] NI Z, CHEN X, ZHAI Y, et al. Context-guided spatial feature reconstruction for efficient semantic segmentation [C]//European Conference on Computer Vision. Cham: Springer, 2024: 239-255.
[15] LIU C, WANG K, LI Q, et al. Powerful-IoU: more straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism [J]. Neural Networks, 2024, 170: 276-284. doi: 10.1016/j.neunet.2023.11.041
[16] ZHANG H, XU C, ZHANG S. Inner-IoU: more effective intersection over union loss with auxiliary bounding box [EB/OL]. (2023-11-06)[2025-03-16]. https://arxiv.org/pdf/2311.02877.
[17] CHEN J, KAO S H, HE H, et al. Run, don't walk: chasing higher FLOPS for faster neural networks [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 12021-12031.
[18] LIU X, PENG H, ZHENG N, et al. EfficientViT: memory efficient vision transformer with cascaded group attention [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 14420-14430.
[19] WANG A, CHEN H, LIN Z, et al. RepViT: revisiting mobile CNN from ViT perspective [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2024: 15909-15920.
[20] QIN D, LEICHNER C, DELAKIS M, et al. MobileNetV4: universal models for the mobile ecosystem [C]//European Conference on Computer Vision. Cham: Springer, 2024: 78-96.
[21] ZHANG Y F, REN W, ZHANG Z, et al. Focal and efficient IOU loss for accurate bounding box regression [J]. Neurocomputing, 2022, 506: 146-157. doi: 10.1016/j.neucom.2022.07.042
[22] GEVORGYAN Z. SIoU loss: more powerful learning for bounding box regression [EB/OL]. (2022-05-25)[2025-03-16]. https://arxiv.org/abs/2205.12740.
[23] TONG Z, CHEN Y, XU Z, et al. Wise-IoU: bounding box regression loss with dynamic focusing mechanism [EB/OL]. (2023-01-24)[2025-03-16]. https://arxiv.org/abs/2301.10051.
[24] ZHANG H, ZHANG S. Shape-IoU: more accurate metric considering bounding box shape and scale [EB/OL]. (2023-12-29)[2025-03-16]. https://arxiv.org/abs/2312.17663.
[25] TAN M, PANG R, LE Q V. EfficientDet: scalable and efficient object detection [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 10778-10787.
[26] ZHU X, SU W, LU L, et al. Deformable DETR: deformable transformers for end-to-end object detection [EB/OL]. (2020-10-08)[2025-03-16]. https://arxiv.org/abs/2010.04159.
[27] ZHANG H, LI F, LIU S, et al. DINO: DETR with improved denoising anchor boxes for end-to-end object detection [EB/OL]. (2022-03-07)[2025-03-16]. https://arxiv.org/abs/2203.03605.
[28] ZHAO C, SUN Y, WANG W, et al. MS-DETR: efficient DETR training with mixed supervision [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2024: 17027-17036.
[29] WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 7464-7475.
[30] WANG C Y, YEH I H, LIAO H Y M. YOLOv9: learning what you want to learn using programmable gradient information [C]//European Conference on Computer Vision. Cham: Springer, 2024: 1-21.
[31] WANG A, CHEN H, LIU L, et al. YOLOv10: real-time end-to-end object detection [J]. Advances in Neural Information Processing Systems, 2024, 37: 107984-108011.
[32] KHANAM R, HUSSAIN M. YOLOv11: an overview of the key architectural enhancements [EB/OL]. (2024-10-23)[2025-03-16]. https://arxiv.org/abs/2410.17725.