Journal of ZheJiang University (Engineering Science)  2026, Vol. 60 Issue (3): 604-613    DOI: 10.3785/j.issn.1008-973X.2026.03.016
    
Lightweight improved RT-DETR algorithm for grape leaf disease detection
Hui LIU, Fangxiu WANG*, Yi WANG, Zibo HUANG, Chen SU
School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan 430048, China

Abstract  

A lightweight detector, SCGI-DETR, was proposed based on an enhanced RT-DETR to address the challenges of grape leaf disease detection: complex background interference, missed detection of small targets, and resource-constrained deployment. The efficient StarNet backbone was employed to reduce the parameter count and computational cost, enabling lightweight deployment. A feature pyramid network, CGSFR-FPN, was designed; spatial feature reconstruction was combined with multi-scale feature fusion to strengthen global context modeling and improve the localization of multi-scale lesions in cluttered scenes. The Inner-PowerIoU v2 loss was constructed, integrating global convergence acceleration with local region alignment to speed up bounding-box regression and enhance small-object detection performance. SCGI-DETR attained 91.6% precision, 89.8% recall and 93.4% mAP@0.5 on a grape leaf disease dataset, improvements of 2.6, 2.4 and 2.3 percentage points over the baseline, while reducing parameters and computation by 46.2% and 64.0%, respectively. Results demonstrate that the improved algorithm achieves a lightweight implementation while delivering superior detection performance, meeting the deployment requirements of mobile and embedded devices.
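The compression figures quoted above follow directly from the baseline and improved model sizes reported in the abstract (19.9M → 10.7M parameters, 57.0G → 20.5G FLOPs); a quick sanity check of the arithmetic:

```python
# Reproduce the reported compression ratios from the abstract's figures.
baseline_params, ours_params = 19.9e6, 10.7e6   # RT-DETR vs. SCGI-DETR parameters
baseline_flops, ours_flops = 57.0e9, 20.5e9     # RT-DETR vs. SCGI-DETR FLOPs

param_cut = (baseline_params - ours_params) / baseline_params * 100
flop_cut = (baseline_flops - ours_flops) / baseline_flops * 100

print(f"parameter reduction: {param_cut:.1f}%")  # 46.2%
print(f"FLOPs reduction:     {flop_cut:.1f}%")   # 64.0%
```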



Key words: grape leaf disease; RT-DETR; StarNet; feature pyramid; lightweight network
Received: 25 May 2025      Published: 04 February 2026
CLC:  TP 391  
Fund: Outstanding Young and Middle-aged Science and Technology Innovation Team Program of Hubei Provincial Universities (T2021009); Science and Technology Program of the Hubei Provincial Department of Education (D20211604).
Corresponding Authors: Fangxiu WANG     E-mail: 2283298757@qq.com;wfx323@whpu.edu.cn
Cite this article:

Hui LIU,Fangxiu WANG,Yi WANG,Zibo HUANG,Chen SU. Lightweight improved RT-DETR algorithm for grape leaf disease detection. Journal of ZheJiang University (Engineering Science), 2026, 60(3): 604-613.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2026.03.016     OR     https://www.zjujournals.com/eng/Y2026/V60/I3/604


Fig.1 Sample picture of grape leaf disease
Leaf class | Training set | Validation set | Test set | Total ($n$)
Black measles | 1 658 | 470 | 248 | 2 376
Black rot | 1 566 | 489 | 208 | 2 263
Leaf blight | 1 466 | 421 | 229 | 2 116
Healthy | 1 787 | 472 | 241 | 2 500
Total | 6 477 | 1 852 | 926 | 9 255
Tab.1 Number of images in grape disease dataset
Fig.2 Structure diagram of SCGI-DETR network
Fig.3 Structure diagram of StarNet network
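The core of StarNet [12] is the "star operation": an element-wise product of two linear branches, which implicitly lifts features into a high-dimensional nonlinear space at low cost. A minimal NumPy sketch of one such block, with a residual connection; the dense weights and shapes here are illustrative, not the paper's exact convolutional configuration:

```python
import numpy as np

def relu6(x):
    """Clipped ReLU used in mobile-oriented networks."""
    return np.clip(x, 0.0, 6.0)

def star_block(x, w1, w2, w_out):
    """Star operation: act(x W1) * (x W2), projected back and added residually.

    x: (n, d) feature matrix; w1, w2: (d, h) branch weights; w_out: (h, d).
    """
    branch1 = relu6(x @ w1)       # nonlinear branch
    branch2 = x @ w2              # linear branch
    fused = branch1 * branch2     # element-wise "star" product
    return x + fused @ w_out      # residual connection

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 32))
w1, w2 = rng.standard_normal((32, 64)), rng.standard_normal((32, 64))
w_out = rng.standard_normal((64, 32))
y = star_block(x, w1, w2, w_out)
print(y.shape)  # (4, 32)
```

With zero weights the block reduces to the identity, which is the usual sanity check for a residual design.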
Fig.4 Structure of RCM module
Fig.5 Structure of PCE module
Fig.6 Structure diagram of Inner-PowerIoU v2
StarNet | CGSFR-FPN | Inner-PowerIoU v2 | P/% | R/% | mAP@0.5/% | ${N_{\text{P}}}$/$10^6$ | FLOPs/$10^9$
– | – | – | 89.0 | 87.4 | 91.1 | 19.9 | 57.0
✓ | – | – | 90.3 | 88.4 | 92.2 | 12.0 | 31.8
– | ✓ | – | 90.6 | 88.9 | 92.4 | 19.2 | 48.2
– | – | ✓ | 90.3 | 88.4 | 91.8 | 19.9 | 57.0
✓ | ✓ | – | 90.7 | 89.5 | 92.7 | 10.7 | 20.5
✓ | – | ✓ | 90.9 | 89.0 | 92.7 | 12.0 | 31.8
– | ✓ | ✓ | 89.9 | 88.6 | 91.8 | 19.2 | 48.2
✓ | ✓ | ✓ | 91.6 | 89.8 | 93.4 | 10.7 | 20.5
Tab.2 Comparison result of ablation experiment
Model | Backbone | P/% | R/% | mAP@0.5/% | ${N_{\text{P}}}$/$10^6$ | $v$/(frame·s⁻¹) | FLOPs/$10^9$
RT-DETR | ResNet-18 (baseline) | 89.0 | 87.4 | 91.1 | 19.9 | 67.2 | 57.0
RT-DETR | FasterNet | 89.7 | 87.4 | 91.0 | 10.9 | 66.3 | 28.5
RT-DETR | EfficientViT | 88.9 | 87.5 | 90.8 | 10.7 | 43.5 | 27.2
RT-DETR | RepViT | 89.8 | 87.1 | 91.2 | 13.3 | 57.6 | 36.3
RT-DETR | MobileNetV4 | 89.7 | 87.8 | 91.7 | 11.3 | 70.7 | 39.5
RT-DETR | StarNet | 90.3 | 88.4 | 92.2 | 12.0 | 72.2 | 31.8
Tab.3 Comparison result of different lightweight backbone network
Loss function | P/% | R/% | mAP@0.5/%
GIoU (baseline) | 89.0 | 87.4 | 91.1
EIoU | 88.8 | 87.2 | 90.8
SIoU | 89.5 | 87.4 | 91.0
PIoU | 90.2 | 87.3 | 91.3
PIoU v2 | 90.0 | 88.0 | 91.7
Inner-IoU | 90.2 | 87.7 | 91.4
Wise-IoU | 89.6 | 87.8 | 91.5
ShapeIoU | 90.0 | 87.8 | 91.3
Inner-PowerIoU v2 | 90.3 | 88.4 | 91.8
Tab.4 Effect of different loss function on SCGI-DETR
$s$ | P/% | R/% | mAP@0.5/%
0.5 | 89.6 | 88.0 | 91.5
0.6 | 89.3 | 88.3 | 91.5
0.7 | 89.7 | 87.8 | 91.4
0.8 | 90.0 | 88.1 | 91.7
0.9 | 90.2 | 88.2 | 91.7
1.0 | 90.3 | 88.4 | 91.8
1.1 | 90.3 | 88.0 | 91.7
1.2 | 89.7 | 88.2 | 91.7
1.3 | 89.9 | 88.0 | 91.6
1.4 | 89.5 | 88.1 | 91.6
1.5 | 89.7 | 87.7 | 91.4
Tab.5 Effect of different scale factor on SCGI-DETR
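The scale factor $s$ in Tab.5 is the Inner-IoU ratio [16]: each ground-truth and predicted box is replaced by an auxiliary box with the same center but $s$-scaled width and height, and the IoU is computed on those auxiliary boxes. A minimal sketch for axis-aligned boxes in (cx, cy, w, h) form; the function name and box convention are illustrative:

```python
def inner_iou(box1, box2, s=1.0):
    """IoU of the s-scaled auxiliary boxes in the Inner-IoU formulation.

    box1, box2: (cx, cy, w, h); s shrinks (s < 1) or enlarges (s > 1)
    both boxes about their own centers before the IoU is computed.
    """
    def corners(cx, cy, w, h):
        # Auxiliary box keeps the center, scales width/height by s.
        return cx - w * s / 2, cy - h * s / 2, cx + w * s / 2, cy + h * s / 2

    x1a, y1a, x2a, y2a = corners(*box1)
    x1b, y1b, x2b, y2b = corners(*box2)

    # Intersection of the two auxiliary boxes.
    iw = max(0.0, min(x2a, x2b) - max(x1a, x1b))
    ih = max(0.0, min(y2a, y2b) - max(y1a, y1b))
    inter = iw * ih
    union = (x2a - x1a) * (y2a - y1a) + (x2b - x1b) * (y2b - y1b) - inter
    return inter / union if union > 0 else 0.0

print(inner_iou((0, 0, 2, 2), (1, 0, 2, 2), s=1.0))  # ≈ 0.333
```

With $s<1$ the auxiliary boxes overlap less, so gradients concentrate on high-IoU pairs; with $s>1$ even disjoint boxes can overlap, helping low-IoU (e.g. small-target) samples converge, which is consistent with the sweep in Tab.5 peaking near $s=1.0$.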
Model | P/% | R/% | mAP@0.5/% | ${N_{\text{P}}}$/$10^6$ | $v$/(frame·s⁻¹) | FLOPs/$10^9$
SSD (2016) | 80.7 | 76.7 | 83.1 | 26.2 | 32.5 | 62.6
Faster R-CNN (2016) | 81.6 | 75.3 | 82.8 | 41.3 | 12.9 | 212.8
EfficientDet (2020) | 83.7 | 78.6 | 84.5 | 33.4 | 13.4 | 260.7
YOLOv7 [29] (2022) | 89.8 | 88.7 | 92.6 | 36.4 | 82.8 | 103.2
YOLOv8s (2023) | 91.0 | 89.5 | 93.0 | 11.1 | 122.3 | 28.4
YOLOv9s [30] (2024) | 89.4 | 88.1 | 92.1 | 9.6 | 98.7 | 38.7
YOLOv10s [31] (2024) | 90.6 | 87.7 | 92.5 | 8.1 | 130.2 | 24.5
YOLOv11s [32] (2024) | 91.2 | 88.6 | 93.1 | 9.4 | 124.2 | 21.3
Deformable DETR (2020) | 89.1 | 88.0 | 91.0 | 39.9 | 10.8 | 179.6
DINO (2022) | 89.4 | 87.9 | 92.3 | 46.7 | 7.3 | 279.2
MS-DETR (2023) | 89.6 | 88.5 | 92.3 | 53.5 | 30.59 | 117.1
RT-DETR (2023) | 89.0 | 87.4 | 91.1 | 19.9 | 67.2 | 57.0
SCGI-DETR (ours) | 91.6 | 89.8 | 93.4 | 10.7 | 66.7 | 20.5
Tab.6 Comparison experimental result of different models
Fig.7 Comparison of small target detection effect
Fig.8 Comparison of detection effect under complex background
Fig.9 Visualization result of SCGI-DETR model’s ability to focus on key area of blade
[1]   YUE Xishen. A study on grape leaf disease identification method based on improved YOLOv5s [D]. Alar: Tarim University, 2024.
[2]   REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
doi: 10.1109/TPAMI.2016.2577031
[3]   LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector [C]//14th European Conference on Computer Vision. Cham: Springer, 2016: 21–37.
[4]   REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 779–788.
[5]   JIANG Sheng, CAO Yapeng, LIU Ziyi, et al. Recognition of tea leaf disease based on improved Faster RCNN [J]. Journal of Huazhong Agricultural University, 2024, 43(5): 41-50.
[6]   PAN P, SHAO M, HE P, et al. Lightweight cotton diseases real-time detection model for resource-constrained devices in natural environments [J]. Frontiers in Plant Science, 2024, 15: 1383863.
doi: 10.3389/fpls.2024.1383863
[7]   VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [J]. Advances in Neural Information Processing Systems, 2017, 30: 5998-6008.
[8]   ZHAO Y, LV W, XU S, et al. DETRs beat YOLOs on real-time object detection [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2024: 16965–16974.
[9]   HUANGFU Y, HUANG Z, YANG X, et al. HHS-RT-DETR: a method for the detection of citrus greening disease [J]. Agronomy, 2024, 14(12): 2900.
doi: 10.3390/agronomy14122900
[10]   YANG H, DENG X, SHEN H, et al. Disease detection and identification of rice leaf based on improved detection transformer [J]. Agriculture, 2023, 13(7): 1361.
doi: 10.3390/agriculture13071361
[11]   FU Z, YIN L, CUI C, et al. A lightweight MHDI-DETR model for detecting grape leaf diseases [J]. Frontiers in Plant Science, 2024, 15: 1499911.
doi: 10.3389/fpls.2024.1499911
[12]   MA X, DAI X, BAI Y, et al. Rewrite the stars [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2024: 5694–5703.
[13]   REZATOFIGHI H, TSOI N, GWAK J, et al. Generalized intersection over union: a metric and a loss for bounding box regression [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 658–666.
[14]   NI Z, CHEN X, ZHAI Y, et al. Context-guided spatial feature reconstruction for efficient semantic segmentation [C]//European Conference on Computer Vision. Cham: Springer, 2024: 239–255.
[15]   LIU C, WANG K, LI Q, et al. Powerful-IoU: more straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism [J]. Neural Networks, 2024, 170: 276-284.
doi: 10.1016/j.neunet.2023.11.041
[16]   ZHANG H, XU C, ZHANG S. Inner-IoU: more effective intersection over union loss with auxiliary bounding box [EB/OL]. (2023-11-06)[2025-03-16]. https://arxiv.org/pdf/2311.02877.
[17]   CHEN J, KAO S H, HE H, et al. Run, don’t walk: chasing higher FLOPS for faster neural networks [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 12021–12031.
[18]   LIU X, PENG H, ZHENG N, et al. EfficientViT: memory efficient vision transformer with cascaded group attention [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 14420–14430.
[19]   WANG A, CHEN H, LIN Z, et al. RepViT: revisiting mobile CNN from ViT perspective [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2024: 15909–15920.
[20]   QIN D, LEICHNER C, DELAKIS M, et al. MobileNetV4: universal models for the mobile ecosystem [C]// European Conference on Computer Vision. Cham: Springer, 2024: 78–96.
[21]   ZHANG Y F, REN W, ZHANG Z, et al. Focal and efficient IOU loss for accurate bounding box regression [J]. Neurocomputing, 2022, 506: 146-157.
doi: 10.1016/j.neucom.2022.07.042
[22]   GEVORGYAN Z. SIoU loss: more powerful learning for bounding box regression [EB/OL]. (2022-05-25)[2025-03-16]. https://arxiv.org/abs/2205.12740.
[23]   TONG Z, CHEN Y, XU Z, et al. Wise-IoU: bounding box regression loss with dynamic focusing mechanism [EB/OL]. (2023-01-24)[2025-03-16]. https://arxiv.org/abs/2301.10051.
[24]   ZHANG H, ZHANG S. Shape-IoU: more accurate metric considering bounding box shape and scale [EB/OL]. (2023-12-29)[2025-03-16]. https://arxiv.org/abs/2312.17663.
[25]   TAN M, PANG R, LE Q V. EfficientDet: scalable and efficient object detection [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 10778–10787.
[26]   ZHU X, SU W, LU L, et al. Deformable DETR: deformable transformers for end-to-end object detection [EB/OL]. (2020-10-08)[2025-03-16]. https://arxiv.org/abs/2010.04159.
[27]   ZHANG H, LI F, LIU S, et al. DINO: DETR with improved denoising anchor boxes for end-to-end object detection [EB/OL]. (2022-03-07)[2025-03-16]. https://arxiv.org/abs/2203.03605.
[28]   ZHAO C, SUN Y, WANG W, et al. MS-DETR: efficient DETR training with mixed supervision [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2024: 17027–17036.
[29]   WANG C Y, BOCHKOVSKIY A, LIAO H M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 7464–7475.
[30]   WANG C Y, YEH I H, MARK LIAO H Y. YOLOv9: learning what you want to learn using programmable gradient information [C]//European Conference on Computer Vision. Cham: Springer, 2024: 1–21.
[31]   WANG A, CHEN H, LIU L, et al. YOLOv10: real-time end-to-end object detection [J]. Advances in Neural Information Processing Systems, 2024, 37: 107984-108011.
[32]   KHANAM R, HUSSAIN M. YOLOv11: an overview of the key architectural enhancements [EB/OL]. (2024-10-23)[2025-03-16]. https://arxiv.org/abs/2410.17725.