Small organism detection in underwater color-cast environments based on improved RT-DETR
Shaojiang DONG, Tao XIAO, Zhenming LV, Haoran XIA, Jiayuan LUO, Shizheng SUN, Xia ZHANG, Chao LIU
School of Mechatronics and Vehicle Engineering, Chongqing Jiaotong University, Chongqing 400074, China
Abstract: A detection method based on an improved RT-DETR (FES-DETR) was proposed to detect small underwater organisms quickly and accurately, and to address the poor performance of existing models in underwater color-cast environments. An efficient multi-scale attention feature extraction (Faster-Rep-EMA) module was designed to replace the original BasicBlock in the backbone network, improving both feature extraction for weak targets under color-cast interference and computational efficiency. In the neck encoding network, the entanglement Transformer block (ETB) was integrated with the attention-based intra-scale feature interaction (AIFI) module to jointly optimize frequency-domain and spatial-domain features, which enhanced the feature representation of color-cast images. A lightweight small object enhancement pyramid (SOEP) module was designed to improve small-target detection while reducing computational redundancy. Experimental results showed that FES-DETR outperformed the RT-DETR-r18 baseline: precision and recall improved by 2.6 and 2.3 percentage points, respectively; mAP@0.5 and mAP@0.5:0.95 improved by 3.2 and 2.1 percentage points, respectively; the number of parameters and the computational complexity decreased by 3.0 M and 8.5 G, respectively; and the inference speed increased to 95.7 frames per second. The model also outperformed mainstream object detection models such as the YOLO series, providing an effective technical approach for detecting small underwater organisms.
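The two accuracy metrics quoted in the abstract differ only in the intersection-over-union (IoU) threshold at which a predicted box counts as a match. The sketch below illustrates this under the usual COCO-style convention, in which mAP@0.5:0.95 averages over the thresholds 0.50, 0.55, ..., 0.95. The abstract does not spell out the evaluation protocol, so that convention is an assumption here, and the box coordinates and the helper names `iou`, `coco_thresholds`, and `thresholds_passed` are hypothetical, not taken from the paper.

```python
from typing import List, Tuple

# Axis-aligned box as (x1, y1, x2, y2) corner coordinates in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def coco_thresholds() -> List[float]:
    """IoU thresholds 0.50, 0.55, ..., 0.95 (assumed COCO-style convention)."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred: Box, truth: Box) -> List[float]:
    """Thresholds at which `pred` would still count as a match for `truth`."""
    overlap = iou(pred, truth)
    return [t for t in coco_thresholds() if overlap >= t]


if __name__ == "__main__":
    # Hypothetical 20 x 20 pixel target and a prediction shifted by 2 pixels.
    truth = (10.0, 10.0, 30.0, 30.0)
    pred = (12.0, 12.0, 32.0, 32.0)
    print(f"IoU = {iou(pred, truth):.3f}")
    print("matched at thresholds:", thresholds_passed(pred, truth))
```

For this hypothetical 20 x 20 pixel target, a 2-pixel shift already lowers the IoU to 324/476, about 0.68, so the prediction counts as a match at only four of the ten thresholds. This is one reason why mAP@0.5:0.95 is the stricter of the two metrics for small targets.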
Received: 15 June 2025
Published: 23 May 2026
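Fund: Chongqing Technology Innovation and Application Development Special Key Project (CSTB2024TIAD-KPX0081); Chongqing Natural Science Foundation Innovation and Development Joint Fund (CSTB2024NSCQ-LZX0024); Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-K202300711).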
Keywords: color-cast environment, small organism, object detection, RT-DETR, entanglement Transformer