Journal of Zhejiang University (Engineering Science)  2026, Vol. 60 Issue (4): 763-771    DOI: 10.3785/j.issn.1008-973X.2026.04.008
Computer Technology
Small object detection algorithm for optical remote sensing images based on fusion attention mechanism
Yaolian SONG1, Chi PENG1, Jingmin TANG1,*, Xuanzhi ZHAO1, Guicai YU2
1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
2. School of Physics and Electronic Information Engineering, Qinghai Minzu University, Xining 810007, China
Abstract:

To address limited feature extraction, foreground-background confusion, and frequent missed and false detections of small objects in optical remote sensing images, FMCM-YOLO, a small object detection algorithm based on feature enhancement and a fused attention mechanism, was proposed. First, a four-head detection model was designed by adding a small object detection layer to handle the numerous small objects in optical remote sensing images. Second, a feature enhancement module was introduced in the backbone network; its multi-branch convolutional structure uses dilated convolutions of different sizes to improve feature extraction. Third, channel and spatial attention mechanisms were fused in the neck network, and a residual structure was added to focus on small objects and make targets easier to separate from the background. Finally, MPDIoU was adopted as the loss function to speed up convergence and improve small object detection. Experimental results showed that the proposed algorithm reached an mAP50 of 89.9% and 60.6% on the public USOD and AI-TOD datasets, respectively, which is 2.8 and 5.9 percentage points higher than the baseline YOLOv5m. On AI-TOD, the mean average precision for very tiny, tiny, and small objects increased by 2.1, 6.5, and 5.1 percentage points, respectively. These results indicate that FMCM-YOLO effectively improves the detection of small objects in optical remote sensing images.

Key words: optical remote sensing image    small target detection    YOLOv5    feature enhancement    attention mechanism
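
The abstract names MPDIoU as the bounding-box loss but does not spell it out. The following is a minimal sketch of that loss for a single pair of boxes, based on my understanding of the published MPDIoU definition (IoU minus the squared top-left and bottom-right corner distances, each normalized by the squared image diagonal). The function name `mpdiou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions for illustration, not the authors' implementation.

```python
def mpdiou_loss(pred, gt, img_w, img_h):
    """Return 1 - MPDIoU for two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt

    # Plain IoU from intersection and union areas.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = (px2 - px1) * (py2 - py1) + (gx2 - gx1) * (gy2 - gy1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching top-left and bottom-right corners.
    d1 = (px1 - gx1) ** 2 + (py1 - gy1) ** 2
    d2 = (px2 - gx2) ** 2 + (py2 - gy2) ** 2

    # Normalize by the squared diagonal of the input image.
    norm = img_w ** 2 + img_h ** 2

    mpdiou = iou - d1 / norm - d2 / norm
    return 1.0 - mpdiou


# Two predictions that both miss a 10x10 ground-truth box entirely.
gt_box = (0.0, 0.0, 10.0, 10.0)
near = mpdiou_loss((20.0, 20.0, 30.0, 30.0), gt_box, 640, 640)
far = mpdiou_loss((40.0, 40.0, 50.0, 50.0), gt_box, 640, 640)
print(near, far)
```

In this sketch both predictions have zero IoU with the ground truth, yet the farther one gets the larger loss. That corner-distance term is one plausible reason such a loss suits small objects, where predicted and true boxes often do not overlap at all early in training.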
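Received: 2025-07-26    Published: 2026-03-19
CLC:  TP 753
Funding: National Natural Science Foundation of China (62261056); National Defense Science and Technology Key Laboratory Fund (23JCJQLB3301); Open Fund of Hanjiang International National Laboratory (KF2024025); Ministry of Education Industry-University Cooperative Education Project (231107173102719).
Corresponding author: Jingmin TANG     E-mail: 39217149@qq.com;tang_min213@163.com
About the author: Yaolian SONG (born 1977), female, associate professor, Ph.D., works on applications of deep learning to remote sensing imagery. orcid.org/0009-0007-7534-9644. E-mail: 39217149@qq.com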

Cite this article:

Yaolian SONG, Chi PENG, Jingmin TANG, Xuanzhi ZHAO, Guicai YU. Small object detection algorithm for optical remote sensing images based on fusion attention mechanism. Journal of Zhejiang University (Engineering Science), 2026, 60(4): 763-771.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2026.04.008
https://www.zjujournals.com/eng/CN/Y2026/V60/I4/763

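Fig. 1  FMCM-YOLO network structure
Fig. 2  Structure of the feature enhancement module
Fig. 3  CASABlock structure
Fig. 4  CASA module structure
Fig. 5  Parameters of the MPDIoU loss function
Fig. 6  Size distribution of object instances in the datasets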
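
The figures themselves are not reproduced on this page, so the actual CASA design cannot be recovered from the captions. As a rough illustration of the general idea stated in the abstract (channel attention and spatial attention combined, with a residual connection), here is a deliberately simplified, parameter-free sketch in plain Python. Real attention modules use learned layers; the function `channel_spatial_attention`, the pooling choices, and computing both attention maps from the same input are all assumptions made for this example only.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_spatial_attention(x):
    """x: list of C channels, each an H x W list of floats. Returns the same shape."""
    C, H, W = len(x), len(x[0]), len(x[0][0])

    # Channel attention: one weight per channel from its global average.
    ch_w = [sigmoid(sum(map(sum, ch)) / (H * W)) for ch in x]

    # Spatial attention: one weight per pixel from the mean across channels.
    sp_w = [[sigmoid(sum(x[c][i][j] for c in range(C)) / C) for j in range(W)]
            for i in range(H)]

    # Reweight each value, then add the input back (residual connection).
    return [[[x[c][i][j] + x[c][i][j] * ch_w[c] * sp_w[i][j]
              for j in range(W)] for i in range(H)] for c in range(C)]


# A 2-channel 2x2 feature map with one bright pixel in channel 0.
feat = [[[4.0, 0.0], [0.0, 0.0]],
        [[1.0, 0.0], [0.0, 0.0]]]
out = channel_spatial_attention(feat)
print(out[0][0][0], out[1][0][0])
```

Because of the residual term, every output value is at least as large in magnitude as its input, so weak responses are never suppressed to zero; the attention weights only decide how much extra emphasis a location receives.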
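Parameter               Value    Parameter       Value
Batch size              16       Weight decay    0.005
Training epochs         300      Momentum        0.937
Initial learning rate   0.01     Image size      640×640
Table 1  Hyperparameter settings for model training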
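
For readers who want to reuse these settings, the values in Table 1 could be captured in a small configuration object. This is only a sketch: the class name `TrainConfig` and its field names are my own choices for illustration, not identifiers from the paper or from any particular training framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Training hyperparameters as listed in Table 1 (field names are hypothetical)."""
    batch_size: int = 16
    epochs: int = 300
    initial_lr: float = 0.01
    momentum: float = 0.937
    weight_decay: float = 0.005
    img_size: tuple = (640, 640)


cfg = TrainConfig()
print(cfg)
```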
No.  Small-object layer  MFFM  CASA  MPDIoU  P/%   R/%   mAP50/%  mAP50:95/%  Params/10^6  GFLOPs
A                                            88.5  81.5  87.1     31.9        20.85        47.9
B                                            89.7  82.0  87.9     32.7        21.29        56.3
C                                            89.5  83.2  87.8     31.9        21.62        52.1
D                                            91.4  82.4  88.6     33.3        20.92        48.3
E                                            89.9  83.1  88.2     32.8        20.85        47.9
F                                            91.7  84.0  89.1     33.0        22.13        62.3
G                                            90.7  83.0  88.7     32.7        22.13        62.2
H                                            91.3  83.3  89.0     32.9        21.33        56.4
I                                            91.5  83.7  89.3     32.9        21.65        51.8
J                                            92.3  84.1  89.9     34.1        22.13        62.3
Table 2  Ablation results for different combinations of improvements
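
The ablation numbers are easier to read as differences from the first row. The sketch below recomputes those differences from the values in the table; it uses only the row labels A to J and makes no assumption about which modules each row enables. The helper name `gain_over_baseline` is hypothetical.

```python
# (P, R, mAP50, mAP50:95, params in millions, GFLOPs) for each row of Table 2.
ablation = {
    "A": (88.5, 81.5, 87.1, 31.9, 20.85, 47.9),
    "B": (89.7, 82.0, 87.9, 32.7, 21.29, 56.3),
    "C": (89.5, 83.2, 87.8, 31.9, 21.62, 52.1),
    "D": (91.4, 82.4, 88.6, 33.3, 20.92, 48.3),
    "E": (89.9, 83.1, 88.2, 32.8, 20.85, 47.9),
    "F": (91.7, 84.0, 89.1, 33.0, 22.13, 62.3),
    "G": (90.7, 83.0, 88.7, 32.7, 22.13, 62.2),
    "H": (91.3, 83.3, 89.0, 32.9, 21.33, 56.4),
    "I": (91.5, 83.7, 89.3, 32.9, 21.65, 51.8),
    "J": (92.3, 84.1, 89.9, 34.1, 22.13, 62.3),
}


def gain_over_baseline(row, base="A"):
    """Difference of every metric in `row` relative to `base`, rounded to 2 places."""
    return tuple(round(v - b, 2) for v, b in zip(ablation[row], ablation[base]))


for name in ablation:
    print(name, gain_over_baseline(name))
```

For row J the mAP50 difference is 2.8 points, which agrees with the USOD improvement over YOLOv5m quoted in the abstract, at a cost of 1.28 million extra parameters and 14.4 extra GFLOPs.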
Model                 P/%   R/%   mAP50/%  mAP50:95/%  Params/10^6  FPS
RefineDet             88.1  82.4  85.1     31.4        35.68        32
YOLOv5m               88.5  81.5  87.1     31.9        20.85        258
YOLOv8m               90.5  82.2  87.6     32.4        29.74        155
TPH-YOLOv5            91.0  83.7  89.5     32.1        45.36        134
MSFE-YOLO-m           91.6  83.5  89.6     33.1        59.51        37
LS-YOLO               90.8  83.6  89.3     33.9        22.61        53
L-FFCA-YOLO           91.3  82.8  89.3     33.2        5.10         165
FMCM-YOLO (proposed)  92.3  84.1  89.9     34.1        22.13        169
Table 3  Performance comparison of different algorithms on USOD
Model                 mAP50/%  mAP50:95/%  mAPvt/%  mAPt/%  mAPs/%  FPS
DetectoRS             32.8     14.8        0        10.8    28.3    61
M-CenterNet           40.7     14.5        6.1      15.0    19.4    78
YOLOv5m               54.7     21.7        10.5     22.1    27.0    258
HANet                 53.7     22.1        10.9     22.2    27.3    178
FFCA-YOLO             61.7     27.7        12.6     24.9    31.8    171
L-FFCA-YOLO           58.3     25.5        11.7     23.2    30.1    165
FMCM-YOLO (proposed)  60.6     26.7        12.6     28.6    32.1    169
Table 4  Performance comparison of different algorithms on AI-TOD
Class          mAP50/%
               FMCM-YOLO  YOLOv5m
all            60.6       54.7
airplane       66.9       64.3
bridge         50.4       44.9
storage-tank   88.9       77.9
ship           78.9       75.0
swimming-pool  51.7       51.2
vehicle        77.7       69.9
person         39.2       31.3
wind-mill      33.4       23.3
Table 5  Per-class detection performance on the AI-TOD dataset before and after the improvements
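
To see which classes benefit most, the per-class values in Table 5 can be turned into gains and ranked. The following sketch does only that arithmetic; the variable names are illustrative and the numbers are copied from the table.

```python
# mAP50 in percent per AI-TOD class: (FMCM-YOLO, YOLOv5m), copied from Table 5.
per_class = {
    "airplane": (66.9, 64.3),
    "bridge": (50.4, 44.9),
    "storage-tank": (88.9, 77.9),
    "ship": (78.9, 75.0),
    "swimming-pool": (51.7, 51.2),
    "vehicle": (77.7, 69.9),
    "person": (39.2, 31.3),
    "wind-mill": (33.4, 23.3),
}

# Gain in percentage points for each class, rounded to one decimal place.
gains = {name: round(new - old, 1) for name, (new, old) in per_class.items()}

# Largest improvement first.
ranked = sorted(gains.items(), key=lambda kv: kv[1], reverse=True)
for name, gain in ranked:
    print(f"{name:14s} +{gain:.1f}")
```

Under this ranking, storage-tank (+11.0) and wind-mill (+10.1) improve the most, while swimming-pool (+0.5) is almost unchanged. Every class improves, though person and wind-mill remain the hardest classes in absolute terms.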
Fig. 7  Comparison of different loss functions
Fig. 8  Visual comparison of detection results before and after the model improvements