Small object detection algorithm for optical remote sensing images based on fusion attention mechanism

doi:10.3785/j.issn.1008-973X.2026.04.008

Journal of Zhejiang University (Engineering Science)

2026, Vol. 60, Issue (4): 763-771

Computer Technology

Yaolian SONG1 (宋耀莲), Chi PENG1 (彭驰), Jingmin TANG1,* (唐菁敏), Xuanzhi ZHAO1 (赵宣植), Guicai YU2 (虞贵财)

1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
2. School of Physics and Electronic Information Engineering, Qinghai Minzu University, Xining 810007, China


Abstract:

To address limited feature extraction, foreground-background confusion, and frequent missed and false detections in small object detection for optical remote sensing images, a small object detection algorithm, FMCM-YOLO, based on feature enhancement and a fused attention mechanism was proposed. First, a four-head detection model was designed by adding a small object detection layer to handle the numerous small objects in optical remote sensing images. Second, a feature enhancement module was proposed for the backbone network; its multi-branch convolutional structure introduces dilated convolutions of different sizes to improve feature extraction. Third, channel and spatial attention mechanisms were fused in the neck network, and a residual structure was introduced to focus on small objects, making targets easier to distinguish from the background. Finally, MPDIoU was adopted as the loss function to accelerate convergence and improve small object detection. Experimental results showed that the proposed algorithm achieved mAP50 of 89.9% and 60.6% on the public USOD and AI-TOD datasets, respectively, which were 2.8 and 5.9 percentage points higher than the baseline YOLOv5m. The mean average precision for very tiny, tiny, and small objects increased by 2.1, 6.5, and 5.1 percentage points, respectively. These results indicate that FMCM-YOLO effectively improves small object detection in optical remote sensing images.

Key words: optical remote sensing image; small object detection; YOLOv5; feature enhancement; attention mechanism
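The abstract names MPDIoU as the bounding-box regression loss. As a rough illustration only, the sketch below follows the published MPDIoU definition (IoU minus the squared distances between the two boxes' top-left corners and between their bottom-right corners, each normalized by the squared image diagonal). The function name `mpdiou_loss`, the `(x1, y1, x2, y2)` box layout, and the pure-Python form are assumptions made for this example; the page does not show how FMCM-YOLO implements the loss.

```python
def mpdiou_loss(pred, target, img_w, img_h):
    """Illustrative MPDIoU loss for one pair of axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) in pixels with x1 < x2 and y1 < y2.
    img_w and img_h are the input image width and height.
    The name and signature are hypothetical, not taken from the paper.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection rectangle (zero area if the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_p = (px2 - px1) * (py2 - py1)
    area_t = (tx2 - tx1) * (ty2 - ty1)
    union = area_p + area_t - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2  # top-left corners
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2

    mpdiou = iou - d1_sq / norm - d2_sq / norm
    return 1.0 - mpdiou


if __name__ == "__main__":
    gt = (10, 10, 20, 20)
    print(mpdiou_loss((10, 10, 20, 20), gt, 100, 100))  # identical boxes
    print(mpdiou_loss((12, 10, 22, 20), gt, 100, 100))  # slightly shifted box
    print(mpdiou_loss((50, 50, 60, 60), gt, 100, 100))  # no overlap
```

Unlike plain IoU, the corner-distance terms keep the loss changing with box position even when the boxes do not overlap, which is one plausible reason such a loss helps with very small targets.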

Received: 2025-07-26 Published: 2026-03-19

CLC: TP 753

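Funding: National Natural Science Foundation of China (62261056); National Defense Science and Technology Key Laboratory Fund (23JCJQLB3301); Open Fund of Hanjiang International National Laboratory (KF2024025); Ministry of Education Industry-University Cooperative Education Project (231107173102719).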

Corresponding author: Jingmin TANG (唐菁敏) E-mail: 39217149@qq.com; tang_min213@163.com

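About the author: Yaolian SONG (宋耀莲, b. 1977), female, associate professor, Ph.D.; her research covers applications of deep learning to remote sensing imagery. orcid.org/0009-0007-7534-9644. E-mail: 39217149@qq.com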


Cite this article:

Yaolian SONG, Chi PENG, Jingmin TANG, Xuanzhi ZHAO, Guicai YU. Small object detection algorithm for optical remote sensing images based on fusion attention mechanism[J]. Journal of Zhejiang University (Engineering Science), 2026, 60(4): 763-771.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2026.04.008 or https://www.zjujournals.com/eng/CN/Y2026/V60/I4/763

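Fig. 1 FMCM-YOLO network structure

Fig. 2 Feature enhancement module structure

Fig. 3 CASABlock structure

Fig. 4 CASA module structure

Fig. 5 Parameters of the MPDIoU loss function

Fig. 6 Size distribution of object instances in the datasets

Table 1 Hyperparameter settings for model training

Table 2 Ablation results for different combinations of improvements

Table 3 Performance comparison of different algorithms on USOD

Table 4 Performance comparison of different algorithms on AI-TOD

Table 5 Per-category detection performance on the AI-TOD dataset before and after improvement

Fig. 7 Comparison of different loss functions

Fig. 8 Visual comparison of detection results before and after model improvement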

