Object detection for multi-source remote sensing fused images based on depthwise separable convolution
Jianghao CHEN 1,2,3, Jun YANG 1,2,3,4,*
1. Faculty of Geomatics, Lanzhou Jiaotong University, Lanzhou 730070, China
2. National and Local Joint Engineering Research Center of Geographical Monitoring Technology Application, Lanzhou 730070, China
3. Gansu Provincial Engineering Laboratory of Geographical Monitoring, Lanzhou 730070, China
4. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
Abstract: A multi-source remote sensing image fusion and object detection network, built on an improved depthwise separable convolution and a multi-scale feature extraction module, was proposed to address two problems: the limited feature extraction capability of convolutional downsampling, and the failure of traditional feature-level fusion methods to fully exploit the complementary strengths of multi-source remote sensing data. A dual-branch separable convolution module was designed to strengthen deep semantic feature representation through depthwise convolution and residual connections, improving discrimination under complex backgrounds. A global-local adaptive feature fusion module was then constructed: separable convolution decomposed the feature maps into components of different dimensions to capture global structure and local detail separately, and an adaptive mechanism fused these components to achieve cross-source information complementarity and multi-scale feature collaboration. Experiments on the VEDAI multi-source dataset showed that the proposed method achieved a mean average precision (mAP) of 82.80%, 2.00 percentage points higher than that of ICAFusion, while also outperforming YOLOrs, YOLOfusion, SuperYOLO, and MF-YOLO. The network proved highly effective for feature-level fusion of multi-source remote sensing images and yielded significant performance gains in object detection.
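The core building block named in the abstract — a depthwise separable convolution with a residual connection — can be sketched as follows. This is a minimal PyTorch illustration of the general technique only; the class name, layer ordering, and normalization choices are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class DepthwiseSeparableBlock(nn.Module):
    """Depthwise separable convolution with a residual connection (sketch)."""

    def __init__(self, channels: int):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=channels),
        # capturing spatial context cheaply.
        self.depthwise = nn.Conv2d(channels, channels, kernel_size=3,
                                   padding=1, groups=channels, bias=False)
        # Pointwise: 1x1 convolution mixes information across channels.
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1,
                                   bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.act(self.bn(self.pointwise(self.depthwise(x))))
        # Residual connection keeps shallow features alongside deep semantics.
        return out + x


# Shape-preserving by construction: (1, 32, 64, 64) in and out.
y = DepthwiseSeparableBlock(32)(torch.randn(1, 32, 64, 64))
```

Compared with a standard 3x3 convolution, this factorization cuts parameters roughly by a factor of the channel count, which is why it is attractive for remote sensing backbones.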
Received: 23 January 2025
Published: 25 November 2025
Fund: National Natural Science Foundation of China (42261067); 2025 Gansu Provincial Key Talent Project (2025RCXM031).
Corresponding Authors:
Jun YANG
E-mail: 11220897@stu.lzjtu.edu.cn;yangj@mail.lzjtu.cn
Keywords: multi-source remote sensing image, feature extraction, feature-level fusion, depthwise separable convolution, multi-scale feature, object detection
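The global-local adaptive fusion idea in the abstract — capturing global structure and local detail separately, then blending the two sources adaptively — can be illustrated with a hedged PyTorch sketch. The module name, branch designs, and gating scheme below are assumptions for illustration, not the paper's actual architecture.

```python
import torch
import torch.nn as nn


class GlobalLocalFusion(nn.Module):
    """Adaptive fusion of two modality feature maps (illustrative sketch)."""

    def __init__(self, channels: int):
        super().__init__()
        # Global branch: squeeze spatial dims, re-weight channels
        # to emphasize scene-level structure.
        self.global_fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Local branch: depthwise 3x3 preserves fine spatial detail.
        self.local_conv = nn.Conv2d(channels, channels, kernel_size=3,
                                    padding=1, groups=channels)
        # Gate deciding the per-pixel mix of the two sources.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, rgb: torch.Tensor, ir: torch.Tensor) -> torch.Tensor:
        g = self.global_fc(rgb) * rgb   # global structure from one source
        l = self.local_conv(ir)         # local detail from the other source
        w = self.gate(torch.cat([g, l], dim=1))
        # Adaptive cross-source blend: w learned per pixel and channel.
        return w * g + (1 - w) * l
```

The learned gate lets the network lean on the optical branch where texture is informative and on the infrared branch where thermal contrast dominates, which is the intuition behind cross-source complementarity.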
References
[1] SUN X, TIAN Y, LU W, et al. From single- to multi-modal remote sensing imagery interpretation: a survey and taxonomy[J]. Science China Information Sciences, 2023, 66(4): 140301.
[2] LI Shutao, LI Congyu, KANG Xudong. Development status and future prospects of multi-source remote sensing image fusion[J]. National Remote Sensing Bulletin, 2021, 25(1): 148-166. doi: 10.11834/jrs.20210259
[3] WU Y, GUAN X, ZHAO B, et al. Vehicle detection based on adaptive multimodal feature fusion and cross-modal vehicle index using RGB-T images[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2023, 16: 8166-8177.
[4] GÜNTHER A, NAJJAR H, DENGEL A. Explainable multimodal learning in remote sensing: challenges and future directions[J]. IEEE Geoscience and Remote Sensing Letters, 2024, 21: 1-5.
[5] ZANG Y, WANG S, GUAN H, et al. VAM-Net: vegetation-attentive deep network for multi-modal fusion of visible-light and vegetation-sensitive images[J]. International Journal of Applied Earth Observation and Geoinformation, 2024, 127: 103642.
[6] JIANG C, REN H, YANG H, et al. M2FNet: multi-modal fusion network for object detection from visible and thermal infrared images[J]. International Journal of Applied Earth Observation and Geoinformation, 2024, 130: 103918.
[7] KULKARNI S C, REGE P P. Pixel level fusion techniques for SAR and optical images: a review[J]. Information Fusion, 2020, 59: 13-29.
[8] WU J, HAO F, LIANG W, et al. Transformer fusion and pixel-level contrastive learning for RGB-D salient object detection[J]. IEEE Transactions on Multimedia, 2023, 26: 1011-1026.
[9] FENG P, LIN Y, GUAN J, et al. Embranchment CNN based local climate zone classification using SAR and multispectral remote sensing data[C]// IEEE International Geoscience and Remote Sensing Symposium. Yokohama: IEEE, 2019: 6344-6347.
[10] CAO Qiong, MA Ailong, ZHONG Yanfei, et al. Urban classification by multi-feature fusion of hyperspectral image and LiDAR data[J]. Journal of Remote Sensing, 2019, 23(5): 892-903.
[11] LI W, GAO Y, ZHANG M, et al. Asymmetric feature fusion network for hyperspectral and SAR image classification[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 34(10): 8057-8070.
[12] YE Y, ZHANG J, ZHOU L, et al. Optical and SAR image fusion based on complementary feature decomposition and visual saliency features[J]. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62: 1-15.
[13] LI L, HAN L, DING M, et al. Multimodal image fusion framework for end-to-end remote sensing image registration[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 1-14.
[14] DONG Hongzhao, LIN Shaoxuan, SHE Yini. Research progress of YOLO detection technology for traffic object[J]. Journal of Zhejiang University: Engineering Science, 2025, 59(2): 249-260.
[15] SONG Yaolian, WANG Can, LI Dayan, et al. UAV small target detection algorithm based on improved YOLOv5s[J]. Journal of Zhejiang University: Engineering Science, 2024, 58(12): 2417-2426.
[16] ZHANG J, LEI J, XIE W, et al. SuperYOLO: super resolution assisted object detection in multimodal remote sensing imagery[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 1-15.
[17] LI W, LI A, KONG X, et al. MF-YOLO: multimodal fusion for remote sensing object detection based on YOLOv5s[C]// 27th International Conference on Computer Supported Cooperative Work in Design. Tianjin: IEEE, 2024: 897-903.
[18] ULTRALYTICS. YOLOv5[EB/OL]. (2024-04-01) [2025-01-16]. https://github.com/ultralytics/yolov5.
[19] MA X, DAI X, BAI Y, et al. Rewrite the Stars[C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2024: 5694-5703.
[20] ZHENG M, SUN L, DONG J, et al. SMFANet: a lightweight self-modulation feature aggregation network for efficient image super-resolution[C]// European Conference on Computer Vision. Cham: Springer, 2024: 359-375.
[21] SHEN J, CHEN Y, LIU Y, et al. ICAFusion: iterative cross-attention guided feature fusion for multispectral object detection[J]. Pattern Recognition, 2024, 145: 109913.
[22] BOCHKOVSKIY A, WANG C, LIAO H Y. YOLOv4: optimal speed and accuracy of object detection[EB/OL]. (2020-04-23) [2024-12-11]. https://arxiv.org/abs/2004.10934.
[23] RAZAKARIVONY S, JURIE F. Vehicle detection in aerial imagery: a small target detection benchmark[J]. Journal of Visual Communication and Image Representation, 2016, 34: 187-203.
[24] FLIR ADAS Team. FREE Teledyne FLIR Thermal Dataset for Algorithm Training[EB/OL]. (2024-05-01) [2025-01-21]. https://www.flir.com/oem/adas/adasdatasetform/.
[25] HWANG S, PARK J, KIM N, et al. Multispectral pedestrian detection: benchmark dataset and baseline[C]// IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 1037-1045.
[26] SHARMA M, DHANARAJ M, KARNAM S, et al. YOLOrs: object detection in multimodal remote sensing imagery[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2020, 14: 1497-1508.
[27] FANG Q, WANG Z. Cross-modality attentive feature fusion for object detection in multispectral remote sensing imagery[J]. Pattern Recognition, 2022, 130: 108786.
[28] DEVAGUPTAPU C, AKOLEKAR N, SHARMA M M, et al. Borrow from anywhere: pseudo multi-modal object detection in thermal imagery[C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Long Beach: IEEE, 2019: 1029-1038.
[29] ZHANG H, FROMONT E, LEFEVRE S, et al. Multispectral fusion for object detection with cyclic fuse-and-refine blocks[C]// IEEE International Conference on Image Processing. Abu Dhabi: IEEE, 2020: 276-280.
[30] KIEU M, BAGDANOV A D, BERTINI M. Bottom-up and layerwise domain adaptation for pedestrian detection in thermal images[J]. ACM Transactions on Multimedia Computing, Communications, and Applications, 2021, 17(1): 1-19.
[31] ZHOU K, CHEN L, CAO X. Improving multispectral pedestrian detection by addressing modality imbalance problems[C]// Computer Vision - ECCV 2020: 16th European Conference. Cham: Springer, 2020: 787-803.
[32] KIM J, KIM H, KIM T, et al. MLPD: multi-label pedestrian detector in multispectral domain[J]. IEEE Robotics and Automation Letters, 2021, 6(4): 7846-7853.