Journal of Zhejiang University (Engineering Science)  2025, Vol. 59 Issue (8): 1718-1726    DOI: 10.3785/j.issn.1008-973X.2025.08.019
Computer Technology, Control Engineering, Communication Technology
Camouflaged object detection based on Convnextv2 and texture-edge guidance
Jiarui FU1, Zhaofei LI1,2,3,*, Hao ZHOU1, Wei HUANG1
1. College of Automation and Information Engineering, Sichuan University of Science and Engineering, Yibin 644000, China
2. Intelligent Perception and Control Key Laboratory of Sichuan Province, Yibin 644000, China
3. Key Laboratory of Higher Education of Sichuan Province for Enterprise Informationalization and Internet of Things, Yibin 644000, China
Abstract:

To address the insufficient representation of target edge features and of scene-specific texture features in camouflaged object detection, a detection method based on Convnextv2 and texture-edge guidance was proposed. A texture encoding module extracted texture features from the input image; these were fused with the edge features extracted by the backbone network to generate the texture-edge features of the image. A purpose-built texture-edge guided attention module then integrated the texture-edge features into the backbone features to locate the true position of the target. A feature fusion module performed multi-level feature fusion, and the overall loss function was designed with multi-level supervision. Experiments on three public datasets (CAMO, COD10K and NC4K) and on the camouflage mixed dataset MICAI_TE showed that the algorithm achieved the best overall performance.
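The abstract states that the overall loss is built with multi-level supervision but does not give its form on this page. As a minimal sketch only, the snippet below assumes each supervised level outputs a probability map that is compared with the same ground-truth mask using binary cross-entropy, and that the per-level terms are summed with weights. The function names `bce` and `multi_level_loss`, the choice of binary cross-entropy, and the default equal weights are all illustrative assumptions, not the authors' loss.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between two flattened maps.

    `pred` holds probabilities in [0, 1]; `target` holds 0/1 labels.
    Using BCE here is an assumption made for illustration.
    """
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)


def multi_level_loss(level_preds, target, weights=None):
    """Weighted sum of per-level losses against one ground-truth mask.

    `level_preds` is a list with one flattened prediction per supervised
    level. Equal weights are a hypothetical default.
    """
    if weights is None:
        weights = [1.0] * len(level_preds)
    return sum(w * bce(p, target) for w, p in zip(weights, level_preds))


if __name__ == "__main__":
    mask = [1, 0, 0, 1]
    coarse = [0.6, 0.4, 0.5, 0.7]   # hypothetical deep-level output
    fine = [0.9, 0.1, 0.2, 0.8]     # hypothetical shallow-level output
    print(multi_level_loss([coarse, fine], mask))
```

Supervising every level, rather than only the final output, gives the intermediate features a direct training signal; how the levels are weighted in the paper is not stated here.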

Key words: camouflaged object detection; texture-edge-guided feature fusion; Convnextv2; feature extraction; texture edge attention mechanism
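Received: 2024-08-08    Published: 2025-07-28
CLC number: TP 391
Supported by: Project of the Key Laboratory of Higher Education of Sichuan Province for Enterprise Informationalization and Internet of Things (2022WZJ02); Zigong Key Science and Technology Plan Project (2019YYJC15); Scientific Research Fund of Sichuan University of Science and Engineering (2020RC32); Graduate Course Construction Project of Sichuan University of Science and Engineering (AL202213, SZ202310); Teaching Reform Project of Sichuan University of Science and Engineering (2024KCSZ-ZY03, 2024KCSZ-KC09, JG-24064).
Corresponding author: Zhaofei LI     E-mail: izayoisakur_ray@163.com; lizhaofei825@163.com
About the author: Jiarui FU (1999-), male, master's student, working on object detection and camouflaged object detection. orcid.org/0009-0005-6799-6206. E-mail: izayoisakur_ray@163.com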

Cite this article:

Jiarui FU, Zhaofei LI, Hao ZHOU, Wei HUANG. Camouflaged object detection based on Convnextv2 and texture-edge guidance. Journal of Zhejiang University (Engineering Science), 2025, 59(8): 1718-1726.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2025.08.019        https://www.zjujournals.com/eng/CN/Y2025/V59/I8/1718

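Fig. 1  Network structure of CTEGAFNet
Fig. 2  Structure of the texture encoding module
Fig. 3  Structure of the edge encoding module
Fig. 4  Structure of the texture-edge guided attention module
Fig. 5  Structure of the fusion module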
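| Network | CAMO-TEST $S_{\alpha}$ | CAMO-TEST $F_\beta^{\omega}$ | CAMO-TEST $E_{\phi}$ | CAMO-TEST $\mathrm{MAE}$ | COD10K-TEST $S_{\alpha}$ | COD10K-TEST $F_\beta^{\omega}$ | COD10K-TEST $E_{\phi}$ | COD10K-TEST $\mathrm{MAE}$ | $N_{\mathrm{p}}/10^{6}$ |
|---|---|---|---|---|---|---|---|---|---|
| MSCAF | 0.873 | 0.828 | 0.929 | 0.046 | 0.865 | 0.775 | 0.927 | 0.024 | 28.33 |
| SARNet | 0.868 | 0.828 | 0.927 | 0.047 | 0.864 | 0.800 | 0.931 | 0.024 | 44.79 |
| FSNet | 0.880 | 0.861 | 0.933 | 0.041 | 0.870 | 0.810 | 0.938 | 0.023 | 124.53 |
| HitNet | 0.844 | 0.801 | 0.902 | 0.057 | 0.868 | 0.798 | 0.932 | 0.024 | 24.53 |
| SegMaR | 0.815 | 0.742 | 0.872 | 0.071 | 0.833 | 0.724 | 0.895 | 0.033 | 68.04 |
| SINet | 0.745 | 0.644 | 0.829 | 0.092 | 0.776 | 0.631 | 0.864 | 0.043 | 48.95 |
| SINetV2 | 0.820 | 0.743 | 0.882 | 0.070 | 0.815 | 0.680 | 0.887 | 0.037 | 26.98 |
| C2FNet | 0.796 | 0.719 | 0.864 | 0.080 | 0.813 | 0.686 | 0.890 | 0.036 | 26.30 |
| BGNet | 0.812 | 0.749 | 0.870 | 0.073 | 0.831 | 0.722 | 0.901 | 0.033 | 74.20 |
| DGNet | 0.839 | 0.769 | 0.901 | 0.057 | 0.822 | 0.693 | 0.896 | 0.033 | 21.02 |
| ZoomNet | 0.820 | 0.752 | 0.883 | 0.066 | 0.838 | 0.729 | 0.893 | 0.029 | 32.38 |
| CTEGAFNet | 0.893 | 0.858 | 0.937 | 0.037 | 0.879 | 0.801 | 0.933 | 0.021 | 92.94 |

Tab. 1  Comparison of CTEGAFNet with 11 other algorithms on CAMO and COD10K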
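Of the four metrics in the table, MAE is the simplest to state: it is conventionally the mean absolute per-pixel difference between the predicted map and the binary ground-truth mask, so lower is better. The sketch below follows that conventional definition. It assumes predictions are already normalized to [0, 1] and have the same size as the mask, and the function name `mean_absolute_error` is illustrative rather than taken from the paper.

```python
def mean_absolute_error(pred, gt):
    """Mean absolute per-pixel error between two equally sized 2-D maps.

    `pred` is a list of rows of values in [0, 1] (assumed normalized);
    `gt` is a list of rows of 0/1 ground-truth labels.
    """
    height = len(gt)
    width = len(gt[0])
    total = 0.0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)
    return total / (height * width)


if __name__ == "__main__":
    prediction = [[0.9, 0.1],
                  [0.2, 0.8]]
    mask = [[1, 0],
            [0, 1]]
    print(mean_absolute_error(prediction, mask))
```

A dataset-level score is usually the average of this per-image value over all test images; whether the paper follows exactly that protocol is not stated on this page.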
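| Network | NC4K $S_{\alpha}$ | NC4K $F_\beta^{\omega}$ | NC4K $E_{\phi}$ | NC4K $\mathrm{MAE}$ | MICAI_TE $S_{\alpha}$ | MICAI_TE $F_\beta^{\omega}$ | MICAI_TE $E_{\phi}$ | MICAI_TE $\mathrm{MAE}$ | $N_{\mathrm{p}}/10^{6}$ |
|---|---|---|---|---|---|---|---|---|---|
| MSCAF | 0.887 | 0.839 | 0.935 | 0.032 | 0.890 | 0.819 | 0.946 | 0.014 | 28.33 |
| SARNet | 0.886 | 0.842 | 0.937 | 0.032 | 0.888 | 0.811 | 0.944 | 0.014 | 44.79 |
| FSNet | 0.891 | 0.866 | 0.940 | 0.031 | 0.887 | 0.811 | 0.943 | 0.014 | 124.53 |
| HitNet | 0.870 | 0.825 | 0.921 | 0.039 | 0.886 | 0.822 | 0.955 | 0.014 | 24.53 |
| SegMaR | 0.841 | 0.781 | 0.905 | 0.046 | 0.874 | 0.782 | 0.920 | 0.019 | 68.04 |
| SINet | 0.808 | 0.723 | 0.871 | 0.058 | 0.678 | 0.387 | 0.624 | 0.052 | 48.95 |
| SINetV2 | 0.847 | 0.770 | 0.903 | 0.048 | 0.733 | 0.508 | 0.739 | 0.038 | 26.98 |
| C2FNet | 0.838 | 0.762 | 0.897 | 0.049 | 0.867 | 0.776 | 0.933 | 0.019 | 26.30 |
| BGNet | 0.851 | 0.788 | 0.907 | 0.044 | 0.725 | 0.520 | 0.787 | 0.043 | 74.20 |
| DGNet | 0.857 | 0.784 | 0.911 | 0.042 | 0.872 | 0.779 | 0.928 | 0.018 | 21.02 |
| ZoomNet | 0.853 | 0.784 | 0.907 | 0.043 | 0.845 | 0.725 | 0.843 | 0.030 | 32.38 |
| CTEGAFNet | 0.900 | 0.859 | 0.940 | 0.028 | 0.895 | 0.827 | 0.953 | 0.013 | 92.94 |

Tab. 2  Comparison of CTEGAFNet with 11 other algorithms on NC4K and MICAI_TE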
Fig. 6  Visual comparison of detection results with 10 different COD methods
Fig. 7  Original image of 4-Octopus
Fig. 8  Structure of the Att(Without T-Edge) module
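| Algorithm | $S_{\alpha}$ | $F_\beta^{\omega}$ | $E_{\phi}$ | $\mathrm{MAE}$ |
|---|---|---|---|---|
| Base | 0.877 | 0.812 | 0.917 | 0.046 |
| Base+Fus | 0.891 | 0.856 | 0.934 | 0.037 |
| Base+Att(Without T-Edge) | 0.628 | 0.396 | 0.736 | 0.173 |
| Base+Att(with T-Edge) | 0.890 | 0.856 | 0.937 | 0.037 |
| Base+Fus+Att(Without T-Edge) | 0.888 | 0.855 | 0.935 | 0.038 |
| Base+Fus+Att(with T-Edge) | 0.893 | 0.858 | 0.937 | 0.037 |

Tab. 3  Results of ablation experiments on the CAMO dataset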
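The contribution of each module is easier to read as a change relative to the Base row. The short script below copies the figures from the ablation table and subtracts the baseline from each variant. The names `METRICS`, `ABLATION` and `delta` are illustrative helpers, not part of the paper.

```python
# Metric order: S_alpha, weighted F_beta, E_phi, MAE (values from the ablation table).
METRICS = ("S_alpha", "F_beta_w", "E_phi", "MAE")

ABLATION = {
    "Base": (0.877, 0.812, 0.917, 0.046),
    "Base+Fus": (0.891, 0.856, 0.934, 0.037),
    "Base+Att(Without T-Edge)": (0.628, 0.396, 0.736, 0.173),
    "Base+Att(with T-Edge)": (0.890, 0.856, 0.937, 0.037),
    "Base+Fus+Att(Without T-Edge)": (0.888, 0.855, 0.935, 0.038),
    "Base+Fus+Att(with T-Edge)": (0.893, 0.858, 0.937, 0.037),
}


def delta(variant, baseline="Base"):
    """Per-metric difference of `variant` minus `baseline`, rounded to 3 places."""
    return {
        name: round(v - b, 3)
        for name, v, b in zip(METRICS, ABLATION[variant], ABLATION[baseline])
    }


if __name__ == "__main__":
    for variant in ABLATION:
        print(variant, delta(variant))
```

Positive differences are improvements for the first three metrics, while for MAE a negative difference is the improvement. Read this way, the table shows that attention without the texture-edge input falls far below the baseline, whereas the same attention with the texture-edge input improves on it.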
Fig. 9  Visual comparison of results from different models in the ablation experiments
Fig. 10  Structure of the Base+Fus network