Journal of Zhejiang University (Engineering Science)  2025, Vol. 59 Issue (12): 2527-2538    DOI: 10.3785/j.issn.1008-973X.2025.12.007
Computer Technology
An efficient image dehazing algorithm with Agent Attention for domain feature interaction
Yan YANG, Cunpeng JIA
School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
Abstract:

An efficient image dehazing algorithm based on Agent Attention and domain feature interaction was proposed to address two limitations of Swin Transformer in image dehazing: the difficulty of balancing global dependency modeling against computational complexity, and the insufficient capture of detail information. Multi-head self-attention was replaced with Agent Attention, and an encoder-decoder network was built with Agent Swin Transformer and Efficient Multi-Scale Attention (EMA) as its basic units, which reduced the computational complexity of the model while strengthening the information flow between spatial and channel features. A high-frequency spatial enhancement module (HSEM) and a low-frequency channel enhancement module (LCEM) were designed to reduce spatial feature redundancy during feature extraction and to make frequency-domain information more effective, and spatial-domain features were compensated through skip connections. A fast Fourier convolution dense residual structure was built in the intermediate layer of the encoder so that spectral information could improve the visual quality of the restored image. Experiments showed that the proposed algorithm reduced computational complexity and feature redundancy and significantly increased inference speed, that the restored images retained complete detail textures, and that the algorithm performed well on all objective metrics.

Key words: image dehazing; Agent Swin Transformer; efficient multi-scale attention; wavelet transform; feature enhancement
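
The abstract states that multi-head self-attention is replaced with Agent Attention. As a rough illustration of why this lowers cost, the following single-head NumPy sketch follows the general two-step form of agent attention: a small set of agent tokens first aggregates information from the keys and values, then broadcasts it back to the queries. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, agents are obtained here by mean-pooling contiguous groups of queries, and refinements such as the agent bias and a depthwise convolution on the values are omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def softmax_attention(q, k, v):
    """Standard single-head attention; the score matrix is (N, N)."""
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d)) @ v


def agent_attention(q, k, v, num_agents):
    """Single-head agent attention for N tokens of width d.

    Assumption: agent tokens are mean-pooled from contiguous groups of
    queries. Both score matrices are (n, N) or (N, n) with n = num_agents,
    so the cost grows with N * n instead of N * N.
    """
    n_tokens, d = q.shape
    if not 1 <= num_agents <= n_tokens:
        raise ValueError("num_agents must be between 1 and the token count")
    groups = np.array_split(np.arange(n_tokens), num_agents)
    agents = np.stack([q[g].mean(axis=0) for g in groups])      # (n, d)
    scale = 1.0 / np.sqrt(d)
    # Step 1, agent aggregation: agents act as queries over K and V.
    agent_values = softmax(agents @ k.T * scale) @ v            # (n, d)
    # Step 2, agent broadcast: original queries attend to the agents.
    return softmax(q @ agents.T * scale) @ agent_values         # (N, d)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q, k, v = (rng.standard_normal((64, 16)) for _ in range(3))
    print(agent_attention(q, k, v, num_agents=8).shape)
    print(softmax_attention(q, k, v).shape)
```

With N tokens and n agents, the two score matrices hold 2·N·n entries instead of N², which is where the saving comes from when n is much smaller than N.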
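Received: 2024-12-05    Published: 2025-11-25
CLC:  TP 391.4
Supported by: National Natural Science Foundation of China (61561030, 62063014); Gansu Provincial Higher Education Industry Support Program (2021CYZC-04); Lanzhou Jiaotong University Graduate Education Reform Project (JG201928).
About the author: Yan YANG (born 1972), female, professor, Ph.D., engaged in research on digital image processing, speech signal processing, and intelligent information processing. orcid.org/0000-0001-5338-0762. E-mail: yangyantd@mail.lzjtu.cn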
Cite this article:

Yan YANG, Cunpeng JIA. An efficient image dehazing algorithm with Agent Attention for domain feature interaction. Journal of Zhejiang University (Engineering Science), 2025, 59(12): 2527-2538.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2025.12.007
https://www.zjujournals.com/eng/CN/Y2025/V59/I12/2527

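Fig. 1  Overall structure of the dehazing network
Fig. 2  Multi-head self-attention under different computational paradigms
Fig. 3  Structure of the efficient multi-scale attention (EMA) network
Fig. 4  Schematic of the wavelet transform
Fig. 5  Structure of the high-frequency spatial enhancement module (HSEM)
Fig. 6  Structure of the low-frequency channel enhancement module (LCEM)
Fig. 7  Structure of the fast Fourier convolution residual block (FFCR)
Fig. 8  Restoration results of different algorithms on the SOTS dataset
Fig. 9  Restoration results of different algorithms on the NH-HAZE test set
Fig. 10  Restoration results of different algorithms on the O-HAZE dataset
Fig. 11  Restoration results of different algorithms on the I-HAZE dataset
Fig. 12  Restoration results of different algorithms on real-world images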
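
The figure list above includes a wavelet transform schematic (Fig. 4) alongside the high-frequency and low-frequency enhancement modules, which suggests that features are split into frequency subbands before enhancement. The sketch below shows one way such a split can work, using a one-level 2-D Haar transform in NumPy. The choice of the Haar wavelet, the function names, and the subband names are assumptions made for illustration; the page does not state which wavelet the authors use.

```python
import numpy as np


def haar_dwt2(x):
    """One-level orthonormal 2-D Haar transform of an (H, W) array.

    Returns the low-frequency band and three detail bands, each (H/2, W/2):
    approx, detail_w (difference along width), detail_h (difference along
    height), detail_d (diagonal difference).
    """
    x = np.asarray(x, dtype=np.float64)
    if x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("height and width must both be even")
    a, b = x[0::2, 0::2], x[0::2, 1::2]   # top-left, top-right of each 2x2 block
    c, d = x[1::2, 0::2], x[1::2, 1::2]   # bottom-left, bottom-right
    approx = (a + b + c + d) / 2
    detail_w = (a - b + c - d) / 2
    detail_h = (a + b - c - d) / 2
    detail_d = (a - b - c + d) / 2
    return approx, detail_w, detail_h, detail_d


def haar_idwt2(approx, detail_w, detail_h, detail_d):
    """Inverse of haar_dwt2; rebuilds the (H, W) array exactly."""
    h, w = approx.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (approx + detail_w + detail_h + detail_d) / 2
    x[0::2, 1::2] = (approx - detail_w + detail_h - detail_d) / 2
    x[1::2, 0::2] = (approx + detail_w - detail_h - detail_d) / 2
    x[1::2, 1::2] = (approx - detail_w - detail_h + detail_d) / 2
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8))
    bands = haar_dwt2(image)
    print([band.shape for band in bands])
    print(np.allclose(haar_idwt2(*bands), image))
```

Because the transform is invertible, a network can process the low-frequency band and the detail bands separately and still recombine them without losing information.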
Algorithm      SOTS-indoor      SOTS-outdoor     NH-HAZE          O-HAZE           I-HAZE
               PSNR/dB  SSIM    PSNR/dB  SSIM    PSNR/dB  SSIM    PSNR/dB  SSIM    PSNR/dB  SSIM
GCA-Net        25.34    0.9261  26.19    0.9004  23.03    0.7304  18.83    0.7089  14.93    0.6403
SGID           22.63    0.8762  24.97    0.8514  22.81    0.7265  17.25    0.6536  16.51    0.6953
UCL            23.76    0.8973  24.35    0.8761  21.98    0.7013  18.86    0.6735  16.79    0.7162
DEA-Net        27.27    0.9441  28.11    0.9263  24.45    0.8264  19.73    0.8015  18.31    0.7616
GridDehazeNet  24.72    0.9161  25.35    0.8749  20.12    0.7119  19.92    0.6863  17.63    0.7211
EPDN           23.06    0.8718  25.57    0.8630  21.42    0.7271  17.63    0.7062  16.05    0.6956
Dehazeformer   25.54    0.9213  26.93    0.9420  24.96    0.8505  20.13    0.7864  18.46    0.7546
Proposed       26.71    0.9310  28.59    0.9613  26.89    0.9014  21.75    0.7982  19.72    0.7825
Tab. 1  Evaluation metrics of different algorithms on different test sets
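
The PSNR values in the table above are reported in decibels. For readers who want to reproduce that kind of number, here is a minimal NumPy sketch of the standard definition, PSNR = 10·log10(MAX² / MSE). The function name and the default peak value of 1.0 (images scaled to [0, 1]) are assumptions; the page does not say how the authors scale or crop images before evaluation, and SSIM is not covered here.

```python
import numpy as np


def psnr(restored, reference, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    max_value is the largest possible pixel value (1.0 for images in
    [0, 1], 255 for 8-bit images). Identical images give infinity.
    """
    restored = np.asarray(restored, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    if restored.shape != reference.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((restored - reference) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))


if __name__ == "__main__":
    reference = np.zeros((4, 4, 3))
    restored = np.full((4, 4, 3), 0.1)   # uniform error of 0.1 per pixel
    print(round(psnr(restored, reference), 2))
```

A uniform error of 0.1 on a [0, 1] image gives an MSE of 0.01 and therefore 20 dB, which helps put the 15 to 29 dB range of the table in context.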
Algorithm      N_P/10^6  FLOPs/10^9  t/s
GCA-Net        2.68      62.96       0.163
SGID           13.87     625.61      0.416
UCL            19.45     183.34      0.204
DEA-Net        3.65      128.92      0.179
GridDehazeNet  0.96      85.71       0.211
EPDN           17.38     19.20       0.185
Dehazeformer   2.51      93.99       0.244
Proposed       2.74      97.64       0.158
Tab. 2  Comparison of parameter count and complexity
Fig. 13  Training loss under different settings
Model               Agent attention  EMA  HSEM  LCEM  FFCR  PSNR/dB  SSIM
A                                                           23.84    0.8505
B                                                           24.56    0.8546
C                                                           24.93    0.8693
D                                                           26.14    0.8753
E                                                           25.16    0.8931
F                                                           26.41    0.8893
G (proposed model)                                          26.89    0.9014
Tab. 3  Ablation comparison of dehazing performance
Fig. 14  Subjective comparison of ablation experiments
Model         N_P/10^6  FLOPs/10^9  t/s
1             3.23      138.34      0.287
2             3.21      135.31      0.184
3             2.84      110.76      0.177
4             2.76      102.43      0.163
5 (proposed)  2.74      97.64       0.158
Tab. 4  Ablation comparison of dehazing efficiency
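
To make the efficiency ablation above easier to read, the relative savings between its first and last rows can be worked out directly. The short Python sketch below does that arithmetic using the values from the table (parameters in millions, FLOPs in billions, time in seconds). The dictionary keys and the helper name are hypothetical, and the page does not state which modules the first configuration contains, so the result only compares the two rows as listed.

```python
# Values copied from the first and last rows of the efficiency ablation table.
model_1 = {"params_M": 3.23, "flops_G": 138.34, "time_s": 0.287}
model_5 = {"params_M": 2.74, "flops_G": 97.64, "time_s": 0.158}  # proposed


def relative_reduction(before, after):
    """Fraction by which `after` is smaller than `before`."""
    return (before - after) / before


reductions = {
    key: relative_reduction(model_1[key], model_5[key]) for key in model_1
}

if __name__ == "__main__":
    for key, value in reductions.items():
        print(f"{key}: {value:.1%}")
```

Under these numbers, the proposed configuration uses roughly 15% fewer parameters, 29% fewer FLOPs, and 45% less inference time than the first ablation configuration.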