Journal of Zhejiang University (Engineering Science)  2023, Vol. 57 Issue (8): 1487-1494    DOI: 10.3785/j.issn.1008-973X.2023.08.002
Computer Technology
Image super-resolution reconstruction based on dynamic attention network
Xiao-qiang ZHAO1,2,3, Ze WANG1, Zhao-yang SONG1, Hong-mei JIANG1,2,3
1. School of Electrical Engineering and Information Engineering, Lanzhou University of Technology, Lanzhou 730050, China
2. Key Laboratory of Gansu Advanced Control for Industrial Processes, Lanzhou 730050, China
3. National Experimental Teaching Center of Electrical and Control Engineering, Lanzhou University of Technology, Lanzhou 730050, China
Abstract:

Existing image super-resolution algorithms treat channels and spatial regions of different importance in the same way, so computing resources cannot be concentrated on the important features. To address this problem, an image super-resolution algorithm based on a dynamic attention network was proposed. First, the usual practice of treating attention mechanisms equally was changed: a dynamic attention module assigned dynamically learned weights to the different attention mechanisms, so that the high-frequency information most needed by the network was captured and high-quality images were reconstructed. Second, a double butterfly structure was constructed through feature reuse, fully fusing the information of the two attention branches and compensating for the feature information missing between the different attention mechanisms. Finally, the model was evaluated on the Set5, Set14, BSD100, Urban100 and Manga109 datasets. Results show that the proposed algorithm outperforms other mainstream super-resolution algorithms overall. At a scale factor of 4, the peak signal-to-noise ratio of the proposed algorithm on the five public test sets was 0.06, 0.07, 0.04, 0.15 and 0.15 dB higher, respectively, than that of the second-best algorithm.

Key words: image processing    image super-resolution    attention mechanism    dynamic convolution    double butterfly structure
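The dynamic attention module and the double butterfly structure summarized in the abstract are not specified in detail on this page, so the following is only a minimal PyTorch sketch of the weighting idea: a small gating branch predicts per-sample weights over a channel-attention branch and a spatial-attention branch, and the two weighted branches are then fused by concatenation as a stand-in for the feature-reuse fusion. The branch designs, layer sizes, and the 1×1 fusion convolution are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention (illustrative)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(x)


class SpatialAttention(nn.Module):
    """Spatial attention built from pooled channel statistics (illustrative)."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = torch.mean(x, dim=1, keepdim=True)           # B x 1 x H x W
        mx, _ = torch.max(x, dim=1, keepdim=True)          # B x 1 x H x W
        mask = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * mask


class DynamicAttentionBlock(nn.Module):
    """Weights the two attention branches with per-sample gates before fusing
    them, instead of treating the mechanisms equally (hypothetical sketch)."""
    def __init__(self, channels):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()
        # Gating branch: global statistics -> softmax weights over the 2 branches.
        self.gate = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Conv2d(channels, 2, 1))
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, x):
        w = F.softmax(self.gate(x), dim=1)                 # B x 2 x 1 x 1
        ca_out = self.ca(x) * w[:, 0:1]
        sa_out = self.sa(x) * w[:, 1:2]
        # Both weighted branches are concatenated and fused (feature reuse stand-in).
        return x + self.fuse(torch.cat([ca_out, sa_out], dim=1))


if __name__ == "__main__":
    block = DynamicAttentionBlock(64)
    y = block(torch.randn(2, 64, 48, 48))
    print(y.shape)                                         # torch.Size([2, 64, 48, 48])
```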
Received: 2022-10-10    Published: 2023-08-31
CLC:  TN 391  
Funding: National Natural Science Foundation of China (62263021); National Key Research and Development Program of China (2020YFB1713600); Science and Technology Program of Gansu Province (21YF5GA072, 21JR7RA206)
About the author: ZHAO Xiao-qiang (born 1969), male, professor, engaged in research on fault diagnosis, image processing and data mining. orcid.org/0000-0001-5687-942X. E-mail: xqzhao@lut.edu.cn
Cite this article:

Xiao-qiang ZHAO, Ze WANG, Zhao-yang SONG, Hong-mei JIANG. Image super-resolution reconstruction based on dynamic attention network. Journal of Zhejiang University (Engineering Science), 2023, 57(8): 1487-1494.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2023.08.002        https://www.zjujournals.com/eng/CN/Y2023/V57/I8/1487

Fig. 1  Framework of dynamic convolution
Fig. 2  Structure of dynamic attention network
Fig. 3  Structure of dynamic attention module
Fig. 4  Structures of the two attention mechanisms
Fig. 5  Double butterfly structure
Fig. 6  Structure of spatial attention
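Fig. 1 refers to the dynamic convolution framework; in the general formulation of the cited dynamic-convolution work (refs. 17-18), several parallel kernels are aggregated with input-dependent attention weights instead of a single fixed kernel being applied. The snippet below is only a hedged sketch of that general idea; the kernel count, the attention branch, and the grouped-convolution trick for per-sample kernels are illustrative choices, not the layer actually used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicConv2d(nn.Module):
    """Aggregates K parallel kernels with input-dependent softmax weights
    (illustrative sketch of dynamic convolution, not the paper's exact layer)."""
    def __init__(self, in_ch, out_ch, kernel_size=3, num_kernels=4):
        super().__init__()
        self.kernel_size = kernel_size
        # K candidate kernels and biases.
        self.weight = nn.Parameter(
            torch.randn(num_kernels, out_ch, in_ch, kernel_size, kernel_size) * 0.02)
        self.bias = nn.Parameter(torch.zeros(num_kernels, out_ch))
        # Attention branch: global pooling -> K weights per sample.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(in_ch, num_kernels),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        alpha = F.softmax(self.attn(x), dim=1)             # B x K
        # Per-sample aggregated kernel: sum_k alpha_k * W_k.
        weight = torch.einsum("bk,koihw->boihw", alpha, self.weight)
        bias = alpha @ self.bias                           # B x out_ch
        # Grouped convolution applies a different aggregated kernel to each sample.
        out = F.conv2d(x.reshape(1, b * c, h, w),
                       weight.reshape(-1, c, self.kernel_size, self.kernel_size),
                       bias=bias.reshape(-1),
                       padding=self.kernel_size // 2,
                       groups=b)
        return out.reshape(b, -1, h, w)


if __name__ == "__main__":
    layer = DynamicConv2d(64, 64)
    print(layer(torch.randn(2, 64, 48, 48)).shape)         # torch.Size([2, 64, 48, 48])
```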
Method            Scale   Params/K   PSNR (dB)/SSIM
                                     Set5           Set14          BSD100         Urban100       Manga109
Bicubic           ×2      —          33.68/0.9304   30.24/0.8691   29.56/0.8435   26.88/0.8405   30.80/0.9339
IMDN              ×2      694        38.00/0.9605   33.63/0.9177   32.19/0.8996   32.17/0.9283   38.88/0.9774
MemNet            ×2      677        37.78/0.9597   33.28/0.9142   32.08/0.8978   31.31/0.9195   37.72/0.9740
CARN              ×2      1592       37.76/0.9590   33.52/0.9166   32.09/0.8978   31.92/0.9256   38.36/0.9765
EDSR-baseline     ×2      1370       37.99/0.9604   33.57/0.9175   32.16/0.8994   31.98/0.9272   38.45/0.9770
SRMDNF            ×2      1513       37.79/0.9600   33.32/0.9150   32.05/0.8980   31.33/0.9200   —
SeaNet-baseline   ×2      2102       37.99/0.9607   33.60/0.9174   32.18/0.8995   32.08/0.9276   38.48/0.9768
Cross-SRN         ×2      1296       38.03/0.9606   33.62/0.9180   32.19/0.8997   32.28/0.9290   38.75/0.9773
DAN (this study)  ×2      1298       38.04/0.9608   33.64/0.9180   32.20/0.8998   32.26/0.9296   38.72/0.9773
Bicubic           ×3      —          30.93/0.8682   27.55/0.7742   27.21/0.7385   24.46/0.7349   26.95/0.8556
IMDN              ×3      703        34.36/0.9270   30.32/0.8417   29.09/0.8046   28.17/0.8519   33.61/0.9445
MemNet            ×3      677        34.09/0.9248   30.00/0.8350   28.96/0.8001   27.56/0.8376   32.51/0.9369
CARN              ×3      1592       34.29/0.9255   30.29/0.8407   29.06/0.8034   28.06/0.8493   33.50/0.9440
EDSR-baseline     ×3      1500       34.37/0.9270   30.28/0.8417   29.09/0.8052   28.15/0.8527   33.49/0.9438
SRMDNF            ×3      1530       34.12/0.9250   30.04/0.8370   28.97/0.8030   27.57/0.8400   —
SeaNet-baseline   ×3      2471       34.36/0.9280   30.34/0.8428   29.09/0.8053   28.17/0.8527   33.40/0.9444
Cross-SRN         ×3      1296       34.43/0.9275   30.33/0.8417   29.09/0.8050   28.23/0.8535   33.65/0.9448
DAN (this study)  ×3      1326       34.42/0.9274   30.38/0.8429   29.10/0.8052   28.24/0.8542   33.63/0.9446
Bicubic           ×4      —          28.42/0.8104   26.00/0.7027   26.96/0.6675   23.14/0.6577   24.80/0.7866
IMDN              ×4      715        32.21/0.8948   28.58/0.7811   27.56/0.7353   26.04/0.7838   30.45/0.9075
MemNet            ×4      677        31.74/0.8893   28.26/0.7723   27.40/0.7281   25.50/0.7630   29.42/0.8942
CARN              ×4      1592       32.13/0.8937   28.60/0.7806   27.58/0.7349   26.07/0.7837   30.47/0.9087
EDSR-baseline     ×4      1500       32.09/0.8938   28.58/0.7813   27.57/0.7357   26.04/0.7849   30.45/0.9082
SRMDNF            ×4      1555       31.96/0.8930   28.35/0.7770   27.49/0.7340   25.68/0.7730   —
SeaNet-baseline   ×4      2397       32.18/0.8948   28.61/0.7822   27.57/0.7359   26.05/0.7896   30.44/0.9088
Cross-SRN         ×4      1296       32.24/0.8954   28.59/0.7817   27.58/0.7364   26.16/0.7881   30.53/0.9081
DAN (this study)  ×4      1337       32.32/0.8962   28.68/0.7841   27.62/0.7381   26.31/0.7936   30.68/0.9106
Tab. 1  Average PSNR and SSIM of different SR algorithms at scale factors 2, 3 and 4
Fig. 7  Visual comparison on the standard test sets at a ×4 scale factor
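Table 1 reports peak signal-to-noise ratio (PSNR, in dB) and SSIM (refs. 28-29). As a reference for how a PSNR value such as those in the table is obtained, the following is a minimal NumPy sketch of PSNR computed from the mean squared error; the cropping and color-space conventions used by the paper (e.g. evaluating on the luminance channel, as is common in SR benchmarks) are not stated on this page and are not modeled here.

```python
import numpy as np


def psnr(sr: np.ndarray, hr: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between a reconstruction and the ground truth.

    PSNR = 10 * log10(MAX^2 / MSE), with both images in the same range [0, max_val].
    """
    mse = np.mean((sr.astype(np.float64) - hr.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10((max_val ** 2) / mse)


# Tiny worked example with an assumed pair of 8-bit images.
hr = np.full((4, 4), 120, dtype=np.uint8)
sr = hr.copy()
sr[0, 0] = 124                        # a single 4-level error
# MSE = 16 / 16 = 1, so PSNR = 10 * log10(255^2) ≈ 48.13 dB
print(round(psnr(sr, hr), 2))
```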
1 SI W, HAN J, YANG Z, et al. Research on key techniques for super-resolution reconstruction of satellite remote sensing images of transmission lines [C]// Journal of Physics: Conference Series. Sanya: ICAACE, 2021: 012092.
2 DEEBA F, KUN S, DHAREJO F A, et al Sparse representation based computed tomography images reconstruction by coupled dictionary learning algorithm[J]. IET Image Processing, 2020, 14 (11): 2365-2375
doi: 10.1049/iet-ipr.2019.1312
3 ZHANG F, LIU N, CHANG L, et al Edge-guided single facial depth map super-resolution using CNN[J]. IET Image Processing, 2020, 14 (17): 4708- 4716
doi: 10.1049/iet-ipr.2019.1623
4 LI W, LIAO W Stable super-resolution limit and smallest singular value of restricted Fourier matrices[J]. Applied and Computational Harmonic Analysis, 2021, 51: 118-156
doi: 10.1016/j.acha.2020.10.004
5 WU Shi-hao, LUO Xiao-hua, ZHANG Jian-wei, et al FPGA-based hardware implementation of new edge-directed interpolation algorithm[J]. Journal of Zhejiang University: Engineering Science, 2018, 52 (11): 2226-2232
6 DUAN Ran, ZHOU Deng-wen, ZHAO Li-juan, et al Image super-resolution reconstruction based on multi-scale feature mapping network[J]. Journal of Zhejiang University: Engineering Science, 2019, 53 (7): 1331-1339
7 DONG C, LOY C C, HE K, et al. Learning a deep convolutional network for image super-resolution [C]// European Conference on Computer Vision. Zurich: ECCV, 2014: 184-199.
8 DONG C, LOY C C, HE K, et al Image super-resolution using deep convolutional networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 38 (2): 295-307
9 LIM B, SON S, KIM H, et al. Enhanced deep residual networks for single image super-resolution [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. Honolulu: CVPRW, 2017: 136-144.
10 TAI Y, YANG J, LIU X, et al. Memnet: a persistent memory network for image restoration [C]// Proceedings of the IEEE International Conference on Computer Vision. Venice: ICCV, 2017: 4539-4547.
11 AHN N, KANG B, SOHN K A. Fast, accurate, and lightweight super-resolution with cascading residual network [C]// Proceedings of the European Conference on Computer Vision. Munich: ECCV, 2018: 252-268.
12 WANG C, LI Z, SHI J. Lightweight image super-resolution with adaptive weighted learning network [EB/OL]. [2019-04-04]. https://arxiv.org/abs/1904.02358.
13 WOO S, PARK J, LEE J Y, et al. Cbam: convolutional block attention module [C]// Proceedings of the European Conference on Computer Vision. Munich: ECCV, 2018: 3-19.
14 DAI T, CAI J, ZHANG Y, et al. Second-order attention network for single image super-resolution [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: CVPR, 2019: 11065-11074.
15 ZHANG Y, LI K, LI K, et al. Residual non-local attention networks for image restoration[EB/OL]. [2019-03-24]. https://arxiv.org/abs/1903.10082.
16 JIA X, BRABANDERE D B, TUYTELAARS T, et al. Dynamic filter networks for predicting unobserved views [C]// Proceedings of the European Conference on Computer Vision 2016 Workshops. Amsterdam: ECCVW, 2016: 1-2.
17 YANG B, BENDER G, LE Q V, et al. Condconv: conditionally parameterized convolutions for efficient inference [C]// Advances in Neural Information Processing Systems. 2019, 32: 767-779.
18 CHEN Y, DAI X, LIU M, et al. Dynamic convolution: attention over convolution kernels [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: CVPR, 2020: 11030-11039.
19 ZHANG Y, ZHANG J, WANG Q, et al. Dynet: dynamic convolution for accelerating convolutional neural networks [EB/OL]. [2020-04-22]. https://arxiv.org/abs/2004.10694.
20 ZHANG Y, LI K, LI K, et al. Image super-resolution using very deep residual channel attention networks [C]// Proceedings of the European Conference on Computer Vision. Munich: ECCV, 2018: 286-301.
21 CHEN H, GU J, ZHANG Z. Attention in attention network for image super-resolution [EB/OL]. [2021-04-19]. https://arxiv.org/abs/2104.09497.
22 TIMOFTE R, AGUSTSSON E, VAN GOOL L, et al. Ntire 2017 challenge on single image super-resolution: methods and results [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. Hawaii: CVPRW, 2017: 114-125.
23 BEVILACQUA M, ROUMY A, GUILLEMOT C, et al. Low-complexity single-image super-resolution based on nonnegative neighbor embedding [C]// Proceedings British Machine Vision Conference. Surrey: Springer, 2012: 1-10.
24 ZEYDE R, ELAD M, PROTTER M. On single image scale-up using sparse-representations [C]// International Conference on Curves and Surfaces. Avignon: ICCS, 2010: 711-730.
25 MARTIN D, FOWLKES C, TAL D, et al. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics [C]// Proceedings 18th IEEE International Conference on Computer Vision. Vancouver: ICCV, 2001: 416-423.
26 HUANG J B, SINGH A, AHUJA N. Single image super-resolution from transformed self-exemplars [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Santiago: IEEE, 2015: 5197-5206.
27 MATSUI Y, ITO K, ARAMAKI Y, et al Sketch-based manga retrieval using manga109 dataset[J]. Multimedia Tools and Applications, 2017, 76 (20): 21811-21838
doi: 10.1007/s11042-016-4020-z
28 FEI Y, LIAN F H, YAN Y. An improved PSNR algorithm for objective video quality evaluation [C]// 2007 Chinese Control Conference. Zhangjiajie: CCC, 2007: 376-380.
29 WANG Z, BOVIK A C, SHEIKH H R, et al Image quality assessment: from error visibility to structural similarity[J]. IEEE Transactions on Image Processing, 2004, 13 (4): 600-612
doi: 10.1109/TIP.2003.819861
30 KINGMA D P, BA J. Adam: a method for stochastic optimization [EB/OL]. [2014-12-22]. https://arxiv.org/abs/1412.6980.
31 HUI Z, GAO X, YANG Y, et al. Lightweight image super-resolution with information multi-distillation network [C]// Proceedings of the 27th ACM International Conference on Multimedia. Ottawa: ACM, 2019: 2024-2032.
32 FANG F, LI J, ZENG T Soft-edge assisted network for single image super-resolution[J]. IEEE Transactions on Image Processing, 2020, 29: 4656-4668
doi: 10.1109/TIP.2020.2973769
33 LIU Y, JIA Q, FAN X, et al Cross-srn: structure-preserving super-resolution network with cross convolution[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2021, 32 (8): 4927-4939