Journal of Zhejiang University (Engineering Science), 2022, Vol. 56, Issue 1: 63-74    DOI: 10.3785/j.issn.1008-973X.2022.01.007
Computer Technology, Information and Electronic Engineering
Visible and infrared image matching method based on generative adversarial model
Tong CHEN1, Jian-feng GUO1, Xin-zhong HAN2, Xue-li XIE1, Jian-xiang XI1,*
1. Department of Missile Engineering, Rocket Force University of Engineering, Xi’an 710000, China
2. The 96901 Unit of the Chinese People’s Liberation Army, Beijing 100094, China
Abstract:

A visible-infrared image matching method based on a generative adversarial model was proposed to address the large modal difference, high matching difficulty, and poor robustness of existing multi-sensor image matching methods, combining the style-transfer idea of generative adversarial networks (GANs) with the local feature extraction capability of traditional methods. A loss-function calculation path was added and a new loss function was constructed according to the style-transfer idea of the GAN, improving the translation effect of the model on multi-sensor images. The feature information of the translated, now same-modality images was extracted with the SIFT algorithm, and the position and scale of the candidate matching points were determined. Feature matching and similarity measurement between the two images were then completed indirectly according to the matching strategy. Experiments were conducted on a real aerial dataset. Results show that the proposed method can effectively handle multi-modal data and reduce the difficulty of multi-sensor image matching, providing a new solution for the multi-modal image matching problem.

Key words: aerial image processing    multi-sensor image matching    deep learning    generative adversarial network (GAN)    style transfer
Received: 2021-06-19    Published: 2022-01-05
CLC:  TP 391  
Funding: Supported by the National Natural Science Foundation of China (62176263, 62103434), the Science Fund for Distinguished Young Scholars of Shaanxi Province (2021JC-35), and the China Postdoctoral Science Foundation Special Funded Project (2021T140790)
Corresponding author: Jian-xiang XI     E-mail: chentong00817@163.com; xijx07@mails.tsinghua.edu.cn
About the author: Tong CHEN (b. 1997), female, master's degree candidate, engaged in image processing research. orcid.org/0000-0001-6804-8578. E-mail: chentong00817@163.com

Cite this article:

Tong CHEN, Jian-feng GUO, Xin-zhong HAN, Xue-li XIE, Jian-xiang XI. Visible and infrared image matching method based on generative adversarial model. Journal of Zhejiang University (Engineering Science), 2022, 56(1): 63-74.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2022.01.007        https://www.zjujournals.com/eng/CN/Y2022/V56/I1/63

Fig. 1  Structure of the cycle-consistency loop
Fig. 2  Schematic of the generator and discriminator structures in the CycleGAN model
Fig. 3  Schematic of the loss function relationships
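For readers who want the loss structure behind Fig. 3 made concrete, below is a minimal PyTorch sketch of a CycleGAN-style generator objective with an added identity term (the CycleGAN + L_identity variant evaluated in Table 9). The way the weights α and β from Table 2 enter the sum follows the common CycleGAN convention and is an assumption, not the paper's stated formula.

```python
import torch
import torch.nn as nn

adv_criterion = nn.MSELoss()  # least-squares GAN loss, standard in CycleGAN
l1_criterion = nn.L1Loss()    # cycle-consistency and identity terms

def generator_objective(G, F, D_ir, real_vis, real_ir, alpha=30.0, beta=0.5):
    """Sketch of one direction (visible -> infrared) of the generator loss.

    G: visible->infrared generator, F: infrared->visible generator,
    D_ir: discriminator on the infrared domain. alpha weights the cycle
    term and beta the identity term, mirroring Table 2 (assumed roles).
    """
    fake_ir = G(real_vis)
    pred = D_ir(fake_ir)
    # Adversarial term: translated images should be scored as real.
    loss_adv = adv_criterion(pred, torch.ones_like(pred))
    # Cycle consistency: vis -> ir -> vis should reproduce the input.
    loss_cyc = l1_criterion(F(fake_ir), real_vis)
    # Identity: a real infrared input should pass through G unchanged.
    loss_idt = l1_criterion(G(real_ir), real_ir)
    return loss_adv + alpha * loss_cyc + beta * alpha * loss_idt
```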
Correlation coefficient    Correlation strength
0.8~1.0    Very strong correlation
0.6~0.8    Strong correlation
0.4~0.6    Moderate correlation
0.2~0.4    Weak correlation
0~0.2      Very weak correlation
Table 1  Correspondence between correlation coefficient and correlation strength
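Table 1 is used to read the Pearson correlation coefficients (PCCs) reported later (Fig. 5, Tables 3 and 4). Below is a minimal NumPy sketch of a PCC between two equally sized grayscale images, assuming the straightforward pixel-wise definition; the paper does not spell out its exact computation.

```python
import numpy as np

def pearson_cc(img_a: np.ndarray, img_b: np.ndarray) -> float:
    """Pearson correlation coefficient between two equally sized images."""
    a = img_a.astype(np.float64).ravel()
    b = img_b.astype(np.float64).ravel()
    a -= a.mean()
    b -= b.mean()
    # Result lies in [-1, 1]; Table 1 maps its magnitude to a strength.
    return float((a @ b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```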
Fig. 4  Flowchart of the visible-infrared image matching method based on the generative adversarial model
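The flow in Fig. 4 reduces to: translate the visible image into the infrared domain with the trained generator, then match two same-modality images with a classical detector. A sketch under that reading; the tensor layout and the [-1, 1] output range of the generator are assumptions, and `match_fn` stands for a SIFT-based routine such as the one sketched after Table 6.

```python
import numpy as np
import torch

def match_visible_to_infrared(G, visible: torch.Tensor,
                              infrared: np.ndarray, match_fn):
    """Translate a visible image with generator G, then match conventionally.

    visible: 1xHxW grayscale tensor scaled to [-1, 1] (assumed convention);
    infrared: HxW uint8 image; match_fn: a same-modality matcher.
    """
    G.eval()
    with torch.no_grad():
        fake_ir = G(visible.unsqueeze(0))[0]          # vis -> pseudo-infrared
    # Back to an 8-bit image for the classical feature matcher.
    fake_ir_u8 = ((fake_ir.clamp(-1, 1) + 1) * 127.5).to(torch.uint8)
    return match_fn(fake_ir_u8.squeeze(0).cpu().numpy(), infrared)
```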
Parameter    Value
b    1
e    100
r    0.0002
Optimizer    Adam
Image size    256×256
Normalization    Instance normalization
α    30
β    0.5
Table 2  Basic parameter settings of the model
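Read literally, Table 2 maps onto a standard CycleGAN training setup. A sketch assuming b, e, and r denote batch size, number of epochs, and learning rate (a common reading of these symbols, not stated in the table); the Adam betas below are the usual CycleGAN defaults, which the table also does not state.

```python
import torch.nn as nn
from torch.optim import Adam

BATCH_SIZE, EPOCHS, LR = 1, 100, 2e-4  # b, e, r from Table 2 (assumed roles)
ALPHA, BETA = 30.0, 0.5                # loss weights used in the sketch above
IMAGE_SIZE = 256

def conv_block(c_in: int, c_out: int) -> nn.Sequential:
    """Generator building block with instance normalization, per Table 2."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, stride=1, padding=1),
        nn.InstanceNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

def make_optimizer(model: nn.Module) -> Adam:
    # beta1 = 0.5 is the usual CycleGAN choice; Table 2's beta may instead
    # refer to this Adam parameter rather than the identity-loss weight.
    return Adam(model.parameters(), lr=LR, betas=(0.5, 0.999))
```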
Fig. 5  PCC trends under different weights
Fig. 6  Training samples in different scenes
Fig. 7  Loss curves of the trained model
Table 3  Training result images under different numbers of epochs
α    e    PCCs (Scene I)    PCCs (Scene II)
10    30     0.1379    −0.0329
10    50     0.2067    0.1561
10    100    0.1825    0.2593
30    30     0.2036    0.2622
30    50     0.3611    0.4506
30    100    0.5601    0.6039
Table 4  PCCs for the two scenes under different numbers of epochs
Category    Item    Configuration
Hardware    CPU    Intel Core i9-9980XE
Hardware    Storage    Intel 1 TB M.2 SSD, 4 TB
Hardware    RAM    64 GB (3000 MHz)
Hardware    GPU    RTX 2080 Ti (11 GB) × 2
Software    Operating system    Windows 10 (64-bit)
Software    Deep learning framework    PyCharm 2020 + PyTorch 1.6
Table 5  Experimental environment for visible-infrared image translation and matching
Fig. 8  Translation results for real aerial images
Fig. 9  Results of direct visible-infrared image matching
Scene    T/s    K1    K2    P    R_P/%
Scene 1    0.797    152    277    5     2.33
Scene 2    0.954    149    234    13    6.79
Scene 3    1.031    320    204    11    4.20
Table 6  Numerical results for three scene images under the SIFT algorithm
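For orientation, here is a sketch of how numbers like those in Tables 6-8 could be produced with OpenCV's SIFT, Lowe's ratio test, and a RANSAC check. The inlier count is an assumed stand-in for the paper's correct-match count P, and the success rate R_P = 2P/(K1 + K2) is inferred from the tabulated values rather than stated in this excerpt.

```python
import cv2
import numpy as np

def sift_match_stats(img1: np.ndarray, img2: np.ndarray):
    """Return (K1, K2, P, R_P) for two grayscale uint8 images."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)
    K1, K2 = len(kp1), len(kp2)

    # Lowe's ratio test on 2-nearest-neighbor matches.
    knn = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
    good = [m for m, n in knn if m.distance < 0.75 * n.distance]
    if len(good) < 4:                        # homography needs >= 4 pairs
        return K1, K2, 0, 0.0

    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    _, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    P = int(mask.sum()) if mask is not None else 0  # assumed proxy for P
    return K1, K2, P, 100.0 * 2 * P / (K1 + K2)     # inferred R_P formula
```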
Fig. 10  Matching results after image translation under different algorithms
Algorithm    Scene    T/s    K1    K2    P    R_P/%
SIFT 1 1.376 451 444 193 43.13
SIFT 2 1.028 471 416 174 39.23
SIFT 3 1.387 520 524 150 28.74
SURF 1 0.849 563 563 226 40.14
SURF 2 0.900 530 512 213 40.88
SURF 3 1.879 476 473 112 23.60
ORB 1 0.192 465 619 210 38.75
ORB 2 0.145 486 346 106 25.48
ORB 3 0.276 521 520 115 22.09
Table 7  Comparison results for three groups of scene images under different algorithms
Fig. 11  Matching results for three groups of scenes in different time periods
Scene    T/s    K1    K2    P    R_P/%
1 0.846 416 426 163 38.72
2 1.277 408 353 156 40.99
3 1.105 420 425 96 22.72
Table 8  Comparison of matching results for three groups of scene images in different time periods
Model    PSNR/dB    SSIM    NRMSE    IE
Real images    8.91    0.12    0.58    7.01
CycleGAN    16.83    0.47    0.31    7.03
CycleGAN + L_identity    22.48    0.63    0.18    7.04
Table 9  Comparison of performance metrics for different models
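A sketch of how the four columns of Table 9 could be computed between a generated infrared image and a reference, using scikit-image for PSNR, SSIM, and NRMSE; treating IE as the Shannon entropy of the gray-level histogram is an assumption about the paper's definition.

```python
import numpy as np
from skimage.metrics import (normalized_root_mse, peak_signal_noise_ratio,
                             structural_similarity)

def quality_metrics(generated: np.ndarray, reference: np.ndarray):
    """PSNR/dB, SSIM, NRMSE of generated vs. reference (uint8, HxW), plus IE."""
    psnr = peak_signal_noise_ratio(reference, generated, data_range=255)
    ssim = structural_similarity(reference, generated, data_range=255)
    nrmse = normalized_root_mse(reference, generated)
    # Information entropy of the generated image's gray-level distribution.
    hist, _ = np.histogram(generated, bins=256, range=(0, 256))
    p = hist / hist.sum()
    ie = float(-(p[p > 0] * np.log2(p[p > 0])).sum())
    return psnr, ssim, nrmse, ie
```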