JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE)  2019, Vol. 53 Issue (1): 115-125    DOI: 10.3785/j.issn.1008-973X.2019.01.013
Computer Technology     
Image restoration and fault tolerance of stereo SLAM based on generative adversarial net
WANG Kai, YUE Bo-xuan, FU Jun-wei, LIANG Jun
College of Control Science and Engineering, Zhejiang University, Hangzhou 310058, China

Abstract  

The classical Pix2Pix image generation network was modified to improve the fault tolerance of simultaneous localization and mapping (SLAM) systems. Three improvements were added step by step: a depth estimation network whose depth information serves as an additional input, an image reconstruction loss based on a spatial transformer network (STN), and an image inpainting loss based on an image inpainting network. By mining and fusing multiple kinds of information through the coupling between stereo images, the model made fuller use of the available information and generated better images. Generative adversarial network (GAN) techniques were then combined with SLAM, so that fault tolerance was realized directly at the sensing level. Experiments on the KITTI and Cityscapes datasets verified the effectiveness of the improved model. The generated images and the original images were both fed into a stereo SLAM system, and the results showed that the fault tolerance idea is feasible.
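As a rough illustration of how the loss terms named in the abstract could be combined, the sketch below adds a reconstruction term and an inpainting term to a Pix2Pix-style generator objective (adversarial loss plus a weighted L1 loss). It is only a sketch under stated assumptions, not the authors' implementation: the function names, the use of a plain L1 distance for the two added terms, and the weights `lambda_rec` and `lambda_inp` are hypothetical, and the images are stand-in NumPy arrays rather than network outputs. Only the L1 weight of 100 follows the original Pix2Pix setting.

```python
import numpy as np


def l1_loss(pred, target):
    """Mean absolute error between two images of the same shape."""
    return float(np.mean(np.abs(pred - target)))


def adversarial_loss(d_fake_prob, eps=1e-8):
    """Non-saturating generator loss, -log D(x, G(x)), averaged over patches."""
    return float(-np.mean(np.log(d_fake_prob + eps)))


def generator_loss(d_fake_prob, generated, target, reconstructed, inpainted,
                   lambda_l1=100.0, lambda_rec=1.0, lambda_inp=1.0):
    """Weighted sum of the four generator loss terms.

    d_fake_prob   : discriminator probabilities for the generated image
    generated     : image produced by the generator
    target        : ground-truth image of the missing view
    reconstructed : image re-synthesised by a warping (STN-like) step (assumed)
    inpainted     : image completed by an inpainting network (assumed)
    The lambda_rec and lambda_inp weights are hypothetical placeholders.
    """
    adv = adversarial_loss(d_fake_prob)
    pix = l1_loss(generated, target)
    rec = l1_loss(reconstructed, target)
    inp = l1_loss(inpainted, target)
    return adv + lambda_l1 * pix + lambda_rec * rec + lambda_inp * inp


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.random((8, 8, 3))
    generated = target + 0.1          # stand-in for a generator output
    reconstructed = target + 0.05     # stand-in for a warped reconstruction
    inpainted = target + 0.02         # stand-in for an inpainted image
    d_fake_prob = np.full((2, 2), 0.5)
    print(generator_loss(d_fake_prob, generated, target, reconstructed, inpainted))
```

In a real training loop each term would be computed on tensors inside an automatic differentiation framework; the weighted sum is the only part this sketch is meant to show.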



Received: 10 January 2018      Published: 07 January 2019
CLC:  TP183  
Cite this article:

WANG Kai, YUE Bo-xuan, FU Jun-wei, LIANG Jun. Image restoration and fault tolerance of stereo SLAM based on generative adversarial net. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2019, 53(1): 115-125.

URL:

http://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2019.01.013     OR     http://www.zjujournals.com/eng/Y2019/V53/I1/115


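Image restoration and SLAM fault tolerance based on generative adversarial networks

To improve the fault tolerance of simultaneous localization and mapping (SLAM) systems, three improvements were added step by step to the classical image generation network Pix2Pix: a depth estimation network with depth information as an input, an image reconstruction loss based on the STN network, and an image completion loss based on an image inpainting network. By exploiting the coupling between stereo images, multiple kinds of information were mined and fused, which increased information utilization and improved the image generation quality of the model. The combination of generative adversarial network (GAN) techniques with the SLAM fault tolerance scenario was proposed, directly realizing fault tolerance at the perception end. Experiments on the KITTI and Cityscapes datasets verified the effectiveness of the improved model. The images generated by the model were used for reconstruction in a stereo vision system, which verified the feasibility of the fault tolerance idea.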

