JIANG Wen-ting (1,2), GONG Xiao-jin (1,2), LIU Ji-lin (1,2)
1. Department of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China; 2. Zhejiang Provincial Key Laboratory of Information Network Technology, Hangzhou 310027, China
To obtain accurate large-scale scene understanding efficiently, a new large-scale dense semantic mapping system was proposed. The system constructed the map incrementally, performing inference with a conditional random field (CRF) model at each step. Stereo visual odometry was used to estimate the camera motion, and labeled image sequences were used to build the semantic map. The key point was the incremental construction of the semantic map: the method detected newly built voxels, over-segmented the points within these voxels into supervoxels, labeled these supervoxels under the guidance of neighboring frames, and used the rigid transformation matrix to fuse the newly labeled points with the already built map. The CRF model took the labeling results of sequential frames as the data term and the coherent labeling constraint between neighboring supervoxels as the pairwise term, and was solved by graph cut. Experimental evaluations show that the approach yields an accurate large-scale semantic map at reduced computational cost, and that it can also improve the labeling results at the image level.
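The energy described in the abstract (a data term from per-frame labels plus a pairwise coherence term between neighboring supervoxels) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the data cost is taken to be the fraction of frames that disagree with a label, the pairwise term is a constant Potts penalty, and the minimizer is iterated conditional modes (ICM), a simple local search swapped in for the graph cut solver the authors use. The names `data_cost`, `energy`, and `icm` are hypothetical.

```python
from collections import Counter

def data_cost(frame_labels, labels):
    """Assumed data term: fraction of frames that disagree with each label."""
    counts = Counter(frame_labels)
    n = len(frame_labels)
    return {l: 1.0 - counts[l] / n for l in labels}

def energy(assignment, unary, edges, weight):
    """Total energy: sum of data costs plus a Potts penalty per disagreeing edge."""
    e = sum(unary[i][assignment[i]] for i in range(len(assignment)))
    e += sum(weight for i, j in edges if assignment[i] != assignment[j])
    return e

def icm(unary, edges, labels, weight, max_sweeps=10):
    """Iterated conditional modes: a local minimizer, NOT the paper's graph cut."""
    n = len(unary)
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    # Start from the label with the lowest data cost for each supervoxel.
    assignment = [min(labels, key=lambda l: unary[i][l]) for i in range(n)]
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            def local(l):
                # Cost of giving supervoxel i label l, neighbors held fixed.
                disagree = sum(assignment[j] != l for j in neighbors[i])
                return unary[i][l] + weight * disagree
            best = min(labels, key=local)
            if local(best) < local(assignment[i]):
                assignment[i] = best
                changed = True
        if not changed:
            break
    return assignment

# Toy example: three supervoxels in a chain, each seen in three frames.
labels = ["road", "car"]
votes = [["road", "road", "road"],
         ["car", "car", "road"],
         ["road", "road", "road"]]
edges = [(0, 1), (1, 2)]
unary = [data_cost(v, labels) for v in votes]
initial = [min(labels, key=lambda l: u[l]) for u in unary]
result = icm(unary, edges, labels, weight=0.5)
```

In this toy case the middle supervoxel is voted "car" by two of three frames, but both of its neighbors are unanimously "road", so the coherence term outweighs the per-frame votes and the smoothed labeling assigns it "road". A graph cut solver would find the global optimum for two labels, whereas ICM only guarantees a local one.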
JIANG Wen-ting, GONG Xiao-jin, LIU Ji-lin. Incremental large-scale dense semantic mapping. Journal of Zhejiang University (Engineering Science), 2016, 50(2): 385-391.