Journal of Zhejiang University (Engineering Science)
Telecommunication Technology
Incremental large scale dense semantic mapping
JIANG Wen-ting1,2, GONG Xiao-jin1,2, LIU Ji-lin1,2
1. Department of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China; 2. Zhejiang Provincial Key Laboratory of Information Network Technology, Hangzhou 310027, China
Abstract:

To achieve accurate large-scale scene understanding efficiently, a dense semantic mapping method based on incremental computation under a conditional random field (CRF) model was proposed. The method used stereo visual odometry to estimate the camera motion and built the semantic map from labeled image sequences. The key step was the incremental construction of the map: the voxels newly added by each densified input frame relative to the previous frame were detected, the 3D points inside these voxels were over-segmented into supervoxels, the supervoxels were labeled under the guidance of the labeling results of neighboring frames, and the newly labeled points were fused into the already built map frame by frame through the rigid transformation. The CRF took the temporal prior labeling information of sequential frames as the data term, defined the smoothness term from the labeling coherence between adjacent supervoxels, and was solved for the labels of the new supervoxels by graph cuts. Experimental evaluations show that the approach obtains an accurate large-scale semantic map, effectively reduces the processing of redundant points, and improves the labeling results at the image level.
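The CRF labeling step described above can be sketched in code. The following is a minimal illustration only, not the authors' implementation: voxel bookkeeping, supervoxel construction, and the graph-cut solver are omitted, a simple iterated-conditional-modes sweep stands in for graph cuts, and all names (`label_supervoxels`, `unary`, `edges`, `lam`) are hypothetical.

```python
import numpy as np

def label_supervoxels(unary, edges, lam=1.0, iters=10):
    """Assign a label to each supervoxel by minimizing a CRF energy

        E(x) = sum_i unary[i, x_i] + lam * sum_{(i,j) in edges} [x_i != x_j]

    unary[i, l] : data term, e.g. negative log frequency of label l among
                  the votes supervoxel i received from neighboring frames.
    edges       : (i, j) adjacency pairs between supervoxels.
    lam         : weight of the Potts smoothness term.

    The paper solves the model by graph cuts; here a plain ICM sweep is
    used as a stand-in so the sketch stays self-contained.
    """
    n, k = unary.shape
    labels = unary.argmin(axis=1)          # initialize from the data term
    nbrs = [[] for _ in range(n)]
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    for _ in range(iters):
        changed = False
        for i in range(n):
            cost = unary[i].copy()
            for j in nbrs[i]:              # Potts penalty for disagreeing
                cost += lam * (np.arange(k) != labels[j])
            best = int(cost.argmin())
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:                    # converged
            break
    return labels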

Published: 2016-02-01
CLC number: TP 242.6
Funding:

Supported by the National Natural Science Foundation of China (Young Scientists Fund, 61001171) and the National High-Tech R&D ("863") Program of China (2014AA09A510).

Corresponding author: GONG Xiao-jin, female, associate professor. ORCID: 0000-0001-9955-3569. E-mail: gongxj@zju.edu.cn
About the first author: JIANG Wen-ting (born 1990), female, master's student, engaged in research on large-scale scene understanding. ORCID: 0000-0002-7261-0532. E-mail: 3090103585@zju.edu.cn

Cite this article:

JIANG Wen-ting, GONG Xiao-jin, LIU Ji-lin. Incremental large scale dense semantic mapping [J]. Journal of Zhejiang University (Engineering Science), 2016, 50(2): 385. DOI: 10.3785/j.issn.1008-973X.2016.02.026.

Link to this article:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2016.02.026
http://www.zjujournals.com/eng/CN/Y2016/V50/I2/385
