Optimized ORB-SLAM3 algorithm incorporating YOLOv11n object detection for dynamic scenes
Zhangyu XIE1, Jie YANG2,*, Siyuan OUYANG1, Yangjian ZENG1
1. School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou 341000, China
2. School of Electrical Engineering, Shanghai Dianji University, Shanghai 201306, China
Abstract To address the low positioning accuracy and poor robustness of traditional visual simultaneous localization and mapping (SLAM) techniques in dynamic environments, an optimized ORB-SLAM3 algorithm incorporating YOLOv11n object detection was proposed. A YOLOv11n inference network based on the Open Neural Network Exchange (ONNX) format was integrated into the traditional system to augment semantic information. Initial poses were generated using feature points from static regions, and map points were then projected onto dynamic regions. A two-stage pose optimization algorithm was incorporated to retain static feature points and eliminate dynamic ones within dynamic regions, thereby improving pose estimation accuracy and increasing the number of high-quality feature points. An additional thread was introduced beyond the original three, using pixels from keyframe regions to construct dense maps and providing rich environmental perception and understanding for subsequent human-computer interaction scenarios. Experimental results on the publicly available TUM dataset demonstrate that the proposed algorithm improves pose estimation accuracy by up to 98.3% relative to the baseline models. The proposed algorithm effectively mitigates the impact of dynamic objects on pose estimation while satisfying the requirements of dense map construction.
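The first step of the pipeline described above can be illustrated with a small sketch: running the YOLOv11n detector through ONNX Runtime to mark dynamic regions, then keeping only the ORB feature points that lie outside them so that static-region features seed the initial pose estimate. This is a minimal illustration under assumptions, not the authors' implementation: the model file name, the choice of the COCO "person" class as the dynamic category, and the (1, 84, 8400) output layout of a standard Ultralytics ONNX export are all assumed, and the paper's two-stage pose optimization and dense-mapping thread are not reproduced here.

```python
# Minimal sketch (not the authors' implementation): use a YOLOv11n ONNX model with
# ONNX Runtime to detect "dynamic" objects (here, the COCO 'person' class) and discard
# ORB feature points that fall inside the detected boxes. Model path, dynamic-class
# choice, and the (1, 4 + nc, anchors) output layout are assumptions about a standard
# Ultralytics export.
import cv2
import numpy as np
import onnxruntime as ort

DYNAMIC_CLASS_IDS = {0}          # COCO id 0 = person; extend as needed (assumption)
CONF_THRESHOLD = 0.5
INPUT_SIZE = 640

session = ort.InferenceSession("yolo11n.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name


def detect_dynamic_boxes(bgr_frame):
    """Return [(x1, y1, x2, y2), ...] boxes of dynamic objects in original-image pixels."""
    h, w = bgr_frame.shape[:2]
    blob = cv2.resize(bgr_frame, (INPUT_SIZE, INPUT_SIZE))
    blob = cv2.cvtColor(blob, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    blob = np.transpose(blob, (2, 0, 1))[None]           # NCHW, shape (1, 3, 640, 640)

    # Assumed export layout: (1, 4 + num_classes, num_anchors) -> (num_anchors, 4 + nc)
    pred = session.run(None, {input_name: blob})[0][0].T

    boxes = []
    for row in pred:
        cx, cy, bw, bh = row[:4]
        scores = row[4:]
        cls_id = int(np.argmax(scores))
        if cls_id in DYNAMIC_CLASS_IDS and scores[cls_id] > CONF_THRESHOLD:
            sx, sy = w / INPUT_SIZE, h / INPUT_SIZE       # map back to original resolution
            boxes.append(((cx - bw / 2) * sx, (cy - bh / 2) * sy,
                          (cx + bw / 2) * sx, (cy + bh / 2) * sy))
    return boxes


def static_keypoints(keypoints, dynamic_boxes):
    """Keep only ORB keypoints lying outside every dynamic bounding box."""
    def inside(pt, box):
        x1, y1, x2, y2 = box
        return x1 <= pt[0] <= x2 and y1 <= pt[1] <= y2
    return [kp for kp in keypoints
            if not any(inside(kp.pt, b) for b in dynamic_boxes)]


if __name__ == "__main__":
    frame = cv2.imread("rgb_frame.png")                   # e.g. a TUM RGB-D color frame
    orb = cv2.ORB_create(nfeatures=1000)
    kps = orb.detect(frame, None)
    boxes = detect_dynamic_boxes(frame)
    print(f"{len(kps)} keypoints detected, {len(static_keypoints(kps, boxes))} kept as static")
```

In the proposed system this kind of filtering precedes the two-stage pose optimization: the retained static-region keypoints generate the initial pose, after which map points are projected into the detected dynamic regions for further screening of static versus dynamic features.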
Received: 14 July 2025
Published: 03 February 2026
Fund: National Key Research and Development Program of China (2024YFB4303203-5).
Corresponding author:
Jie YANG
E-mail: 2630777181@qq.com; yangjie@jxust.edu.cn
Keywords:
ORB-SLAM3,
Open Neural Network Exchange (ONNX),
YOLOv11n,
two-stage pose optimization algorithm,
dense map reconstruction