Dynamic 3D reconstruction method using binocular vision and improved YOLOv8

doi:10.3785/j.issn.1008-973X.2025.07.012

Journal of Zhejiang University (Engineering Science)

2025, Vol. 59, Issue (7): 1443-1450

Computer Technology and Control Engineering

Jingyao HE (何婧瑶)1, Pengfei LI (李鹏飞)1,*, Chengzhi WANG (汪承志)1, Zhenming LV (吕振鸣)2, Ping MU (牟萍)1

1. College of River and Ocean Engineering, Chongqing Jiaotong University, Chongqing 400074, China
2. School of Mechatronics and Vehicle Engineering, Chongqing Jiaotong University, Chongqing 400074, China


Abstract:

A dynamic 3D reconstruction technique for construction sites was proposed to support safety and efficiency during construction. A binocular camera was deployed to scan the site in 3D, yielding the static model base and the activity trajectories of targets. An attentional scale sequence fusion (ASF) module was introduced into the YOLOv8 model to form the YOLOv8-ASF framework, which improved the accuracy and performance of the detection model and addressed pain points such as target occlusion and target loss. An improved semi-global block matching (SGBM) algorithm was combined with YOLOv8-ASF to form the YOLOv8-ASF-SGBM algorithm, which achieved near-real-time target recognition and localization from 2D images. The recovered depth information was used to project the trajectories of dynamic elements in 3D onto the model base, enabling near-real-time, full-view monitoring of the real construction site. Experimental results show that the proposed technique reproduces the motion trajectories of dynamic construction elements in 3D with high precision, with a relative error of less than 5% with respect to the true trajectories. The technique thus achieves high-precision, full-view 3D monitoring from 2D image and video information, and has good application prospects and engineering value.

Key words: dynamic 3D reconstruction; YOLOv8-attentional scale sequence fusion (ASF); semi-global block matching (SGBM) algorithm; target occlusion; binocular vision
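
The localization step summarized in the abstract pairs a 2D detection with stereo depth. The following is a minimal sketch of that idea, assuming a rectified pinhole stereo pair and the standard triangulation relation Z = f·B/d. It is not the authors' YOLOv8-ASF-SGBM implementation: the function names, the box format (x1, y1, x2, y2) and all numeric values are hypothetical, and the disparity map is a stand-in for whatever a stereo matcher such as SGBM would produce.

```python
from statistics import median


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Standard stereo triangulation for a rectified pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def locate_target(box, disparity_map, focal_px, baseline_m, cx, cy):
    """Back-project the centre of a detection box to camera coordinates.

    box is (x1, y1, x2, y2) in pixels (hypothetical detector output);
    disparity_map is a list of rows of disparities in pixels;
    (cx, cy) is the principal point. Returns (X, Y, Z) in metres or None.
    """
    x1, y1, x2, y2 = box
    # Keep only valid (positive) disparities inside the box; the median
    # tolerates holes and a few mismatched pixels.
    values = [d for row in disparity_map[y1:y2] for d in row[x1:x2] if d > 0]
    if not values:
        return None  # no usable depth inside the box
    z = depth_from_disparity(median(values), focal_px, baseline_m)
    u, v = (x1 + x2) / 2, (y1 + y2) / 2  # box centre in pixels
    return ((u - cx) * z / focal_px, (v - cy) * z / focal_px, z)


def relative_error(estimated, reference):
    """Relative error of an estimate against a reference measurement."""
    return abs(estimated - reference) / abs(reference)


if __name__ == "__main__":
    # Illustrative values only: 800 px focal length, 0.1 m baseline.
    disparity = [[20.0] * 4 for _ in range(4)]
    print(locate_target((0, 0, 4, 4), disparity, 800.0, 0.1, 2.0, 2.0))
    print(relative_error(4.1, 4.0))
```

Applying `locate_target` to the detection in each frame gives a sequence of 3D points, and `relative_error` is the kind of comparison against a ground-truth measurement that a bound such as the reported 5% refers to.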

Received: 2024-05-31 Published: 2025-07-25

CLC: TU 17

Funding: National Natural Science Foundation of China (52379115); Natural Science Foundation of Chongqing (CSTB2022NSCQ-MSX0509).

Corresponding author: Pengfei LI E-mail: 622220960055@mails.cqjtu.edu.cn;lipengfei@cqjtu.edu.cn

About the author: Jingyao HE (born 2000), female, master's student, working on smart water conservancy and artificial intelligence. orcid.org/0009-0006-8543-5266. E-mail: 622220960055@mails.cqjtu.edu.cn


Cite this article:

Jingyao HE, Pengfei LI, Chengzhi WANG, Zhenming LV, Ping MU. Dynamic 3D reconstruction method using binocular vision and improved YOLOv8. Journal of Zhejiang University (Engineering Science), 2025, 59(7): 1443-1450.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2025.07.012 or https://www.zjujournals.com/eng/CN/Y2025/V59/I7/1443

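Fig. 1 Framework of the dynamic 3D reconstruction technique

Fig. 2 Simulated scene with the layout of dynamic and static elements on a construction site

Fig. 3 Principle of binocular camera calibration

Fig. 4 Reconstruction of the simulated scene with dynamic and static elements on a construction site

Tab. 1 Proportional error between the real size and the point-cloud size of dynamic and static elements

Fig. 5 Framework of the YOLOv8-attentional scale sequence fusion model

Fig. 6 Visualization of target recognition results before and after improving YOLOv8 with attentional scale sequence fusion

Fig. 7 Comparison of object detection performance of different YOLOv8 models

Fig. 8 Principle of binocular ranging

Fig. 9 Recognition and localization of an excavator by the improved model in the simulated scene

Fig. 10 Feature-point matching accuracy of different algorithms in four simulated scenes

Fig. 11 Dynamic reconstruction of the excavator's trajectory in the simulated scene

Fig. 12 Accuracy of the dynamic reconstruction of the excavator's trajectory in the simulated scene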


