Journal of Zhejiang University (Engineering Science)  2025, Vol. 59 Issue (7): 1443-1450    DOI: 10.3785/j.issn.1008-973X.2025.07.012
Computer Technology and Control Engineering
Dynamic 3D reconstruction method using binocular vision and improved YOLOv8
Jingyao HE1, Pengfei LI1,*, Chengzhi WANG1, Zhenming LV2, Ping MU1
1. College of River and Ocean Engineering, Chongqing Jiaotong University, Chongqing 400074, China
2. School of Mechatronics and Vehicle Engineering, Chongqing Jiaotong University, Chongqing 400074, China
Abstract:

A dynamic 3D reconstruction technology for construction sites was proposed to ensure safety and efficiency during construction. A binocular camera was deployed to scan the site in 3D, yielding the model base and the activity trajectories of targets. An attentional scale sequence fusion (ASF) module was introduced into the YOLOv8 model to form the YOLOv8-ASF framework, which improved the accuracy and performance of the prediction model and addressed problems such as target occlusion and target loss. An improved semi-global block matching (SGBM) algorithm was integrated with YOLOv8-ASF to form the YOLOv8-ASF-SGBM algorithm, which achieved near-real-time target recognition and localization from 2D images. The obtained depth information was used to project the behavior trajectories of dynamic elements in 3D onto the model base, realizing near-real-time, full-view monitoring of the real construction site. Experimental results showed that the proposed technology reproduced the motion trajectories of dynamic construction elements in 3D with high precision, with a relative error of less than 5% with respect to the real motion trajectories. High-precision, full-view 3D monitoring based on 2D image and video information was thus realized, and the technology has promising application scenarios and engineering value.

Key words: dynamic 3D reconstruction    YOLOv8-attentional scale sequence fusion (ASF)    semi-global block matching (SGBM) algorithm    target occlusion    binocular vision
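As a rough illustration of how a 2D detection and a stereo disparity can be combined into a 3D position, the sketch below applies the standard rectified-stereo relation Z = f·B/d and pinhole back-projection to the center of a detection box. It is not the authors' YOLOv8-ASF-SGBM implementation: the `StereoRig` class, the function names, and all numeric values are hypothetical, and the disparity is assumed to come from some stereo matcher such as SGBM.

```python
from dataclasses import dataclass


@dataclass
class StereoRig:
    """Hypothetical rectified stereo parameters (not taken from the paper)."""
    focal_px: float    # focal length in pixels
    baseline_m: float  # distance between the two camera centers, in meters
    cx: float          # principal point x, in pixels
    cy: float          # principal point y, in pixels


def box_center(box):
    """Center (u, v) of a detection box given as (x1, y1, x2, y2) in pixels."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2.0, (y1 + y2) / 2.0


def locate_target(rig, box, disparity_px):
    """Back-project the box center to camera coordinates (X, Y, Z) in meters.

    Uses the standard rectified-stereo relation Z = f * B / d.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a valid match")
    u, v = box_center(box)
    z = rig.focal_px * rig.baseline_m / disparity_px
    x = (u - rig.cx) * z / rig.focal_px
    y = (v - rig.cy) * z / rig.focal_px
    return x, y, z


# Illustrative values only: an 800 px focal length and a 12 cm baseline.
rig = StereoRig(focal_px=800.0, baseline_m=0.12, cx=640.0, cy=360.0)
point = locate_target(rig, box=(700, 300, 900, 500), disparity_px=48.0)
print(point)
```

Repeating this per frame for a tracked target would give a sequence of 3D points, which is one plausible way to obtain a trajectory that can be placed into a static model base.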
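Received: 2024-05-31    Published: 2025-07-25
CLC:  TU 17
Fund: National Natural Science Foundation of China (52379115); Natural Science Foundation of Chongqing (CSTB2022NSCQ-MSX0509).
Corresponding author: Pengfei LI     E-mail: 622220960055@mails.cqjtu.edu.cn; lipengfei@cqjtu.edu.cn
About the author: Jingyao HE (born 2000), female, master's student, working on smart water conservancy and artificial intelligence. orcid.org/0009-0006-8543-5266. E-mail: 622220960055@mails.cqjtu.edu.cn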
Cite this article:

Jingyao HE, Pengfei LI, Chengzhi WANG, Zhenming LV, Ping MU. Dynamic 3D reconstruction method using binocular vision and improved YOLOv8[J]. Journal of Zhejiang University (Engineering Science), 2025, 59(7): 1443-1450.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2025.07.012
https://www.zjujournals.com/eng/CN/Y2025/V59/I7/1443

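Fig. 1  Framework of the dynamic 3D reconstruction technology
Fig. 2  Simulated scene with the layout of dynamic and static elements on a construction site
Fig. 3  Principle of binocular camera calibration
Fig. 4  Reconstruction of the simulated scene with dynamic and static elements on a construction site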
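Element    L/cm   W/cm   H/cm   Rr           l      w      h      Rc           PE/%
Building   37.5   28.1   50.0   1∶0.75∶1.33  37.87  28.24  49.76  1∶0.75∶1.31  1.5
Excavator  61     17     41     1∶0.28∶0.67  62.79  17.54  40.95  1∶0.28∶0.65  3.0
Tab. 1  Proportional error between the real dimensions and the point cloud dimensions of dynamic and static elements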
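The PE column can be reproduced from the tabulated dimensions under one plausible reading. This page does not define PE, so the sketch below assumes it is the largest relative deviation between corresponding terms of the real ratio (Rr) and the point cloud ratio (Rc), each normalized by length. The function names are hypothetical.

```python
def ratios(length, width, height):
    """Normalize dimensions by length, giving the ratio 1 : w/l : h/l."""
    return 1.0, width / length, height / length


def proportional_error(real_dims, cloud_dims):
    """Largest relative deviation (in %) between corresponding ratio terms.

    Assumed definition: the page does not state how PE is computed.
    """
    real = ratios(*real_dims)
    cloud = ratios(*cloud_dims)
    return max(abs(c - r) / r * 100.0 for r, c in zip(real, cloud))


# Dimensions copied from Tab. 1: (L, W, H) and (l, w, h).
building = proportional_error((37.5, 28.1, 50.0), (37.87, 28.24, 49.76))
excavator = proportional_error((61, 17, 41), (62.79, 17.54, 40.95))
print(round(building, 1), round(excavator, 1))
```

Under this assumption the two rows give roughly 1.45% and 2.97%, which round to the tabulated 1.5 and 3.0. In both rows the height term dominates the error.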
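Fig. 5  Framework of the YOLOv8-attentional scale sequence fusion model
Fig. 6  Visualization of target recognition results before and after improving YOLOv8 with attentional scale sequence fusion
Fig. 7  Comparison of target detection performance of different YOLOv8 models
Fig. 8  Principle of binocular ranging
Fig. 9  Recognition and localization of the excavator by the improved model in the simulated scene
Fig. 10  Feature point matching accuracy of different algorithms in four simulated scenes
Fig. 11  Dynamic reconstruction of the excavator's behavior trajectory in the simulated scene
Fig. 12  Accuracy of the dynamic reconstruction of the excavator's behavior trajectory in the simulated scene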