Journal of Zhejiang University (Engineering Science)  2026, Vol. 60 Issue (2): 415-424    DOI: 10.3785/j.issn.1008-973X.2026.02.020
Traffic Engineering, Civil Engineering
Prediction of shield tunneling-induced soil settlement based on multi-head self-attention-Bi-LSTM model
Minghui YANG1, Muyuan SONG1,*, Daxi FU2, Yanwei GUO2, Xianzhui LU3, Wencong ZHANG1, Weilong ZHENG1
1. School of Architecture and Civil Engineering, Xiamen University, Xiamen 361005, China
2. Henan Zhonggong Design & Research Group Co., Ltd., Zhengzhou 451450, China
3. Geological Engineering Survey in Fujian Province, Fuzhou 350003, China
Abstract:

To improve the accuracy of predicting soil settlement induced by shield tunnel construction, the bidirectional long short-term memory (Bi-LSTM) model was combined separately with the self-attention (SA) mechanism and the multi-head self-attention (MHSA) mechanism, yielding deep learning models that effectively capture the spatiotemporal features and key information in the data. The models take time-series data from multiple sensors as joint input and use a multi-layer bidirectional network architecture together with attention mechanisms to capture the key features of the data and their internal autocorrelation. Based on measured soil settlement data from a shield tunnel project, hyperparameters such as the number of hidden units and the number of attention units were optimized through cross-validation, and the soil settlement predictions of the Bi-LSTM model before and after the introduction of the different attention mechanisms were compared. The results show that the MHSA-Bi-LSTM model performed best, with a total mean absolute percentage error (1.27%) approximately 46% lower than that of the SA-Bi-LSTM model (2.53%). Without parameter recalibration, both models maintained high prediction accuracy for soil settlement in different engineering scenarios, with total mean absolute percentage errors of 9.06% for the MHSA-Bi-LSTM model and 14.82% for the SA-Bi-LSTM model, indicating good generalization capability.

Key words: tunnel engineering; settlement prediction; deep learning; soil settlement; multi-head self-attention mechanism
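Received: 2025-02-19    Published: 2026-02-03
CLC:  U 452.2
Supported by: Major Scientific Research Special Project of Henan Province (241111241000); Independent Project of the Key Laboratory of Geological Hazard Prevention in Hilly and Mountainous Areas, Ministry of Natural Resources (KY-070000-04-2021-025).
Corresponding author: Muyuan SONG     E-mail: mhyang@xmu.edu.cn; mysong@stu.xmu.edu.cn
About the author: Minghui YANG (b. 1978), male, professor, engaged in research on intelligent shield tunneling. orcid.org/0009-0008-0369-047X. E-mail: mhyang@xmu.edu.cn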

Cite this article:

Minghui YANG, Muyuan SONG, Daxi FU, Yanwei GUO, Xianzhui LU, Wencong ZHANG, Weilong ZHENG. Prediction of shield tunneling-induced soil settlement based on multi-head self-attention-Bi-LSTM model[J]. Journal of Zhejiang University (Engineering Science), 2026, 60(2): 415-424.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2026.02.020
https://www.zjujournals.com/eng/CN/Y2026/V60/I2/415

Fig. 1  Computational steps of the multi-head self-attention mechanism
Fig. 2  Architecture of the multi-head self-attention-Bi-LSTM model
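For orientation, the steps that Fig. 1 refers to can be written in the standard scaled dot-product form of multi-head self-attention. This is the textbook formulation, not notation taken from the article: the input sequence $X$, the projection matrices $W_i^{Q}$, $W_i^{K}$, $W_i^{V}$, $W^{O}$, the key dimension $d_k$, and the head count $H$ are all assumed symbols.

```latex
\begin{aligned}
Q_i &= X W_i^{Q}, \qquad K_i = X W_i^{K}, \qquad V_i = X W_i^{V}, \\
\mathrm{head}_i &= \mathrm{softmax}\!\left(\frac{Q_i K_i^{\top}}{\sqrt{d_k}}\right) V_i,
  \qquad i = 1, \dots, H, \\
\mathrm{MHSA}(X) &= \mathrm{Concat}\left(\mathrm{head}_1, \dots, \mathrm{head}_H\right) W^{O}.
\end{aligned}
```

Setting $H = 1$ reduces this to single-head self-attention, which is the distinction between the SA and MHSA variants compared in the abstract.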
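| Parameter | MHSA-Bi-LSTM | SA-Bi-LSTM | Bi-LSTM |
| --- | --- | --- | --- |
| L1-Nu | 32 (MHSA1) | 64 (SA1) | |
| L1-NH | 64 | 64 | 128 |
| L2-Nu | 64 (MHSA2) | 128 (SA2) | |
| L2-NH | 128 | 128 | 128 |
| Learning rate | 0.01 | 0.01 | 0.01 |
| Dropout | 0.5 | 0.4 | 0.5 |
| Batch size | 128 | 128 | 64 |
| Number of iterations | 200 | 175 | 150 |
| Computation time/s | 618 | 536 | 447 |
Table 1  Hyperparameter configurations of the different deep learning models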
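The Table 1 settings can be carried around as a small configuration object. The sketch below is only one possible encoding: the class and field names are hypothetical, and it assumes that Nu denotes the number of attention units and NH the number of hidden units in layers L1 and L2 (a reading suggested by the abstract, not stated in the table itself).

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ModelConfig:
    """Hyperparameters for one model, copied from Table 1 (field names are hypothetical)."""
    hidden_units: Tuple[int, int]               # assumed meaning of L1-NH, L2-NH
    attention_units: Optional[Tuple[int, int]]  # assumed meaning of L1-Nu, L2-Nu; None without attention
    learning_rate: float
    dropout: float
    batch_size: int
    iterations: int


CONFIGS: Dict[str, ModelConfig] = {
    "MHSA-Bi-LSTM": ModelConfig((64, 128), (32, 64), 0.01, 0.5, 128, 200),
    "SA-Bi-LSTM": ModelConfig((64, 128), (64, 128), 0.01, 0.4, 128, 175),
    "Bi-LSTM": ModelConfig((128, 128), None, 0.01, 0.5, 64, 150),
}

for name, cfg in CONFIGS.items():
    print(name, cfg)
```

Keeping the attention sizes optional makes the plain Bi-LSTM baseline expressible with the same type, which is convenient when looping over all three models in a comparison.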
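| Soil layer No. | E/(N·mm⁻²) | ν | γ/(kN·m⁻³) | c/kPa | ψ/(°) | h/m |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 4.38 | 0.29 | 18.4 | 15 | 8 | 2.40 |
| 2 | 4.08 | 0.25 | 18.2 | 26 | 13 | 2.40 |
| 3 | 2.0 | 0.33 | 17.3 | 12 | 10 | 3.85 |
| 4 | 3.5 | 0.35 | 17.7 | 13 | 13 | 12.00 |
| 5 | 8.0 | 0.30 | 18.3 | 24 | 21 | 7.20 |
Table 2  Stratum properties of the shield tunnel project (Case 1)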
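Segment  Stratum  ν  γ  c  ψ  h
225 left line (A&C)  C  18.5  27.3  13.2  −10~−20
225 right line (B&D)  B  0.167.549.2  −10~−20
365 left line (C&B&E)  E  19.1  35.3  16.7  −10~−25
365 right line (B&C)  −10~−25
Table 3  Stratum properties of the shield tunnel project (Case 2)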
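Fig. 3  Comparison of soil settlement prediction results of the different deep learning models (Case 1)
Fig. 4  Taylor diagram of soil settlement prediction results of the different deep learning models (Case 1)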
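| Monitoring point | Model | SD/mm (predicted) | SD/mm (measured) | R | CRMSD/mm |
| --- | --- | --- | --- | --- | --- |
| Y3 | 1 | 0.1858 | 0.1817 | 0.9481 | 0.0593 |
| Y3 | 2 | 0.1915 | 0.1817 | 0.8935 | 0.0866 |
| Y3 | 3 | 0.1976 | 0.1817 | 0.7639 | 0.1311 |
| Y4 | 1 | 0.1751 | 0.1700 | 0.9590 | 0.0497 |
| Y4 | 2 | 0.1031 | 0.1700 | 0.9129 | 0.0868 |
| Y4 | 3 | 0.0576 | 0.1700 | 0.8100 | 0.1279 |
| Y5 | 1 | 0.0571 | 0.0584 | 0.9582 | 0.0168 |
| Y5 | 2 | 0.0406 | 0.0584 | 0.7815 | 0.0368 |
| Y5 | 3 | 0.1747 | 0.0584 | 0.7985 | 0.1327 |
Table 4  Taylor diagram parameters of soil settlement predictions by the different deep learning models at different monitoring points (Case 1)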
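The three quantities in Table 4 are linked by the law-of-cosines identity that underlies a Taylor diagram, CRMSD² = SD_pred² + SD_meas² − 2·SD_pred·SD_meas·R. The sketch below applies that general identity to a few rows copied from the table; the function name is hypothetical, and the assumption that the article computed CRMSD this way is mine, although the reported values are consistent with it to rounding.

```python
import math


def centered_rmsd(sd_pred: float, sd_meas: float, r: float) -> float:
    """Centered RMSD implied by two standard deviations and their correlation."""
    return math.sqrt(sd_pred ** 2 + sd_meas ** 2 - 2.0 * sd_pred * sd_meas * r)


# (monitoring point, model) -> (SD predicted, SD measured, R, CRMSD reported in Table 4)
rows = {
    ("Y3", 1): (0.1858, 0.1817, 0.9481, 0.0593),
    ("Y3", 2): (0.1915, 0.1817, 0.8935, 0.0866),
    ("Y4", 1): (0.1751, 0.1700, 0.9590, 0.0497),
    ("Y5", 3): (0.1747, 0.0584, 0.7985, 0.1327),
}

for key, (sd_p, sd_m, r, reported) in rows.items():
    print(key, "computed:", round(centered_rmsd(sd_p, sd_m, r), 4), "reported:", reported)
```

The identity also explains why a model with high R can still have a large CRMSD: the Y5 model 3 row has a predicted SD roughly three times the measured one, so the amplitude mismatch dominates.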
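| Monitoring point | Model | MSE/mm² | MAE/mm | MAPE/% |
| --- | --- | --- | --- | --- |
| Y3 | 1 | 0.0142 | 0.1116 | 1.9 |
| Y3 | 2 | 0.0482 | 0.2060 | 3.5 |
| Y3 | 3 | 0.1501 | 0.3661 | 6.2 |
| Y4 | 1 | 0.0046 | 0.0608 | 0.8 |
| Y4 | 2 | 0.0422 | 0.1909 | 2.5 |
| Y4 | 3 | 0.0811 | 0.2571 | 3.3 |
| Y5 | 1 | 0.0061 | 0.0766 | 1.1 |
| Y5 | 2 | 0.0134 | 0.1099 | 1.6 |
| Y5 | 3 | 0.0253 | 0.1241 | 1.8 |
Table 5  Comparison of soil settlement prediction performance metrics of the different deep learning models (Case 1)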
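The metrics in Table 5 follow their usual definitions, which a minimal sketch can make concrete. The function names are hypothetical and the settlement values below are made-up illustration data, not measurements from the article.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error (units: square of the settlement unit)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error (same unit as the settlement)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; requires nonzero true values."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical settlement series in mm, for illustration only.
measured = [10.0, 12.0, 15.0, 20.0]
predicted = [10.5, 11.5, 15.0, 19.0]

print("MSE :", mse(measured, predicted))
print("MAE :", mae(measured, predicted))
print("MAPE:", mape(measured, predicted))
```

Because MSE averages squared errors, it is always at least MAE squared, which is a quick sanity check when reading tables of this kind.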
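| Monitoring point | Model | MSE/mm² | MAE/mm | MAPE/% | RE/mm |
| --- | --- | --- | --- | --- | --- |
| 225-Left | 1 | 0.122 | 0.295 | 9.72 | 0.38 |
| 225-Left | 2 | 0.141 | 0.357 | 15.55 | 0.59 |
| 225-Right | 1 | 0.115 | 0.276 | 8.25 | 0.39 |
| 225-Right | 2 | 0.138 | 0.334 | 14.12 | 0.41 |
| 365-Left | 1 | 0.167 | 0.378 | 9.28 | 0.37 |
| 365-Left | 2 | 0.181 | 0.415 | 14.23 | 0.61 |
| 365-Right | 1 | 0.157 | 0.285 | 8.98 | 0.38 |
| 365-Right | 2 | 0.174 | 0.392 | 15.36 | 0.57 |
| Mean | 1 | 0.140 | 0.309 | 9.06 | 0.38 |
| Mean | 2 | 0.159 | 0.375 | 14.82 | 0.55 |
Table 6  Comparison of soil settlement prediction performance metrics of the different deep learning models (Case 2)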
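The mean rows and the total MAPE figures quoted in the abstract can be cross-checked from the per-location values. The sketch below assumes the totals are plain unweighted means over monitoring locations, which the page does not state; the dictionary and function names are hypothetical, and the numbers are copied from Tables 5 and 6.

```python
from statistics import mean
from typing import Dict, List

# MAPE (%) per monitoring location, keyed by model number.
case1_mape: Dict[int, List[float]] = {   # Table 5: Y3, Y4, Y5
    1: [1.9, 0.8, 1.1],
    2: [3.5, 2.5, 1.6],
    3: [6.2, 3.3, 1.8],
}
case2_mape: Dict[int, List[float]] = {   # Table 6: 225-Left, 225-Right, 365-Left, 365-Right
    1: [9.72, 8.25, 9.28, 8.98],
    2: [15.55, 14.12, 14.23, 15.36],
}


def mean_by_model(table: Dict[int, List[float]]) -> Dict[int, float]:
    """Unweighted mean MAPE per model (assumed aggregation rule)."""
    return {model: mean(values) for model, values in table.items()}


print("Case 1:", mean_by_model(case1_mape))
print("Case 2:", mean_by_model(case2_mape))
```

Under this assumption the Case 2 means agree to rounding with the 9.06% and 14.82% in the mean rows, and the Case 1 means for models 1 and 2 agree with the 1.27% and 2.53% quoted in the abstract, which also identifies model 1 as MHSA-Bi-LSTM and model 2 as SA-Bi-LSTM.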
Fig. 5  Module ablation experiment results of the different deep learning models (Case 2)