Journal of ZheJiang University (Engineering Science)  2026, Vol. 60 Issue (2): 415-424    DOI: 10.3785/j.issn.1008-973X.2026.02.020
    
Prediction of shield tunneling-induced soil settlement based on multi-head self-attention-Bi-LSTM model
Minghui YANG1, Muyuan SONG1,*, Daxi FU2, Yanwei GUO2, Xianzhui LU3, Wencong ZHANG1, Weilong ZHENG1
1. School of Architecture and Civil Engineering, Xiamen University, Xiamen 361005, China
2. Henan Zhonggong Design & Research Group Co., Ltd., Zhengzhou 451450, China
3. Geological Engineering Survey in Fujian Province, Fuzhou 350003, China

Abstract  

To improve the prediction accuracy of soil settlement induced by shield tunnel construction, a deep learning model was proposed that combined the self-attention (SA) mechanism and multi-head self-attention (MHSA) mechanism separately with the bidirectional long short-term memory (Bi-LSTM) model, effectively capturing the spatiotemporal features and key information within the data. Using the time-series data from multiple sensors as inputs, the model employed a multi-layer bidirectional network architecture and attention mechanisms to capture the vital data features and their internal self-correlation. Based on the actual soil settlement data from a shield tunnel project, hyperparameters such as the number of hidden units and the number of attention units were optimized through cross-validation, and the predictive effects on soil settlement for the Bi-LSTM model before and after the introduction of various attention mechanisms were compared. Results show that the MHSA-Bi-LSTM model achieved optimal performance, with its total mean absolute percentage error (1.27%) showing approximately a 46% decrease over the SA-Bi-LSTM model (2.53%). Both models maintained high prediction accuracy for soil settlement across various engineering scenarios without parameter recalibration, exhibiting total mean absolute percentage errors of 9.06% for the MHSA-Bi-LSTM model and 14.82% for the SA-Bi-LSTM model, indicating strong generalization capability.



Key words: tunnel engineering; settlement prediction; deep learning; soil settlement; multi-head self-attention mechanism
Received: 19 February 2025      Published: 03 February 2026
CLC:  U 452.2  
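Fund: Major Science and Technology Special Project of Henan Province (241111241000); Independent Project of the Key Laboratory of Geohazard Prevention of Hilly Mountains, Ministry of Natural Resources (KY-070000-04-2021-025).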
Corresponding Authors: Muyuan SONG     E-mail: mhyang@xmu.edu.cn;mysong@stu.xmu.edu.cn
Cite this article:

Minghui YANG,Muyuan SONG,Daxi FU,Yanwei GUO,Xianzhui LU,Wencong ZHANG,Weilong ZHENG. Prediction of shield tunneling-induced soil settlement based on multi-head self-attention-Bi-LSTM model. Journal of ZheJiang University (Engineering Science), 2026, 60(2): 415-424.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2026.02.020     OR     https://www.zjujournals.com/eng/Y2026/V60/I2/415


Fig.1 Calculation procedure of multi-head self-attention mechanism
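As a rough illustration of the calculation that Fig.1 refers to, the sketch below implements multi-head self-attention in plain Python, following the standard scaled dot-product formulation of Vaswani et al. It is an assumption that the paper's procedure matches this formulation. The sizes `T`, `D_MODEL`, `N_HEADS` and the random weights are hypothetical and do not come from the article.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention for a single head."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]  # one weight row per time step
    return matmul(weights, v), weights


def multi_head_self_attention(x, heads, w_o):
    """Run every head, concatenate the outputs per time step, project with w_o."""
    outputs = [attention_head(x, *head)[0] for head in heads]
    concat = [sum((out[t] for out in outputs), []) for t in range(len(x))]
    return matmul(concat, w_o)


def random_matrix(rows, cols, rng):
    """Hypothetical weights; a trained model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
T, D_MODEL, N_HEADS = 6, 8, 2  # hypothetical: 6 time steps, 8 features, 2 heads
D_HEAD = D_MODEL // N_HEADS

x = random_matrix(T, D_MODEL, rng)  # stand-in for a window of sensor readings
heads = [tuple(random_matrix(D_MODEL, D_HEAD, rng) for _ in range(3))
         for _ in range(N_HEADS)]
w_o = random_matrix(N_HEADS * D_HEAD, D_MODEL, rng)

y = multi_head_self_attention(x, heads, w_o)
print(len(y), len(y[0]))
```

Each head attends over the whole input window with its own projections, so different heads can weight different time steps. The output keeps the input shape, which is what would allow it to feed a recurrent layer.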
Fig.2 Architecture of multi-head self-attention-Bi-LSTM model
Parameter            MHSA-Bi-LSTM   SA-Bi-LSTM   Bi-LSTM
L1-Nu                32 (MHSA1)     64 (SA1)     -
L1-NH                64             64           128
L2-Nu                64 (MHSA2)     128 (SA2)    -
L2-NH                128            128          128
Learning rate        0.01           0.01         0.01
Dropout              0.5            0.4          0.5
Batch size           128            128          64
Epochs               200            175          150
Computation time/s   618            536          447
Tab.1 Hyperparameter configuration of different deep learning models
Layer No.   E/(N·mm⁻²)   ν      γ/(kN·m⁻³)   c/kPa   ψ/(°)   h/m
1           4.38         0.29   18.4         15      8       2.40
2           4.08         0.25   18.2         26      13      2.40
3           2.0          0.33   17.3         12      10      3.85
4           3.5          0.35   17.7         13      13      12.00
5           8.0          0.30   18.3         24      21      7.20
Tab.2 Properties of strata in shield tunnel program (case 1)
Segment                  Stratum   ν γ c ψ         h
225 left line (A&C)      C         18.527.313.2    −10~−20
225 right line (B&D)     B         0.167.549.2     −10~−20
365 left line (C&B&E)    E         19.135.316.7    −10~−25
365 right line (B&C)                               −10~−25
Tab.3 Properties of strata in shield tunnel program (case 2)
Fig.3 Comparison of soil settlement predictions across different deep learning models (case 1)
Fig.4 Taylor diagram graphical presentation of soil settlement predictions produced by different deep learning models (case 1)
Monitoring point   Model   SD/mm (predicted)   SD/mm (true)   RC       RMSD/mm
Y3                 1       0.1858              0.1817         0.9481   0.0593
Y3                 2       0.1915              0.1817         0.8935   0.0866
Y3                 3       0.1976              0.1817         0.7639   0.1311
Y4                 1       0.1751              0.1700         0.9590   0.0497
Y4                 2       0.1031              0.1700         0.9129   0.0868
Y4                 3       0.0576              0.1700         0.8100   0.1279
Y5                 1       0.0571              0.0584         0.9582   0.0168
Y5                 2       0.0406              0.0584         0.7815   0.0368
Y5                 3       0.1747              0.0584         0.7985   0.1327
Tab.4 Taylor diagram metrics of soil settlement predictions by different deep learning models at various monitoring points
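The three quantities in a Taylor diagram are linked by the law of cosines: the squared centered RMS difference equals the sum of the two squared standard deviations minus twice their product times the correlation coefficient. The sketch below applies that identity to three rows of Tab.4. Reading the columns as predicted SD, true SD, correlation coefficient, and centered RMS difference is an interpretation of the table headers, and the function name is hypothetical.

```python
import math


def centered_rmsd(sd_pred, sd_true, corr):
    """Centered RMS difference implied by the Taylor-diagram identity."""
    return math.sqrt(sd_pred ** 2 + sd_true ** 2 - 2.0 * sd_pred * sd_true * corr)


# (point, model, SD predicted, SD true, correlation, tabulated RMSD), from Tab.4
rows = [
    ("Y3", 1, 0.1858, 0.1817, 0.9481, 0.0593),
    ("Y4", 2, 0.1031, 0.1700, 0.9129, 0.0868),
    ("Y5", 3, 0.1747, 0.0584, 0.7985, 0.1327),
]

recomputed = [centered_rmsd(sp, st, r) for _, _, sp, st, r, _ in rows]
for (point, model, *_rest, tabulated), value in zip(rows, recomputed):
    print(f"{point} model {model}: recomputed {value:.4f}, tabulated {tabulated:.4f}")
```

The recomputed values agree with the tabulated ones to within rounding, which suggests the RMSD column is the centered form used in Taylor diagrams.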
Monitoring point   Model   MSE/mm²   MAE/mm   MAPE/%
Y3                 1       0.0142    0.1116   1.9
Y3                 2       0.0482    0.2060   3.5
Y3                 3       0.1501    0.3661   6.2
Y4                 1       0.0046    0.0608   0.8
Y4                 2       0.0422    0.1909   2.5
Y4                 3       0.0811    0.2571   3.3
Y5                 1       0.0061    0.0766   1.1
Y5                 2       0.0134    0.1099   1.6
Y5                 3       0.0253    0.1241   1.8
Tab.5 Comparison of soil settlement prediction performance metrics across different deep learning models (case 1)
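For reference, the three error metrics in Tab.5 can be written out as follows. This is a minimal sketch using the usual textbook definitions, which the article is assumed to follow. The `measured` and `predicted` values are hypothetical, not data from the project.

```python
def mse(y_true, y_pred):
    """Mean squared error (units are the square of the input units)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error (same units as the inputs)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; y_true must be non-zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


measured = [10.0, 20.0, 40.0]   # hypothetical settlements, mm
predicted = [11.0, 18.0, 40.0]  # hypothetical model output, mm

print(mse(measured, predicted), mae(measured, predicted), mape(measured, predicted))
```

MAPE divides by the measured value, so it becomes unstable when settlements are close to zero. That is one reason to report it alongside MSE and MAE rather than on its own.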
Monitoring point   Model   MSE/mm²   MAE/mm   MAPE/%   RE/mm
225-left           1       0.122     0.295    9.72     0.38
225-left           2       0.141     0.357    15.55    0.59
225-right          1       0.115     0.276    8.25     0.39
225-right          2       0.138     0.334    14.12    0.41
365-left           1       0.167     0.378    9.28     0.37
365-left           2       0.181     0.415    14.23    0.61
365-right          1       0.157     0.285    8.98     0.38
365-right          2       0.174     0.392    15.36    0.57
Mean               1       0.140     0.309    9.06     0.38
Mean               2       0.159     0.375    14.82    0.55
Tab.6 Comparison of soil settlement prediction performance metrics across different deep learning models (case 2)
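The total MAPE values quoted in the abstract appear to be plain averages of the per-point MAPE values in Tab.5 and Tab.6. The sketch below checks that reading for models 1 and 2, whose averages match the totals the abstract gives for MHSA-Bi-LSTM and SA-Bi-LSTM. The dictionary layout and variable names are hypothetical; only the numbers come from the tables.

```python
from statistics import mean

# Per-point MAPE (%) keyed by model number, copied from Tab.5 and Tab.6
mape_by_case = {
    "case 1": {1: [1.9, 0.8, 1.1], 2: [3.5, 2.5, 1.6]},
    "case 2": {1: [9.72, 8.25, 9.28, 8.98], 2: [15.55, 14.12, 14.23, 15.36]},
}

totals = {
    case: {model: mean(values) for model, values in table.items()}
    for case, table in mape_by_case.items()
}

for case, by_model in totals.items():
    for model, value in by_model.items():
        print(f"{case}, model {model}: mean MAPE = {value:.3f}%")
```

The averages agree, to two decimals, with the 1.27% and 2.53% reported for case 1 and the 9.06% and 14.82% reported for case 2.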
Fig.5 Ablation experiment results for different deep learning models (case 2)