Journal of Zhejiang University (Science Edition)  2023, Vol. 50 Issue (4): 508-520    DOI: 10.3785/j.issn.1008-9497.2023.04.014
Tourism     
ARIMA-LSTM model based on least square weighting to predict number of inbound tourists: A case study of Shanghai
Junfeng KANG1, Yue FU1, Lei FANG2, Mimi LI3, Yujing XIE2, Chaoyang ZHOU4
1. School of Civil and Surveying & Mapping Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, Jiangxi Province, China
2. Department of Environmental Science and Engineering, Fudan University, Shanghai 200433, China
3. School of Hotel and Tourism Management, The Hong Kong Polytechnic University, Hong Kong 999077, China
4. Jiangxi Provincial Defense Science and Technology Information and Satellite Application Center, Nanchang 330036, China

Abstract  

Accurate forecasts of inbound tourism demand during the epidemic can provide a scientific basis for the later recovery and development of tourism and help reduce the secondary impact of COVID-19 on the industry. Taking Shanghai as the study area, this study uses data on the number of inbound tourists, major source countries, the Google search index, and confirmed COVID-19 cases to quantitatively analyze the spatial characteristics and temporal trend of inbound tourism before and after the outbreak. An ARIMA-LSTM combined model weighted by the least squares method is then used to predict the number of inbound tourists after the epidemic. The results show that: (1) both before and after the outbreak, Asian source markets held the core position in the inbound tourism market, and the ratio of traditional to non-traditional inbound tourists was about 9∶1; (2) the number of inbound tourists shows a long-term positive correlation and a Granger-causal relationship with the Google search index, but no significant correlation with the number of confirmed cases; (3) a comparison of model evaluation indicators shows that, with an R2 above 0.8, the ARIMA-LSTM model fits well and has a smaller prediction error and higher prediction accuracy than either single model, so it can be applied uniformly to forecasting the recovery of tourist numbers before, during, and after the epidemic; (4) the forecast number of inbound tourists for 2021-2024 follows an obvious U-shaped curve. After epidemic controls were fully lifted in December 2022, the tourism industry began to recover gradually, and the number of inbound tourists is expected to return to the pre-epidemic level by the end of 2024, that is, after a recovery period of about one and a half years.
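
As a rough illustration of the least-squares weighting idea named in the abstract, the sketch below fits two combination weights to a pair of forecasts. The function name, the numbers, and the unconstrained formulation are all assumptions made for illustration; the abstract does not state the authors' exact formulation (for example, whether the weights are constrained to sum to one).

```python
import numpy as np

def least_squares_weights(pred_a, pred_b, actual):
    """Return weights (w_a, w_b) that minimise
    sum((w_a * pred_a + w_b * pred_b - actual) ** 2).
    Hypothetical helper, not the authors' code."""
    X = np.column_stack([pred_a, pred_b])
    w, *_ = np.linalg.lstsq(X, np.asarray(actual, dtype=float), rcond=None)
    return w

# Made-up values standing in for observed data and two single-model forecasts
actual = np.array([10.0, 12.0, 9.0, 14.0, 11.0])
arima_pred = np.array([9.0, 13.0, 10.0, 12.0, 11.5])
lstm_pred = np.array([11.0, 11.5, 8.0, 15.0, 10.0])

weights = least_squares_weights(arima_pred, lstm_pred, actual)
combined = weights[0] * arima_pred + weights[1] * lstm_pred

def sse(pred):
    """Sum of squared errors against the observed values."""
    return float(np.sum((pred - actual) ** 2))

print("weights:", weights)
print("SSE arima / lstm / combined:", sse(arima_pred), sse(lstm_pred), sse(combined))
```

On the fitting sample, the combined forecast cannot have a larger squared error than either input, because using one model alone (weights 1 and 0) is among the candidates the least-squares fit considers. Out-of-sample performance is not guaranteed by this property.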



Key words: COVID-19; Shanghai tourism forecast; ARIMA-LSTM model; least squares method; Google index
Received: 22 August 2022      Published: 17 July 2023
CLC:  F 590  
Corresponding Authors: Lei FANG     E-mail: junfeng.kang@jxust.edu.cn;fanglei@fudan.edu.cn
Cite this article:

Junfeng KANG,Yue FU,Lei FANG,Mimi LI,Yujing XIE,Chaoyang ZHOU. ARIMA-LSTM model based on least square weighting to predict number of inbound tourists: A case study of Shanghai. Journal of Zhejiang University (Science Edition), 2023, 50(4): 508-520.

URL:

https://www.zjujournals.com/sci/EN/Y2023/V50/I4/508


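Variable                            Mean      Median    Std. dev.    Variance     Min      Max
Inbound tourists (person-trips)     535,354   713,658   316,085.46   9.99×10^10   17,047   874,615
Google search index, attractions    42.81     49.00     23.09        533.15       8.00     78.00
Google search index, hotels         55.90     68.00     24.85        617.64       13.00    84.00
Google search index, tourism        44.64     52.00     19.94        397.75       10.00    72.00
Google search index, weather        59.32     69.00     22.46        504.53       20.00    90.00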
Table 1 Statistics on the number of inbound tourists in Shanghai and the Google index
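Type                                   Mean        Median      Std. dev.   Variance     Min       Max
Imported cases, Shanghai               106         104         49          2,359        17        177
Total confirmed cases, Shanghai        1,366       1,332       613         375,842      516       2,312
Newly confirmed cases, Shanghai        117         105         52          2,715        20        185
Newly confirmed cases, Japan           54,446      42,222      53,290      2.84×10^9    1,946     155,123
Newly confirmed cases, South Korea     11,532      7,566       11,179      1.25×10^8    723       42,064
Newly confirmed cases, United States   2,057,553   1,458,533   1,864,909   3.48×10^12   213,310   6,529,781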
Table 2 Epidemic-related data statistics
Fig.1 LSTM structure
Fig.2 ARIMA-LSTM model frame diagram
Fig.3 Changes of inbound tourists from different source countries in Shanghai
Fig.4 Proportion of major source countries of inbound tourists in Shanghai from 2017 to 2020
Fig.5 Composition of the number of tourists entering Shanghai from four continents
Fig.6 Google search keyword filter
Fig.7 Comparison of the trend of the number of inbound tourists in Shanghai and the Google search index (normalized) from 2017 to 2021
Fig.8 Comparison of the trend of the number of confirmed cases and imported cases abroad
Fig.9 Correlation coefficient between the number of inbound tourists in Shanghai and the keyword Google search index and newly confirmed cases
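Null hypothesis                                                                                    Lag order   F        P        Conclusion
Shanghai tourism keyword is not a Granger cause of the number of inbound tourists                  5           2.2058   0.0166   Reject the null hypothesis
Shanghai hotel keyword is not a Granger cause of the number of inbound tourists                    3           3.0171   0.0384   Reject the null hypothesis
Shanghai attractions keyword is not a Granger cause of the number of inbound tourists              5           3.0043   0.0203   Reject the null hypothesis
Shanghai weather keyword is not a Granger cause of the number of inbound tourists                  2           2.4645   0.0447   Reject the null hypothesis
Newly confirmed local cases in Shanghai are not a Granger cause of the number of inbound tourists  2           2.8058   0.1001   Accept the null hypothesis
Newly confirmed cases in Japan are not a Granger cause of the number of inbound tourists           5           3.9998   0.1351   Accept the null hypothesis
Newly confirmed cases in South Korea are not a Granger cause of the number of inbound tourists     5           3.0581   0.4780   Accept the null hypothesis
Newly confirmed cases in the United States are not a Granger cause of the number of inbound tourists  5        1.8439   0.5054   Accept the null hypothesis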
Table 3 Granger causality test results
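
For readers unfamiliar with the test reported in Table 3, the following is a minimal sketch of the F-statistic behind a bivariate Granger causality test. It assumes ordinary least squares with a constant and the same lag order p for both series. The function names and the synthetic data are hypothetical; this is not the authors' implementation, and the page does not say which software they used.

```python
import numpy as np

def lagged(series, p):
    """Matrix whose columns are series[t-1], ..., series[t-p] for t = p..n-1."""
    n = len(series)
    return np.column_stack([series[p - i : n - i] for i in range(1, p + 1)])

def rss(X, y):
    """Residual sum of squares of the OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def granger_f(y, x, p):
    """F-statistic for the null 'x is not a Granger cause of y' at lag order p."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    target = y[p:]
    const = np.ones((len(target), 1))
    restricted = np.hstack([const, lagged(y, p)])          # own lags only
    unrestricted = np.hstack([restricted, lagged(x, p)])   # plus lags of x
    rss_r = rss(restricted, target)
    rss_u = rss(unrestricted, target)
    dof = len(target) - unrestricted.shape[1]
    return ((rss_r - rss_u) / p) / (rss_u / dof)

# Synthetic example: y depends on the previous value of x, but not the reverse
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = np.zeros(200)
y[1:] = 0.8 * x[:-1] + 0.1 * rng.normal(size=199)

f_xy = granger_f(y, x, p=2)   # does x help predict y?
f_yx = granger_f(x, y, p=2)   # does y help predict x?
print(f_xy, f_yx)
```

A p-value like those in the P column would come from comparing the statistic with an F distribution with (p, dof) degrees of freedom, which this sketch leaves out to stay dependency-free.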
Fig.10 Effect of different models on predicting the number of inbound tourists before and after the epidemic
              Before the epidemic               After the epidemic
Model         RMSE    MSE     MAE     R2        RMSE    MSE     MAE     R2
LSTM          0.213   0.463   0.642   0.857     0.141   0.400   0.489   0.876
ARIMA         0.160   0.387   0.481   0.819     0.316   0.487   0.657   0.805
ARIMA-LSTM    0.270   0.449   0.590   0.933     0.029   0.301   0.391   0.994
Table 4 Evaluation results of prediction accuracy of different models before and after the epidemic
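
The four indicators in Table 4 can be computed as in the sketch below, which uses their standard textbook definitions. The function name and sample values are hypothetical, and any normalization the authors applied before computing the reported figures is not described on this page.

```python
import numpy as np

def evaluate(actual, predicted):
    """Return MSE, RMSE, MAE and R2 under their standard definitions."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = actual - predicted
    mse = float(np.mean(err ** 2))
    rmse = float(np.sqrt(mse))                      # RMSE is the square root of MSE
    mae = float(np.mean(np.abs(err)))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((actual - actual.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot                      # share of variance explained
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}

# Made-up values, not data from the paper
scores = evaluate([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(scores)
```

Lower MSE, RMSE and MAE indicate smaller errors, while an R2 closer to 1 indicates a better fit, which is the sense in which the abstract uses the 0.8 threshold.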
Fig.11 Forecast results of the number of inbound tourists in Shanghai from 2022 to 2024