Journal of Zhejiang University (Science Edition)  2020, Vol. 47 Issue (4): 507-520    DOI: 10.3785/j.issn.1008-9497.2020.04.014
Tourism     
A comparative study of big text data mining methods on tourist emotion computing
LI Junyi1,2, REN Tao1,2, LU Luzheng1,2
1. School of Geography and Tourism, Shaanxi Normal University, Xi’an 710119, China
2. Shaanxi Key Laboratory of Tourism Informatics, Xi’an 710119, China

Abstract  Tourism text big data is convenient, quick to obtain and has a low barrier to use; it has greatly facilitated the computation of tourist emotion and has become one of the main sources of tourism big data. Guided by big data theory and emotion theory, and building on a comprehensive review of affective computing research in China and abroad, we apply logic/algorithm programming, machine learning and deep learning methods to mine tourism text big data and to identify the most suitable method for computing tourist emotion. The findings are as follows: (1) The core of the sentiment-dictionary-based model is constructing the sentiment dictionary and designing the emotion calculation rules; the method is simple, easy to implement and applicable to a wide range of corpora. (2) The machine learning method uses statistical techniques to extract feature items from the text; being non-linear, it is more reliable than the linear sentiment dictionary method. (3) Tourist emotion computing based on deep learning performs well, with an accuracy of 85% or higher. A model trained on multi-domain text corpora is easy to transfer, practical and generalizes well, which makes it well suited to tourist emotion computing in the era of big data.

Key words: the text of tourism big data      emotion dictionary      machine learning      deep learning      affective computing
Received: 10 June 2019      Published: 25 July 2020
CLC:  F590; TP391
Cite this article:

LI Junyi, REN Tao, LU Luzheng. A comparative study of big text data mining methods on tourist emotion computing. Journal of Zhejiang University (Science Edition), 2020, 47(4): 507-520.

URL:

https://www.zjujournals.com/sci/EN/Y2020/V47/I4/507
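Finding (1) of the abstract describes the dictionary-based approach as two parts: a sentiment dictionary and a set of calculation rules. The abstract does not give the dictionary or the rules the authors used, so the following is only a minimal sketch of that general idea. The word lists, the weights, the negation and intensifier rules, and the function names `score` and `label` are all hypothetical, and whitespace tokenization of English text stands in for proper word segmentation.

```python
# Minimal sketch of dictionary-based sentiment scoring.
# All entries and rules below are illustrative assumptions,
# not the dictionary or rules used in the paper.

SENTIMENT = {"beautiful": 1.0, "friendly": 1.0, "crowded": -1.0, "dirty": -1.0}
NEGATIONS = {"not", "never"}
INTENSIFIERS = {"very": 2.0, "slightly": 0.5}


def score(text):
    """Sum the polarity of sentiment words, adjusted by preceding modifiers."""
    total = 0.0
    weight = 1.0  # accumulated intensifier weight
    sign = 1.0    # flipped by each negation word
    # Naive tokenization: lowercase and split on whitespace (no punctuation handling).
    for token in text.lower().split():
        if token in NEGATIONS:
            sign = -sign
        elif token in INTENSIFIERS:
            weight *= INTENSIFIERS[token]
        elif token in SENTIMENT:
            total += sign * weight * SENTIMENT[token]
            # Modifiers apply only to the next sentiment word, then reset.
            weight, sign = 1.0, 1.0
    return total


def label(text):
    """Map the numeric score to a coarse polarity label."""
    s = score(text)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"
```

Because the score is a weighted sum over matched words, a sketch like this is linear in the dictionary entries, which is the property the abstract contrasts with the non-linear machine learning method in finding (2).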

