Journal of Zhejiang University (Science Edition)  2020, Vol. 47 Issue (4): 507-520    DOI: 10.3785/j.issn.1008-9497.2020.04.014
Tourism
A comparative study of big text data mining methods on tourist emotion computing
LI Junyi (李君轶)1,2, REN Tao (任涛)1,2, LU Luzheng (陆路正)1,2
1. School of Geography and Tourism, Shaanxi Normal University, Xi'an 710119, China
2. Shaanxi Key Laboratory of Tourism Informatics, Xi'an 710119, China
Abstract: Tourism text big data is convenient, fast to obtain, and has a low barrier to use. It has greatly facilitated tourist emotion computing and has become one of the main sources of tourism big data. Drawing on big data theory and emotion theory, and following a comprehensive review of domestic and international work on emotion computing, we mine tourism text big data with three families of artificial intelligence methods (logic/algorithm programming, machine learning, and deep learning) to identify the best text-based method for tourist emotion computing. The findings are as follows: (1) The core of the sentiment-dictionary model is constructing the sentiment dictionary and designing the emotion calculation rules; the method is simple, easy to implement, and applicable to a wide range of corpora. (2) The machine learning method extracts feature items from the text statistically; it has non-linear characteristics and is more reliable than the linear sentiment-dictionary method. (3) Tourist emotion computing based on deep learning performs well, with accuracy above 85%. Models trained on multi-domain text corpora are easy to transfer, practical, and generalize well, which makes the approach well suited to tourist emotion computing in the era of big data.
Key words: tourism text big data    sentiment dictionary    machine learning    deep learning    affective computing
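Finding (1) names two components of the dictionary-based approach: a sentiment dictionary and a set of calculation rules. The following is a minimal sketch of that general idea, not the authors' model. The lexicon entries, the modifier weights, and the rule that a modifier applies only to the next adjacent sentiment word are all hypothetical, and the input is assumed to be already segmented into tokens (English tokens are used here purely for readability).

```python
# Illustrative lexicon-based sentiment scorer.
# All entries and weights below are hypothetical, not taken from the paper.
SENTIMENT = {"beautiful": 1.0, "great": 1.0, "crowded": -1.0, "dirty": -1.0}
NEGATORS = {"not", "never"}
INTENSIFIERS = {"very": 2.0, "slightly": 0.5}


def score(tokens):
    """Sum the polarity of sentiment words, scaled by the modifiers before them."""
    total = 0.0
    weight = 1.0  # product of modifiers seen since the last non-modifier token
    for tok in tokens:
        if tok in NEGATORS:
            weight *= -1.0  # a negator flips the polarity of what follows
        elif tok in INTENSIFIERS:
            weight *= INTENSIFIERS[tok]  # a degree adverb scales the polarity
        elif tok in SENTIMENT:
            total += weight * SENTIMENT[tok]
            weight = 1.0
        else:
            weight = 1.0  # modifiers only act on an adjacent sentiment word
    return total


def classify(tokens):
    """Map the numeric score to a coarse polarity label."""
    s = score(tokens)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(classify("the view is very beautiful".split()))
    print(classify("the street was dirty".split()))
```

Because the score is a weighted sum over matched words, the scorer is linear in the lexicon weights, which is the property finding (2) contrasts with machine learning methods.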
Received: 2019-06-10    Published: 2020-07-25
CLC:  F590  
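Funding: National Natural Science Foundation of China, General Program (41571135); Shaanxi Province Key Industry Innovation Chain (Cluster) Project in the Field of Social Development (2019ZDLSF07-04); Fundamental Research Funds for the Central Universities (14SZZD02).
About the author: LI Junyi (b. 1975), ORCID: https://orcid.org/0000-0003-2466-4533, male, Ph.D., professor; research interests include tourist behavior and tourism information science. E-mail: lijunyi9@snnu.edu.cn.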

Cite this article:

LI Junyi, REN Tao, LU Luzheng. A comparative study of big text data mining methods on tourist emotion computing. Journal of Zhejiang University (Science Edition), 2020, 47(4): 507-520.

Link to this article:

https://www.zjujournals.com/sci/CN/10.3785/j.issn.1008-9497.2020.04.014
https://www.zjujournals.com/sci/CN/Y2020/V47/I4/507
