Journal of Zhejiang University (Engineering Science)
Computer Technology, Telecommunications Technology
Survey of opinion mining
CHEN Min (陈旻)1, ZHU Fan-wei (朱凡微)2, WU Ming-hui (吴明晖)2, YING Jing (应晶)1,2
1. College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China; 2. Department of Computer Science and Engineering, Zhejiang University City College, Hangzhou 310015, China
Abstract:

With the rapid development of Internet technology, the Internet has become an increasingly common part of daily life. The emergence of blogs, social networks, microblogs, and similar platforms has made it easier for people to express personal views online. Analyzing these views can provide authentic and useful information for consumers, businesses, and governments. It is therefore important to extract, analyze, and summarize information from massive data and to present it as targeted, readable results. Following the opinion mining pipeline, this paper gives a fairly comprehensive analysis of opinion mining in terms of opinion extraction, polarity analysis, and opinion summarization, and describes the tasks to be completed at each step. The opinion extraction part introduces the concept of domain dependency and lists the main techniques used to extract opinion sentences. The polarity analysis part compares supervised with unsupervised learning and qualitative with quantitative polarity judgment. The opinion summarization part briefly introduces several forms of summary, such as aspect-based summaries, opinion comparison, and key sentences. The paper then introduces the techniques applied at different mining granularities (document level, sentence level, word level, and aspect level), which draw on information retrieval, natural language processing, and machine learning. It also summarizes the challenging problems that remain in opinion mining and proposes possible improvements for each. For example, objective sentences sometimes contain opinions that researchers often overlook, and the polarity of some complex sentences still cannot be determined. Finally, the paper predicts future research trends in opinion mining.

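Published: 2014-08-01
CLC number: TP 391
Funding:

Zhejiang Provincial Key Science and Technology Innovation Team Project (2010R50009).

Corresponding author: YING Jing, male, professor, doctoral supervisor. E-mail: yingj@zucc.edu.cn
About the author: CHEN Min (b. 1988), female, master's student; main research interests are software engineering and information retrieval. E-mail: chenmin1107@zju.edu.cn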

Cite this article:

CHEN Min, ZHU Fan-wei, WU Ming-hui, YING Jing. Survey of opinion mining [J]. Journal of Zhejiang University (Engineering Science), 10.3785/j.issn.1008-973X.2014.08.016.

Link to this article:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2014.08.016
http://www.zjujournals.com/eng/CN/Y2014/V48/I8/1461

