Computer Technology
Microblog sentiment analysis based on collaborative learning under loose conditions
SUN Nian1, LI Yu-qiang1, LIU Ai-hua2, LIU Chun1, LI Wei-wei1
1. School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430063, China;
2. School of Energy and Power Engineering, Wuhan University of Technology, Wuhan 430063, China
Abstract Traditional collaborative learning (co-training) requires two sufficiently redundant feature views, a requirement that is not met in most cases. To address this, a collaborative learning framework under loose conditions is proposed. The support vector machine (SVM) algorithm and the long short-term memory (LSTM) network are used to build two views of a microblog: a feature view based on the vector space model and a feature view based on semantically related word vectors. Collaborative learning is then carried out on these two views. For the selection of unlabeled samples, a new strategy is proposed that combines the uncertainty strategy from active learning with the highest-confidence strategy from collaborative learning, so that the information contained in unlabeled samples is exploited from different angles. Experimental results on Chinese microblog sentiment polarity classification show that, compared with the traditional selection strategy, the proposed strategy improves classifier performance, and that Chinese microblog sentiment analysis can be carried out with the proposed collaborative learning framework under loose conditions.
Received: 14 November 2017
Published: 23 August 2018
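Chinese microblog sentiment analysis based on collaborative learning under loose conditions
Traditional collaborative learning algorithms need two sufficiently redundant feature views, yet in most cases the features do not meet this requirement; a collaborative learning framework under loose conditions is therefore proposed. The support vector machine algorithm and the long short-term memory (LSTM) network algorithm are used to build, respectively, a microblog feature view based on the vector space model and a word-vector feature view based on semantic relatedness, and collaborative learning is performed on the two views. For choosing unlabeled samples, a selection strategy is proposed that combines the uncertainty strategy of active learning with the highest-confidence strategy of collaborative learning, making full use, from different angles, of the information carried by unlabeled samples. Experiments in the field of Chinese microblog sentiment polarity show that the proposed selection strategy improves classifier performance compared with the traditional selection strategy, and that microblog sentiment analysis can be achieved with the collaborative learning framework under loose conditions.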
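The sample-selection idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not say how the two criteria are combined, so the sketch assumes a binary (positive/negative) task in which each view outputs P(positive) for every unlabeled sample. The most confident predictions of one view become pseudo-labeled training samples for the other view (highest-confidence strategy), and the samples on which both views are least confident are set aside for manual labeling (uncertainty strategy). All function names and parameters are hypothetical.

```python
from typing import List, Set, Tuple


def confidence(p_pos: float) -> float:
    """Confidence of a binary prediction: distance of P(positive) from 0.5, scaled to [0, 1]."""
    return abs(p_pos - 0.5) * 2.0


def most_confident(probs: List[float], k: int) -> List[Tuple[int, int]]:
    """Return (index, pseudo_label) pairs for the k most confident predictions of one view."""
    ranked = sorted(range(len(probs)), key=lambda i: confidence(probs[i]), reverse=True)
    return [(i, 1 if probs[i] >= 0.5 else 0) for i in ranked[:k]]


def most_uncertain(probs_a: List[float], probs_b: List[float],
                   k: int, exclude: Set[int]) -> List[int]:
    """Return the k indices (outside `exclude`) where the two views are jointly least confident."""
    candidates = [i for i in range(len(probs_a)) if i not in exclude]
    candidates.sort(key=lambda i: confidence(probs_a[i]) + confidence(probs_b[i]))
    return candidates[:k]


def select_samples(probs_a: List[float], probs_b: List[float],
                   k_confident: int = 1, k_uncertain: int = 1):
    """Assumed combination of the two strategies for one co-training round.

    Returns (teach_a, teach_b, query):
      teach_a: pseudo-labeled samples chosen by view B, to be added to view A's training set
      teach_b: pseudo-labeled samples chosen by view A, to be added to view B's training set
      query:   indices of uncertain samples to hand to a human annotator
    """
    if len(probs_a) != len(probs_b):
        raise ValueError("both views must score the same unlabeled pool")
    teach_b = most_confident(probs_a, k_confident)
    teach_a = most_confident(probs_b, k_confident)
    taken = {i for i, _ in teach_a + teach_b}
    query = most_uncertain(probs_a, probs_b, k_uncertain, taken)
    return teach_a, teach_b, query


# Toy scores for four unlabeled microblogs (made-up numbers, not data from the paper).
probs_svm = [0.95, 0.52, 0.10, 0.60]   # view A, e.g. an SVM on vector-space features
probs_lstm = [0.70, 0.48, 0.40, 0.02]  # view B, e.g. an LSTM on word vectors
teach_svm, teach_lstm, query = select_samples(probs_svm, probs_lstm)
print(teach_lstm)  # [(0, 1)]  SVM is most sure about sample 0 (positive)
print(teach_svm)   # [(3, 0)]  LSTM is most sure about sample 3 (negative)
print(query)       # [1]       both views hover near 0.5 on sample 1
```

In a full loop, the selected samples would be removed from the unlabeled pool, both classifiers retrained, and the procedure repeated until the pool is exhausted or a round limit is reached.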