Front. Inform. Technol. Electron. Eng.  2016, Vol. 17 Issue (11): 1186-1198    DOI: 10.1631/FITEE.1500283
    
Performance analysis of new word weighting procedures for opinion mining
G. R. Brindha, P. Swaminathan, B. Santhi
School of Computing, SASTRA University, Thanjavur 613401, India
Abstract: The proliferation of forums and blogs creates both challenges and opportunities for processing large amounts of information. The information shared on various topics often contains opinionated words, which are qualitative in nature and need statistical computation to be converted into useful quantitative data. Because this data expresses opinions, it must be processed carefully. Opinion-bearing words differ in the significance of the meaning they convey. To turn the linguistic meaning of words into data and to enhance opinion mining analysis, we propose a novel weighting scheme, referred to as inferred word weighting (IWW). IWW is computed from the significance of the word in the document (SWD) and the significance of the word in the expression (SWE), with the aim of enhancing performance. Compared with existing methods, the proposed weighting methods give an analytic view and assign appropriate weights to words. In addition to the new weighting methods, we examine how including stop-words, which are generally removed in text processing, affects text classification performance. When stop-words are included with both the proposed and existing weighting methods, two facts are observed: (1) classification performance is enhanced; (2) the difference in outcome between including and excluding stop-words is smaller for the proposed methods and larger for existing methods. The inferences drawn from these observations are discussed. Experimental results on benchmark data sets show the potential enhancement in classification accuracy.
Key words: Inferred word weight; Opinion mining; Supervised classification; Support vector machine (SVM); Machine learning
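
The stop-word comparison described in the abstract can be illustrated with a small sketch. The abstract does not give the IWW, SWD, or SWE formulas, so the code below does not implement them; it assumes a plain TF-IDF weighting as a stand-in for an existing method, and the stop-word list, the toy reviews, and the function names `tokenize` and `tfidf_weights` are all hypothetical. It only shows how excluding stop-words can change the weighted representation of an opinionated sentence; whether this effect accounts for the reported results is not stated in the abstract.

```python
import math
from collections import Counter

# Hypothetical stop-word list for illustration; not the list used in the paper.
STOP_WORDS = {"the", "is", "a", "not", "this", "was", "and", "it"}


def tokenize(text, keep_stop_words=True):
    """Lowercase and split on whitespace, optionally dropping stop-words."""
    tokens = text.lower().split()
    if keep_stop_words:
        return tokens
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf_weights(docs, keep_stop_words=True):
    """Return one {term: weight} dict per document using plain TF-IDF.

    This is a generic baseline weighting, assumed here for illustration.
    It is not the IWW scheme proposed in the paper.
    """
    tokenized = [tokenize(d, keep_stop_words) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy reviews (hypothetical data).
reviews = ["the movie was good", "the movie was not good"]

with_stop_words = tfidf_weights(reviews, keep_stop_words=True)
without_stop_words = tfidf_weights(reviews, keep_stop_words=False)

# With this stop-word list, removal makes the two reviews indistinguishable,
# while keeping stop-words leaves "not" as a weighted feature of the second.
print(without_stop_words[0] == without_stop_words[1])
print(with_stop_words[1]["not"] > 0)
```

In this toy case the negation word sits in the stop-word list, so removing stop-words erases the only feature that separates the two reviews. A classifier such as an SVM trained on these vectors would see identical inputs for opposite opinions.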
Received: 2015-08-30    Published: 2016-11-07
CLC:  TP391  

Cite this article:

G. R. Brindha, P. Swaminathan, B. Santhi. Performance analysis of new word weighting procedures for opinion mining. Front. Inform. Technol. Electron. Eng., 2016, 17(11): 1186-1198.

Link to this article:

http://www.zjujournals.com/xueshu/fitee/CN/10.1631/FITEE.1500283
http://www.zjujournals.com/xueshu/fitee/CN/Y2016/V17/I11/1186
