Journal of Zhejiang University (Engineering Science)
Short text expansion and classification based on pseudo-relevance feedback
WANG Meng, LIN Lan-fen, WANG Feng
College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Abstract:

A classification method based on pseudo-relevance feedback (PFR) is proposed to address the sparseness problem in short text classification. Each short text is expanded with semantically similar content retrieved from the Internet, so that its meaning is preserved. Existing feature extraction methods for expanded corpora use only local features; the proposed method adds global feature extraction and combines global and local features into a better feature vector. This alleviates the highly sparse feature matrix that results from the limited length of short texts. Experiments on an open dataset, together with a comparison against results reported in other literature, show that the method improves short text classification performance.
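The abstract does not give the algorithm in detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a tiny in-memory corpus as a stand-in for retrieved web pages, cosine similarity on term counts as the retrieval step, term frequency over the expanded text as the "local" feature, and a smoothed inverse document frequency over the whole corpus as the "global" feature. All function names (`expand`, `idf_table`, `feature_vector`) and the sample data are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would use a proper segmenter.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two term-count vectors stored as Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def expand(short_text, corpus, k=2):
    # Pseudo-relevance feedback step: treat the k most similar documents
    # as relevant without any human judgment.
    query = Counter(tokenize(short_text))
    scored = [(cosine(query, Counter(tokenize(d))), d) for d in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def idf_table(corpus):
    # Assumed "global" feature: smoothed inverse document frequency.
    n = len(corpus)
    df = Counter()
    for d in corpus:
        df.update(set(tokenize(d)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def feature_vector(short_text, corpus, k=2):
    # Assumed "local" feature: term frequency over the short text
    # plus its feedback documents.
    local = Counter(tokenize(short_text))
    for d in expand(short_text, corpus, k):
        local.update(tokenize(d))
    idf = idf_table(corpus)
    unseen_idf = math.log(1 + len(corpus)) + 1.0  # term absent from corpus
    return {t: tf * idf.get(t, unseen_idf) for t, tf in local.items()}


# Hypothetical sample data standing in for web pages.
corpus = [
    "apple releases new iphone with faster chip",
    "stock market falls as tech shares slide",
    "apple pie recipe with cinnamon and sugar",
    "new iphone camera review and battery test",
]
vec = feature_vector("new iphone", corpus, k=2)
```

Under these assumptions, the two-word input ends up with a vector that also covers terms such as "camera" and "chip" taken from the feedback documents, which is the sense in which expansion reduces sparsity. The weighting scheme and retrieval source in the paper may differ.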

Published: 2014-11-26
CLC number: TP 391
Funding: Doctoral Program Fund (20110101110065); National Science and Technology Support Program of the 12th Five-Year Plan (2012BAD35B01-3, 2013BAF02B10).

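Corresponding author: LIN Lan-fen, female, professor, doctoral supervisor. E-mail: llf@zju.edu.cn
About the author: WANG Meng (born 1986), male, Ph.D. candidate; research interests are natural language processing and data mining. E-mail: wangmeng@zju.edu.cn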
Cite this article:

WANG Meng, LIN Lan-fen, WANG Feng. Short text expansion and classification based on pseudo-relevance feedback[J]. Journal of Zhejiang University (Engineering Science), 10.3785/j.issn.1008-973X.2014.10.000.

Link to this article:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2014.10.000
http://www.zjujournals.com/eng/CN/Y2014/V48/I5/2
