Journal of Zhejiang University (Engineering Science)  2019, Vol. 53 Issue (11): 2110-2117    DOI: 10.3785/j.issn.1008-973X.2019.11.008
Computer Technology and Control Engineering
Multi-label news classification algorithm based on deep bi-directional classifier chains
Tian-lei HU1, Hao-bo WANG1, Wen-dong YIN2
1. College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
2. School of Humanities, Zhejiang University, Hangzhou 310028, China
Abstract:

A bi-directional classifier chains algorithm based on deep neural networks was proposed for multi-label news classification to address three problems of traditional classifier chains methods: the order of label dependencies is hard to determine, ensemble models run inefficiently, and complex models cannot be used as base classifiers. In the proposed method, a forward classifier chain captures the dependency between each label and all preceding labels, and a backward classifier chain, starting from the output of the last base classifier in the forward chain, learns the correlations between each label and all other labels. Deep neural networks are employed as base classifiers to extract non-linear label correlations and improve predictive performance. The objective function combines the mean square errors of the two classifier chains and is optimized by stochastic gradient descent. On the multi-label news classification dataset RCV1-v2, the proposed algorithm was compared with current mainstream classifier chains algorithms and other multi-label classification algorithms. The experimental results show that the deep bi-directional classifier chains algorithm effectively improves predictive performance.

Key words: multi-label; news classification; deep learning; neural network; classifier chains
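The wiring described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the exact inputs of each backward link, the function names (`predict_bidirectional`, `combined_mse`) and the use of plain callables as base classifiers are assumptions made for illustration, and training by stochastic gradient descent is omitted. Here the forward link for label j sees the features plus the forward outputs for labels before j, and the backward link for label j is assumed to see the features, the forward outputs before j, and the backward outputs after j, so that every label is conditioned on all other labels.

```python
from typing import Callable, List, Sequence, Tuple

# A "link" is any base classifier mapping an input vector to a score in [0, 1].
# In the paper the base classifiers are deep neural networks; here they are
# arbitrary callables (an assumption made to keep the sketch self-contained).
Link = Callable[[Sequence[float]], float]


def predict_bidirectional(
    x: Sequence[float],
    forward_links: List[Link],   # one link per label (q links)
    backward_links: List[Link],  # one link per label except the last (q - 1 links)
) -> Tuple[List[float], List[float]]:
    """Run a forward chain, then a backward chain seeded by its last output."""
    q = len(forward_links)

    # Forward chain: link j sees x plus the forward outputs for labels 0..j-1.
    fwd: List[float] = []
    for j in range(q):
        fwd.append(forward_links[j](list(x) + fwd))

    # Backward chain: start from the last forward output and walk back.
    bwd = [0.0] * q
    bwd[q - 1] = fwd[q - 1]
    for j in range(q - 2, -1, -1):
        # Assumed wiring: x, forward outputs before j, backward outputs after j.
        bwd[j] = backward_links[j](list(x) + fwd[:j] + bwd[j + 1:])
    return fwd, bwd


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean square error between predicted scores and 0/1 targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def combined_mse(fwd: Sequence[float], bwd: Sequence[float],
                 y: Sequence[float]) -> float:
    """Objective combining the mean square errors of the two chains."""
    return mse(fwd, y) + mse(bwd, y)


# Toy usage: every link simply averages its input (a placeholder, not a DNN).
def mean_link(v: Sequence[float]) -> float:
    return sum(v) / len(v)


fwd_out, bwd_out = predict_bidirectional([1.0, 0.0], [mean_link] * 3, [mean_link] * 2)
print(fwd_out, bwd_out, combined_mse(fwd_out, bwd_out, [1, 0, 1]))
```

With this wiring every backward link receives an input of the same length (the number of features plus q - 1 label scores), which is one way to let each label depend on all other labels without fixing a single label order.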
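Received: 2018-11-01    Published: 2019-11-21
CLC: TP 181
Funding: National Basic Research Program of China (973 Program) (2015CB352400); National Natural Science Foundation of China (61672455, 61472348); Natural Science Foundation of Zhejiang Province (LY18F020005)
About the author: Tian-lei HU (born 1982), male, associate professor, Ph.D.; research interests: databases, information security, and machine learning. orcid.org/0000-0003-0744-6454. E-mail: htl@zju.edu.cn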
Cite this article:

Tian-lei HU, Hao-bo WANG, Wen-dong YIN. Multi-label news classification algorithm based on deep bi-directional classifier chains[J]. Journal of Zhejiang University (Engineering Science), 2019, 53(11): 2110-2117.

Link to this article:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2019.11.008
http://www.zjujournals.com/eng/CN/Y2019/V53/I11/2110

Fig. 1  Structure of the deep neural network
Fig. 2  Structure of DBCC in the testing phase
Method        Micro-F1       Macro-F1       Example-F1
DBCC          0.491±0.010    0.270±0.009    0.487±0.003
Vanilla CC    0.456±0.008    0.216±0.013    0.461±0.009
DCC           0.459±0.015    0.267±0.010    0.471±0.008
CCE           0.454±0.002    0.249±0.005    0.437±0.003
C2AE          0.423±0.021    0.272±0.018    0.403±0.022
BR            0.419±0.011    0.219±0.019    0.397±0.016
Table 1  Experimental results on three F1 metrics
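For readers unfamiliar with the three measures in Table 1, the sketch below computes them under their standard definitions for binary label matrices. It is an illustration only, not the evaluation code behind the reported numbers; the function names are hypothetical, and scoring an F1 with a zero denominator as 0 is an assumed convention.

```python
from typing import List, Sequence, Tuple


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the denominator is 0 (assumption)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def multilabel_f1(
    y_true: Sequence[Sequence[int]],
    y_pred: Sequence[Sequence[int]],
) -> Tuple[float, float, float]:
    """Return (micro_f1, macro_f1, example_f1) for 0/1 label matrices."""
    n, q = len(y_true), len(y_true[0])
    tp: List[int] = [0] * q
    fp: List[int] = [0] * q
    fn: List[int] = [0] * q
    example_scores: List[float] = []

    for t_row, p_row in zip(y_true, y_pred):
        row_tp = row_fp = row_fn = 0
        for j, (t, p) in enumerate(zip(t_row, p_row)):
            if t and p:
                tp[j] += 1
                row_tp += 1
            elif p:
                fp[j] += 1
                row_fp += 1
            elif t:
                fn[j] += 1
                row_fn += 1
        # Example-F1: F1 computed per instance, averaged over instances below.
        example_scores.append(f1_from_counts(row_tp, row_fp, row_fn))

    # Micro-F1: pool the counts of all labels, then compute a single F1.
    micro = f1_from_counts(sum(tp), sum(fp), sum(fn))
    # Macro-F1: compute F1 per label, then average over labels.
    macro = sum(f1_from_counts(a, b, c) for a, b, c in zip(tp, fp, fn)) / q
    example = sum(example_scores) / n
    return micro, macro, example


micro, macro, example = multilabel_f1(
    [[1, 1, 0], [0, 1, 0]],
    [[1, 0, 0], [0, 1, 1]],
)
print(f"Micro-F1={micro:.3f}  Macro-F1={macro:.3f}  Example-F1={example:.3f}")
```

Macro-F1 weights every label equally, so rare labels that are never predicted correctly pull it down sharply, while Micro-F1 is dominated by frequent labels; this is consistent with the Macro-F1 column of Table 1 being much lower than the other two.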
Fig. 3  Precision@$K$ results of different methods
Fig. 4  Results of the learning-rate sensitivity experiment