Journal of Zhejiang University (Engineering Science)  2022, Vol. 56 Issue (2): 322-328    DOI: 10.3785/j.issn.1008-973X.2022.02.013
Computer and Control Engineering
Knowledge-enhanced graph convolutional neural networks for text classification
Ting WANG, Xiao-fei ZHU*, Gu TANG
College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China
Abstract:

A new knowledge-enhanced graph convolutional neural network (KEGCN) classification model was proposed for the problem of text classification. In the KEGCN model, a text graph containing word nodes, document nodes and external entity nodes was first constructed over the entire text set, with different similarity calculation methods used between different types of nodes. The constructed text graph was then fed into a two-layer graph convolutional network to learn node representations and perform classification. The KEGCN model introduced external knowledge into graph construction to capture long-distance, discontinuous global semantic information, and was the first work to introduce knowledge information into graph convolutional networks for classification tasks. Text classification experiments were conducted on four large-scale real-world datasets, 20NG, OHSUMED, R52 and R8, and results showed that the classification accuracy of the KEGCN model was better than that of all baseline models, indicating that integrating knowledge information into the graph convolutional neural network helps to learn more accurate text representations and improves the accuracy of text classification.
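The abstract does not give implementation details, but the two-layer graph convolution it describes is commonly the Kipf-Welling propagation rule Z = softmax(Â ReLU(Â X W0) W1), where Â is the symmetrically normalized adjacency of the text graph. The sketch below is a minimal NumPy illustration under that assumption; the random toy edge weights stand in for the type-specific similarity scores (e.g., PMI for word-word edges and TF-IDF for document-word edges, as in Text-GCN), since the abstract does not specify KEGCN's exact similarity functions, and the identity feature matrix follows the common one-hot initialization.

```python
import numpy as np

def normalize_adj(adj):
    # Symmetric normalization with self-loops: A_hat = D^{-1/2} (A + I) D^{-1/2}
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def two_layer_gcn(adj, x, w0, w1):
    # Z = softmax(A_hat @ ReLU(A_hat @ X @ W0) @ W1)
    a_hat = normalize_adj(adj)
    h = np.maximum(a_hat @ x @ w0, 0.0)        # first graph convolution + ReLU
    logits = a_hat @ h @ w1                    # second graph convolution
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)    # row-wise softmax over classes

# Toy heterogeneous graph: 3 word nodes, 2 document nodes, 1 entity node.
# Random weights stand in for the type-specific edge scores (PMI, TF-IDF,
# entity similarity) that the real model would compute.
rng = np.random.default_rng(0)
n_nodes, n_hidden, n_classes = 6, 4, 2
adj = rng.random((n_nodes, n_nodes))
adj = (adj + adj.T) / 2                        # undirected graph
x = np.eye(n_nodes)                            # one-hot node features
w0 = rng.normal(size=(n_nodes, n_hidden))
w1 = rng.normal(size=(n_hidden, n_classes))
print(two_layer_gcn(adj, x, w0, w1))           # (6, 2) class distribution per node
```

In the full model, the weights would be trained with a cross-entropy loss on the labeled document nodes rather than sampled randomly; this forward pass only illustrates the propagation rule.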

Key words: knowledge embedding; graph convolutional network; neural network; text classification; natural language processing
Received: 2021-07-12    Published: 2022-03-03
CLC:  TP 391.1  
Supported by: National Natural Science Foundation of China (62141201); Chongqing Special Project for Technological Innovation and Application Development (cstc2020jscx-dxwtBX0014); Language Research Project of Chongqing Municipal Education Commission (yyk20103)
Corresponding author: Xiao-fei ZHU     E-mail: cstingwang2021@163.com;zxf@cqut.edu.cn
About the author: Ting WANG (1997—), female, master's degree candidate, engaged in research on text classification. orcid.org/0000-0003-0318-141X. E-mail: cstingwang2021@163.com

Cite this article:


Ting WANG, Xiao-fei ZHU, Gu TANG. Knowledge-enhanced graph convolutional neural networks for text classification. Journal of Zhejiang University (Engineering Science), 2022, 56(2): 322-328.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2022.02.013        https://www.zjujournals.com/eng/CN/Y2022/V56/I2/322

Fig. 1  Architecture of the KEGCN model
Dataset     Training docs     Test docs     Classes     Entities     Avg. length
20NG        11314             7532          20          26607        221.26
OHSUMED     3357              4043          23          9075         135.82
R52         6532              2568          52          7476         69.82
R8          5485              2189          8           6440         65.72
Table 1  Statistics of the four text classification datasets
Model               Acc (20NG)        Acc (OHSUMED)     Acc (R52)         Acc (R8)
CNN-rand            0.7693±0.0061     0.4387±0.0100     0.8537±0.0047     0.9402±0.0057
CNN-non-static      0.8215±0.0052     0.5844±0.0106     0.8759±0.0048     0.9571±0.0052
LSTM (pretrain)     0.7543±0.0172     0.5110±0.0150     0.9048±0.0086     0.9609±0.0019
Bi-LSTM             0.7318±0.0185     0.4927±0.0107     0.9054±0.0091     0.9631±0.0033
fastText            0.7938±0.0030     0.5770±0.0049     0.9281±0.0009     0.9613±0.0021
SWEM                0.8516±0.0029     0.6312±0.0055     0.9294±0.0024     0.9532±0.0026
Text-GCN            0.8634±0.0009     0.6836±0.0056     0.9356±0.0018     0.9707±0.0051
HETE-GCN            0.8715±0.0015     0.6811±0.0070     0.9435±0.0025     0.9724±0.0010
KEGCN               0.8822±0.0045     0.6971±0.0059     0.9451±0.0018     0.9741±0.0025
Table 2  Comparison of classification accuracy of the KEGCN model on the four datasets
Fig. 2  Comparison of loss and accuracy of four models on the OHSUMED and R52 datasets
Fig. 3  Comparison of ablation study results
Fig. 4  Performance comparison under different vector dimensions on the OHSUMED dataset
Fig. 5  Performance comparison under different feature vector dimensions on the R8 dataset