Journal of ZheJiang University (Engineering Science)  2022, Vol. 56 Issue (2): 322-328    DOI: 10.3785/j.issn.1008-973X.2022.02.013
    
Knowledge-enhanced graph convolutional neural networks for text classification
Ting WANG, Xiao-fei ZHU*, Gu TANG
College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China

Abstract  

A knowledge-enhanced graph convolutional network (KEGCN) model was proposed for text classification. In the KEGCN model, a text graph containing word nodes, document nodes, and external entity nodes was first constructed over the entire text set, with different similarity measures used for edges between different types of nodes. The constructed text graph was then fed into a two-layer graph convolutional network to learn node representations for classification. By introducing external knowledge into the graph construction, the KEGCN model captured long-distance, discontinuous global semantic information; it was the first work to introduce knowledge information into a graph convolutional network for a classification task. Text classification experiments were conducted on four large-scale real-world datasets, 20NG, OHSUMED, R52 and R8, and the results showed that the classification accuracy of the KEGCN model was better than that of all baseline models. The results show that integrating knowledge information into the graph convolutional neural network helps to learn more accurate text representations and improves the accuracy of text classification.
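As an illustrative sketch of the two-layer graph convolution described above (not the authors' released code), the forward pass below follows the standard formulation of Kipf and Welling [21]; the adjacency normalization, feature sizes, and class count are assumptions for demonstration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoLayerGCN(nn.Module):
    """Two-layer GCN over the text graph, in the style of Kipf and Welling [21]."""
    def __init__(self, in_dim: int, hidden_dim: int, num_classes: int):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hidden_dim, bias=False)       # layer-1 weights
        self.w2 = nn.Linear(hidden_dim, num_classes, bias=False)  # layer-2 weights

    def forward(self, a_hat: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        # a_hat: symmetrically normalized adjacency D^(-1/2) (A + I) D^(-1/2)
        # x: features of word, document, and entity nodes stacked row-wise
        h = F.relu(a_hat @ self.w1(x))   # first graph convolution
        return a_hat @ self.w2(h)        # second convolution yields class logits

# Toy usage: cross-entropy would be applied to the logits of labeled document nodes.
n_nodes, feat_dim = 5, 16
a_hat = torch.eye(n_nodes)              # placeholder normalized adjacency
x = torch.randn(n_nodes, feat_dim)      # placeholder node features
logits = TwoLayerGCN(feat_dim, 200, 8)(a_hat, x)   # shape: (n_nodes, 8)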



Key words: knowledge embedding, graph convolutional network, neural network, text classification, natural language processing
Received: 12 July 2021      Published: 03 March 2022
CLC:  TP 391.1  
Corresponding Authors: Xiao-fei ZHU     E-mail: cstingwang2021@163.com;zxf@cqut.edu.cn
Cite this article:

Ting WANG, Xiao-fei ZHU, Gu TANG. Knowledge-enhanced graph convolutional neural networks for text classification. Journal of ZheJiang University (Engineering Science), 2022, 56(2): 322-328.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2022.02.013     OR     https://www.zjujournals.com/eng/Y2022/V56/I2/322


Fig.1 Illustration of construction of KEGCN
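To make the heterogeneous edge construction in Fig.1 concrete, the sketch below assigns edge weights in the style of Text-GCN [1]: positive PMI for word-word edges and TF-IDF for document-word edges. The cosine similarity used for entity-related edges is an illustrative assumption, not the paper's exact formula.

import math
import numpy as np

def pmi_weight(p_ij: float, p_i: float, p_j: float) -> float:
    """Positive PMI between two words co-occurring in sliding windows (word-word edges)."""
    if p_ij == 0.0:
        return 0.0
    return max(0.0, math.log(p_ij / (p_i * p_j)))

def tfidf_weight(tf: float, n_docs: int, df: int) -> float:
    """TF-IDF weight between a document node and a word node (document-word edges)."""
    return tf * math.log(n_docs / (1 + df))

def entity_weight(e1: np.ndarray, e2: np.ndarray) -> float:
    """Assumed similarity for entity-node edges: cosine of entity embeddings."""
    return float(e1 @ e2 / (np.linalg.norm(e1) * np.linalg.norm(e2) + 1e-12))

These weights would populate the adjacency matrix A of the text graph, which is then normalized and passed to the two-layer GCN sketched above.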
Dataset  #Training docs  #Test docs  #Classes  #Entities  Avg. length
20NG 11314 7532 20 26607 221.26
OHSUMED 3357 4043 23 9075 135.82
R52 6532 2568 52 7476 69.82
R8 5485 2189 8 6440 65.72
Tab.1 Statistics of the four text classification datasets
Model  Acc(20NG)  Acc(OHSUMED)  Acc(R52)  Acc(R8)
CNN-rand 0.7693±0.0061 0.4387±0.0100 0.8537±0.0047 0.9402±0.0057
CNN-non-static 0.8215±0.0052 0.5844±0.0106 0.8759±0.0048 0.9571±0.0052
LSTM(pretrain) 0.7543±0.0172 0.5110±0.0150 0.9048±0.0086 0.9609±0.0019
Bi-LSTM 0.7318±0.0185 0.4927±0.0107 0.9054±0.0091 0.9631±0.0033
fastText 0.7938±0.0030 0.5770±0.0049 0.9281±0.0009 0.9613±0.0021
SWEM 0.8516±0.0029 0.6312±0.0055 0.9294±0.0024 0.9532±0.0026
Text-GCN 0.8634±0.0009 0.6836±0.0056 0.9356±0.0018 0.9707±0.0051
HETE-GCN 0.8715±0.0015 0.6811±0.0070 0.9435±0.0025 0.9724±0.0010
KEGCN 0.8822±0.0045 0.6971±0.0059 0.9451±0.0018 0.9741±0.0025
Tab.2 Comparison of classification accuracy (mean ± standard deviation) of KEGCN and baseline models on four datasets
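The values in Tab.2 are reported as mean ± standard deviation over repeated runs; a minimal sketch of that aggregation, with placeholder numbers rather than the paper's raw results:

import numpy as np

# Hypothetical per-run accuracies (placeholders, not the paper's raw data)
runs = np.array([0.8801, 0.8856, 0.8810])
# Sample standard deviation (ddof=1) matches the usual mean±std report
print(f"{runs.mean():.4f}±{runs.std(ddof=1):.4f}")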
Fig.2 Comparison of loss and accuracy of four models on OHSUMED and R52 datasets
Fig.3 Comparison of ablation experiment results
Fig.4 Performance comparison of different vector dimensions on OHSUMED dataset
Fig.5 Performance comparison of different dimensions on R8 dataset
[1]   YAO L, MAO C, LUO Y. Graph convolutional networks for text classification[C]// Proceedings of the AAAI Conference on Artificial Intelligence. Honolulu: AAAI, 2019, 33(1): 7370-7377.
[2]   ZHANG Y, JIN R, ZHOU Z H. Understanding bag-of-words model: a statistical framework[J]. International Journal of Machine Learning and Cybernetics, 2010, 1: 43-52. doi: 10.1007/s13042-010-0001-0
[3]   WANG S I, MANNING C D. Baselines and bigrams: simple, good sentiment and topic classification[C]// Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics. Jeju Island: ACL, 2012: 90-94.
[4]   KIM Y. Convolutional neural networks for sentence classification[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Doha: EMNLP, 2014: 1746–1751.
[5]   LIU P, QIU X, HUANG X. Recurrent neural network for text classification with multi-task learning[C]// Proceedings of the 25th International Joint Conference on Artificial Intelligence. New York: IJCAI, 2016: 2873-2879.
[6]   COVER T, HART P. Nearest neighbor pattern classification[J]. IEEE Transactions on Information Theory, 1967, 13(1): 21-27. doi: 10.1109/TIT.1967.1053964
[7]   UTGOFF P E. ID5: an incremental ID3[M]// Machine Learning Proceedings 1988. Ann Arbor: Morgan Kaufmann, 1988: 107-120.
[8]   LOH W Y. Classification and regression trees[J]. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2011, 1(1): 14-23. doi: 10.1002/widm.8
[9]   QUINLAN J R. C4.5: programs for machine learning[M]. Massachusetts: Morgan Kaufmann Publishers, 1993.
[10]   VATEEKUL P, KUBAT M. Fast induction of multiple decision trees in text categorization from large scale, imbalanced, and multi-label data[C]// 2009 IEEE International Conference on Data Mining Workshops. Miami, FL: IEEE, 2009: 320-325.
[11]   LIU P, QIU X, HUANG X. Recurrent neural network for text classification with multi-task learning[C]// Proceedings of the 25th International Joint Conference on Artificial Intelligence. New York: IJCAI, 2016: 2873-2879.
[12]   HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780. doi: 10.1162/neco.1997.9.8.1735
[13]   JOHNSON R, ZHANG T. Semi-supervised convolutional neural networks for text categorization via region embedding[J]. Advances in Neural Information Processing Systems, 2015, 28: 919-927
[14]   ZHAO Z, WU Y. Attention-based convolutional neural networks for sentence classification[C]// Proceedings of the 17th Annual Conference of the International Speech Communication Association. San Francisco: INTERSPEECH, 2016: 705-709.
[15]   XUE W, ZHOU W, LI T, et al. MTNA: a neural multi-task model for aspect category classification and aspect term extraction on restaurant reviews[C]// Proceedings of the 8th International Joint Conference on Natural Language Processing. Taipei: IJCNLP, 2017: 151-156.
[16]   ZHOU J, CUI G, HU S, et al. Graph neural networks: a review of methods and applications[J]. AI Open, 2020, 1: 57-81. doi: 10.1016/j.aiopen.2021.01.001
[17]   WU Z, PAN S, CHEN F, et al. A comprehensive survey on graph neural networks[J]. IEEE Transactions on Neural Networks and Learning Systems, 2020, 32(1): 4-24
[18]   HAMILTON W L, YING R, LESKOVEC J. Inductive representation learning on large graphs[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach: NIPS, 2017: 1025-1035.
[19]   BI Z, ZHANG T, ZHOU P, et al. Knowledge transfer for out-of-knowledge-base entities: improving graph-neural-network-based embedding using convolutional layers[J]. IEEE Access, 2020, 8: 159039-159049. doi: 10.1109/ACCESS.2020.3019592
[20]   SCARSELLI F, GORI M, TSOI A C, et al. The graph neural network model[J]. IEEE Transactions on Neural Networks, 2008, 20(1): 61-80
[21]   KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks[EB/OL]. (2017-02-22). https://arxiv.org/abs/1609.02907.
[22]   BOJANOWSKI P, GRAVE E, JOULIN A, et al. Bag of tricks for efficient text classification[C]// Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. Valencia: EACL, 2017: 427-431.
[23]   SHEN D, WANG G, WANG W, et al. Baseline needs more love: on simple word-embedding-based models and associated pooling mechanisms[EB/OL]. (2018-05-24). https://arxiv.org/abs/1805.09843.
[24]   RAGESH R, SELLAMANICKAM S, IYER A, et al. HeteGCN: heterogeneous graph convolutional networks for text classification[C]// Proceedings of the 14th ACM International Conference on Web Search and Data Mining. Queensland: WSDM, 2021: 860-868.
[25]   KINGMA D P, BA J. Adam: a method for stochastic optimization[EB/OL]. (2014-12-22). https://arxiv.org/abs/1412.6980v3.