A new knowledge-enhanced graph convolutional network (KEGCN) classification model was proposed for the text classification problem. In the KEGCN model, a text graph containing word nodes, document nodes, and external entity nodes was first constructed over the entire text set, with different similarity measures used to weight the edges between different types of nodes. The constructed text graph was then fed into a two-layer graph convolutional network to learn node representations, which were used for classification. The KEGCN model introduced external knowledge into the graph construction, captured long-distance discontinuous global semantic information, and was the first work to introduce knowledge information into the graph convolutional network for text classification tasks. Text classification experiments were conducted on four large-scale real-world datasets (20NG, OHSUMED, R52 and R8), and the results showed that the classification accuracy of the KEGCN model surpassed that of all baseline models. The results show that integrating knowledge information into the graph convolutional neural network helps to learn more accurate text representations and improves the accuracy of text classification.
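As a rough illustration of the classification step described above, the following is a minimal NumPy sketch of the standard two-layer GCN propagation rule of Kipf and Welling, Z = softmax(A_norm ReLU(A_norm X W0) W1), applied to a small heterogeneous graph. It is not the authors' implementation: the adjacency matrix is assumed to already hold the precomputed edge weights between word, document, and entity nodes, and all names and sizes (gcn_two_layer, the 6-node toy graph, the feature and class dimensions) are illustrative placeholders.

import numpy as np

def normalize_adjacency(A):
    # Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}.
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def gcn_two_layer(A, X, W0, W1):
    # Forward pass: Z = softmax(A_norm ReLU(A_norm X W0) W1).
    A_norm = normalize_adjacency(A)
    H = np.maximum(A_norm @ X @ W0, 0.0)           # first graph convolution + ReLU
    logits = A_norm @ H @ W1                       # second graph convolution
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)        # row-wise softmax over classes

# Toy example: 6 nodes (words, documents, entities mixed), 8 features, 3 classes.
rng = np.random.default_rng(0)
A = rng.random((6, 6)); A = (A + A.T) / 2          # symmetric weighted graph (placeholder)
X = np.eye(6, 8)                                   # e.g. one-hot node features
W0, W1 = rng.normal(size=(8, 16)), rng.normal(size=(16, 3))
print(gcn_two_layer(A, X, W0, W1).shape)           # (6, 3): per-node class distributions

In a TextGCN-style setup, the entries of A would come from TF-IDF weights on word-document edges and from the similarity scores computed between word and entity nodes, and training would minimize cross-entropy on the labeled document nodes only.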
Tab. 2 Comparison of classification accuracy of the KEGCN model on four datasets
Fig. 2 Comparison of loss and accuracy of four models on the OHSUMED and R52 datasets
Fig. 3 Comparison of ablation experiment results
Fig. 4 Performance comparison of different vector dimensions on the OHSUMED dataset
Fig. 5 Performance comparison of different vector dimensions on the R8 dataset