Knowledge-enhanced graph convolutional neural networks for text classification
Ting WANG, Xiao-fei ZHU*, Gu TANG
College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China
Abstract A new knowledge-enhanced graph convolutional neural network (KEGCN) model was proposed to address the problem of text classification. In the KEGCN model, a text graph containing word nodes, document nodes, and external entity nodes was first constructed over the entire text set, with different similarity measures used for edges between different types of nodes. The constructed text graph was then fed into a two-layer graph convolutional network to learn node representations and perform classification. The KEGCN model introduces external knowledge into the graph construction and thereby captures long-distance, discontinuous global semantic information; it is the first work to introduce knowledge information into graph convolutional networks for classification tasks. Text classification experiments were conducted on four large-scale real-world data sets, 20NG, OHSUMED, R52 and R8, and the results showed that the classification accuracy of the KEGCN model surpassed that of all baseline models. The results demonstrate that integrating knowledge information into graph convolutional neural networks helps learn more accurate text representations and improves the accuracy of text classification.
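To make the pipeline concrete, the following minimal PyTorch sketch shows a two-layer graph convolution of the kind the abstract describes, using the standard Kipf-Welling propagation rule Z = A_hat * ReLU(A_hat * X * W0) * W1. It is an illustrative assumption, not the authors' implementation: the class and variable names (KEGCNSketch, a_hat, n_hidden) are invented here, the TF-IDF/PMI edge weights follow the convention of TextGCN-style models, and the entity-edge similarity is left abstract because the abstract only states that different node types use different similarity measures.

import torch
import torch.nn as nn
import torch.nn.functional as F

class KEGCNSketch(nn.Module):
    """Two-layer GCN over a word/document/entity text graph (sketch)."""

    def __init__(self, n_nodes: int, n_hidden: int, n_classes: int):
        super().__init__()
        # TextGCN-style models use one-hot node features, so the first
        # layer amounts to a lookup of learned node embeddings.
        self.w0 = nn.Linear(n_nodes, n_hidden, bias=False)
        self.w1 = nn.Linear(n_hidden, n_classes, bias=False)

    def forward(self, a_hat: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        # a_hat: symmetrically normalized adjacency D^-1/2 (A+I) D^-1/2,
        # whose edge weights would mix TF-IDF (word-document edges), PMI
        # (word-word edges), and an entity-similarity score (entity edges).
        h = F.relu(a_hat @ self.w0(x))  # first graph convolution
        return a_hat @ self.w1(h)       # second convolution -> class logits

# Usage: train with cross-entropy computed on the document nodes only,
# since word and entity nodes carry no class labels.
n_nodes, n_hidden, n_classes = 1000, 200, 8
model = KEGCNSketch(n_nodes, n_hidden, n_classes)
a_hat = torch.eye(n_nodes)  # stand-in for the real normalized text graph
x = torch.eye(n_nodes)      # one-hot node features
logits = model(a_hat, x)    # shape: (n_nodes, n_classes)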
Received: 12 July 2021
Published: 03 March 2022
Corresponding author:
Xiao-fei ZHU
E-mail: cstingwang2021@163.com; zxf@cqut.edu.cn
Key words:
knowledge embedding,
graph convolutional network,
neural network,
text classification,
natural language processing