Journal of Zhejiang University (Engineering Science), 2023, Vol. 57, Issue 5: 911-920    DOI: 10.3785/j.issn.1008-973X.2023.05.007
Computer Technology and Control Engineering
Knowledge representation learning method integrating textual description and hierarchical type
Song LI1, Shi-tai SHU1, Xiao-hong HAO1, Zhong-xiao HAO1,2
1. School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
2. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
Abstract:

Existing knowledge representation methods consider only the triplet itself or a single kind of additional information, and do not make full use of external information to semantically supplement the knowledge representation. A knowledge representation learning method integrating textual description information and hierarchical type information was therefore proposed. A convolutional neural network (CNN) was used to extract feature information from the textual descriptions. A CNN based on an attention mechanism was used to distinguish the feature reliability of different relations, so as to enhance the representation of the entity-relation structure vectors in the existing knowledge graph and obtain rich semantic information. A weighted hierarchical encoder was used to construct the hierarchical type projection matrices, combining all hierarchical type projection matrices of an entity with the relation-specific type constraints. Link prediction and triplet classification were performed on the WN18, WN18RR, FB15K, FB15K-237 and YAGO3-10 datasets to analyze and verify the validity of the proposed model. The experimental results showed that, in the entity prediction experiment, the proposed model reduced MeanRank (Filter) by 11.8% and increased Hits@10 by 3.5% compared with the TransD model; in the triplet classification experiment, the classification accuracy of the proposed model was 8.4% higher than that of the DKRL model and 8.5% higher than that of the TKRL model, which fully demonstrates that exploiting external multi-source information can improve the knowledge representation ability.

Key words: knowledge graph; knowledge representation; multi-source information fusion; representation learning; link prediction
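The abstract only names the ingredients of the model. As a rough illustration of what such a pipeline can look like in code (a CNN over an entity's textual description, a relation-dependent attention weight over how much to trust that description, and a TransE-style translation score), the following Python sketch may be helpful. All class and variable names (TextCNNEncoder, DescTransE, desc_tokens) are hypothetical, and the kernel size, pooling and gating choices are assumptions rather than the ETLKRL architecture itself.

```python
# A minimal sketch, NOT the authors' ETLKRL implementation: it only illustrates
# the ingredients the abstract names -- a CNN text encoder, a relation-dependent
# attention weight, and a TransE-style translation score.
import torch
import torch.nn as nn


class TextCNNEncoder(nn.Module):
    """Encode an entity's textual description into a vector with a 1-D CNN."""

    def __init__(self, vocab_size: int, word_dim: int, out_dim: int, kernel: int = 3):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim, padding_idx=0)
        self.conv = nn.Conv1d(word_dim, out_dim, kernel_size=kernel, padding=kernel // 2)

    def forward(self, desc_tokens: torch.Tensor) -> torch.Tensor:
        # desc_tokens: (batch, seq_len) word ids of the description
        x = self.word_emb(desc_tokens).transpose(1, 2)        # (batch, word_dim, seq_len)
        x = torch.relu(self.conv(x))                          # (batch, out_dim, seq_len)
        return x.max(dim=2).values                            # max-pooling over positions


class DescTransE(nn.Module):
    """Translation-style score that mixes structure and description vectors."""

    def __init__(self, n_ent: int, n_rel: int, vocab_size: int, dim: int = 100):
        super().__init__()
        self.ent_emb = nn.Embedding(n_ent, dim)
        self.rel_emb = nn.Embedding(n_rel, dim)
        self.text_enc = TextCNNEncoder(vocab_size, word_dim=50, out_dim=dim)
        # Relation-dependent gate playing the role of an attention weight over
        # "how much to trust" the description-based representation.
        self.att = nn.Linear(dim, 1)

    def entity_repr(self, ent_ids, desc_tokens, rel_ids):
        s = self.ent_emb(ent_ids)                               # structure-based vector
        d = self.text_enc(desc_tokens)                          # description-based vector
        alpha = torch.sigmoid(self.att(self.rel_emb(rel_ids)))  # (batch, 1) in (0, 1)
        return alpha * d + (1.0 - alpha) * s

    def score(self, h_ids, h_desc, r_ids, t_ids, t_desc):
        h = self.entity_repr(h_ids, h_desc, r_ids)
        t = self.entity_repr(t_ids, t_desc, r_ids)
        r = self.rel_emb(r_ids)
        return torch.norm(h + r - t, p=2, dim=1)                # lower = more plausible


if __name__ == "__main__":
    model = DescTransE(n_ent=100, n_rel=10, vocab_size=500)
    h = torch.tensor([1, 2]); r = torch.tensor([0, 3]); t = torch.tensor([4, 5])
    h_desc = torch.randint(1, 500, (2, 20)); t_desc = torch.randint(1, 500, (2, 20))
    print(model.score(h, h_desc, r, t, t_desc))                 # one score per triplet
```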
Received: 2022-05-08    Published: 2023-05-09
CLC: TP 391
Supported by: National Natural Science Foundation of China (61872105, 62072136); Natural Science Foundation of Heilongjiang Province (LH2020F047); Heilongjiang Provincial Scientific Research Foundation for Returned Overseas Scholars (LC2018030); Science and Technology Research Project of Henan Province (232102210068); National Key Research and Development Program of China (2020YFB1710200)
About the author: LI Song (1977—), male, professor, Ph.D., research interests: spatio-temporal databases and knowledge graphs. orcid.org/0000-0002-3239-0504. E-mail: lisongbeifen@163.com
Cite this article:

Song LI, Shi-tai SHU, Xiao-hong HAO, Zhong-xiao HAO. Knowledge representation learning method integrating textual description and hierarchical type [J]. Journal of Zhejiang University (Engineering Science), 2023, 57(5): 911-920.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2023.05.007        https://www.zjujournals.com/eng/CN/Y2023/V57/I5/911

Fig. 1  Overall framework of the ETLKRL model
Fig. 2  Example of a textual description in Freebase
Fig. 3  Example of hierarchical types
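A common way to realize the weighted hierarchical encoder mentioned in the abstract (and illustrated by the type paths in Fig. 3) is the TKRL-style construction: each level of an entity's type path gets its own sub-matrix, and the levels are summed with decaying weights so that more specific sub-types contribute more. The sketch below assumes that construction with toy, hypothetical type paths; it is not necessarily the exact scheme used in the paper.

```python
# A minimal sketch of a weighted hierarchical encoder (TKRL-style assumption),
# not necessarily the paper's exact construction: each level of an entity's
# type path (e.g. /people/person/politician) gets its own projection matrix,
# and the levels are combined with decaying, normalized weights.
import numpy as np

rng = np.random.default_rng(0)
DIM = 4  # embedding dimension (toy size)

# Hypothetical sub-type projection matrices, one per node in the type hierarchy.
subtype_matrices = {
    "people": rng.standard_normal((DIM, DIM)),
    "person": rng.standard_normal((DIM, DIM)),
    "politician": rng.standard_normal((DIM, DIM)),
}

def weighted_hierarchical_projection(type_path, beta=0.5):
    """Combine the sub-type matrices along one type path.

    Weights decay from the most specific level (last element) upwards and are
    normalized so that they sum to 1.
    """
    raw = np.array([beta ** i for i in range(len(type_path))])[::-1]
    weights = raw / raw.sum()
    return sum(w * subtype_matrices[c] for w, c in zip(weights, type_path))

# Relation-specific constraint: for a given relation, project the head entity
# with the matrix built from the type path that this relation expects.
M_head = weighted_hierarchical_projection(["people", "person", "politician"])
h = rng.standard_normal(DIM)          # structure embedding of the head entity
h_proj = M_head @ h                   # type-constrained head representation
print(h_proj)
```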
Dataset | #Entities | #Relations | #Train | #Valid | #Test
WN18 | 40 943 | 18 | 141 442 | 5 000 | 5 000
WN18RR | 40 943 | 11 | 93 003 | 5 000 | 5 000
FB15K | 14 951 | 1 345 | 483 142 | 50 000 | 59 071
FB15K-237 | 14 541 | 237 | 272 115 | 17 535 | 20 466
YAGO3-10 | 123 182 | 37 | 302 710 | 500 000 | 500 000
Table 1  Statistics of the datasets
Model | WN18 MeanRank (Raw/Filter) | WN18 Hits@10/% (Raw/Filter) | FB15K MeanRank (Raw/Filter) | FB15K Hits@10/% (Raw/Filter)
TransE | 263 / 251 | 75.4 / 89.2 | 243 / 125 | 34.9 / 47.1
TransH | 318 / 303 | 75.4 / 86.5 | 210 / 81 | 41.6 / 59.0
TransR | 238 / 225 | 79.8 / 92.0 | 198 / 77 | 48.2 / 68.7
TransD | 224 / 212 | 79.6 / 92.2 | 194 / 91 | 53.4 / 77.3
DistMult | 902 / — | 93.6 / — | 97 / — | 82.4 / —
ConvE | 504 / — | 95.5 / — | 64 / — | 87.3 / —
DKRL | — / — | — / — | 200 / 113 | 44.3 / 57.6
TKRL | — / — | — / — | 202 / 87 | 50.3 / 73.4
ETLKRL | 213 / 187 | 94.7 / 95.7 | 52 / 44 | 85.7 / 87.4
Table 2  Entity prediction results of different models on WN18 and FB15K
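The MeanRank and Hits@10 columns in Table 2 (and Tables 4 and 5) follow the standard link-prediction protocol: replace one side of each test triplet with every candidate entity, rank the true entity by the model score, and, in the Filter setting, discard corrupted triplets that are themselves true. The sketch below shows that generic protocol with a placeholder score function; it is the conventional evaluation recipe, not code from the paper.

```python
# Generic raw/filtered ranking evaluation (standard protocol, not the paper's code).
# `score(h, r, t)` is a placeholder: lower score = more plausible triplet.
import numpy as np

def evaluate_tail_prediction(test_triplets, all_true, n_entities, score):
    ranks_raw, ranks_filt = [], []
    for h, r, t in test_triplets:
        scores = np.array([score(h, r, cand) for cand in range(n_entities)])
        # Raw rank: position of the true tail among all candidate entities.
        rank_raw = int((scores < scores[t]).sum()) + 1
        # Filtered rank: ignore candidates that also form a true triplet.
        mask = np.array([(h, r, cand) in all_true and cand != t
                         for cand in range(n_entities)])
        rank_filt = int((scores[~mask] < scores[t]).sum()) + 1
        ranks_raw.append(rank_raw)
        ranks_filt.append(rank_filt)
    ranks_raw, ranks_filt = np.array(ranks_raw), np.array(ranks_filt)
    return {
        "MeanRank(Raw)": ranks_raw.mean(),
        "MeanRank(Filter)": ranks_filt.mean(),
        "Hits@10(Raw)": (ranks_raw <= 10).mean() * 100,
        "Hits@10(Filter)": (ranks_filt <= 10).mean() * 100,
    }

if __name__ == "__main__":
    # Tiny toy example with a random TransE-style score function.
    rng = np.random.default_rng(0)
    emb = rng.standard_normal((20, 8))          # 20 entities
    rel = rng.standard_normal((3, 8))           # 3 relations
    score = lambda h, r, t: np.linalg.norm(emb[h] + rel[r] - emb[t])
    triples = {(0, 0, 1), (2, 1, 3), (4, 2, 5)}
    print(evaluate_tail_prediction(list(triples), triples, 20, score))
```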
Model | Left (head) entity prediction (1-1 / 1-N / N-1 / N-N) | Right (tail) entity prediction (1-1 / 1-N / N-1 / N-N)
SE | 35.6 / 62.6 / 17.2 / 37.5 | 34.9 / 14.6 / 68.3 / 41.3
TransE | 43.7 / 65.7 / 18.2 / 47.2 | 43.7 / 19.7 / 66.7 / 50.0
TransH (bern) | 66.8 / 87.6 / 28.7 / 64.5 | 65.5 / 39.8 / 83.3 / 67.2
TransR (bern) | 78.8 / 89.2 / 34.1 / 69.2 | 79.2 / 37.4 / 90.4 / 72.1
TransD (bern) | 86.1 / 95.5 / 39.8 / 78.5 | 85.4 / 50.6 / 94.4 / 81.2
ETLKRL | 90.3 / 94.7 / 48.0 / 88.4 | 76.5 / 62.5 / 90.2 / 89.4
Table 3  Hits@10/% on FB15K by relation category
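The 1-1 / 1-N / N-1 / N-N breakdown in Table 3 is conventionally obtained with the mapping-property heuristic from the TransE/TransH papers: for each relation, compute the average number of heads per tail and tails per head, and call a side "1" when that average falls below a threshold of about 1.5. The snippet below sketches that heuristic; the threshold and the rule are assumptions of that convention, not something stated in this paper.

```python
# Conventional 1-1 / 1-N / N-1 / N-N categorisation of relations (TransE/TransH
# heuristic, assumed here; the paper itself does not spell out the rule).
from collections import defaultdict

def relation_categories(triples, threshold=1.5):
    heads_per_tail = defaultdict(set)   # (r, t) -> set of heads
    tails_per_head = defaultdict(set)   # (r, h) -> set of tails
    for h, r, t in triples:
        heads_per_tail[(r, t)].add(h)
        tails_per_head[(r, h)].add(t)

    categories = {}
    relations = {r for _, r, _ in triples}
    for r in relations:
        hpt = [len(s) for (rr, _), s in heads_per_tail.items() if rr == r]
        tph = [len(s) for (rr, _), s in tails_per_head.items() if rr == r]
        avg_hpt = sum(hpt) / len(hpt)   # average heads per tail
        avg_tph = sum(tph) / len(tph)   # average tails per head
        head_side = "1" if avg_hpt < threshold else "N"
        tail_side = "1" if avg_tph < threshold else "N"
        categories[r] = f"{head_side}-{tail_side}"
    return categories

print(relation_categories([(0, "born_in", 1), (2, "born_in", 1), (3, "born_in", 4)]))
# born_in: several heads share one tail, each head has one tail -> "N-1"
```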
Model | MeanRank | Hits@10/% | Hits@3/% | Hits@1/%
TransE | 7 113 | 42 | 25 | 12
ConvE | 2 792 | 66 | 56 | 45
ComplEx | 6 351 | 55 | 40 | 26
DistMult | 5 926 | 54 | 38 | 24
ETLKRL | 3 078 | 62 | 68 | 59
Table 4  Link prediction results of different models on YAGO3-10
Model | MeanRank (Raw/Filter) | Hits@1/% (Raw/Filter)
TransE | 2.91 / 2.53 | 69.5 / 90.2
TransH | 8.25 / 7.91 | 60.3 / 72.5
DKRL (CBOW) | 2.85 / 2.51 | 65.3 / 82.7
DKRL (CNN+TransE) | 2.41 / 2.03 | 69.8 / 90.8
ETLKRL | 2.23 / 1.86 | 71.7 / 93.9
Table 5  Relation prediction results of different models on FB15K
Model | ACC/% on WN18 | ACC/% on FB15K
TransE | 91.2 | 77.6
TransH | 88.3 | 79.9
TransR | 92.6 | 82.1
TransD | — | 88.0
DKRL | 92.8 | 86.3
TKRL | — | 85.7
ETLKRL | 96.7 | 93.6
Table 6  Triplet classification accuracy of different models on WN18 and FB15K
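Triplet classification accuracy such as the ACC values in Table 6 is conventionally computed with relation-specific score thresholds learned on a validation set: a triplet is predicted positive when its score is below the threshold of its relation. The sketch below illustrates that standard protocol with a placeholder score function; it does not reflect implementation details from the paper.

```python
# Standard threshold-based triplet classification (conventional protocol,
# assumed here; `score` is a placeholder where lower = more plausible).
import numpy as np

def learn_thresholds(valid_triples, valid_labels, score):
    """Pick, per relation, the score threshold that maximises validation accuracy."""
    by_rel = {}
    for (h, r, t), y in zip(valid_triples, valid_labels):
        by_rel.setdefault(r, []).append((score(h, r, t), y))
    thresholds = {}
    for r, pairs in by_rel.items():
        candidates = sorted(s for s, _ in pairs)
        best_acc, best_th = -1.0, candidates[0]
        for th in candidates:
            acc = np.mean([(s <= th) == bool(y) for s, y in pairs])
            if acc > best_acc:
                best_acc, best_th = acc, th
        thresholds[r] = best_th
    return thresholds

def classify(test_triples, thresholds, score):
    # A triplet is accepted if its score is below its relation's threshold.
    return [score(h, r, t) <= thresholds[r] for h, r, t in test_triples]

if __name__ == "__main__":
    score = lambda h, r, t: abs(h + r - t)          # toy score on integer ids
    valid = [(1, 2, 3), (1, 2, 9), (4, 1, 5), (4, 1, 0)]
    labels = [1, 0, 1, 0]
    th = learn_thresholds(valid, labels, score)
    preds = classify([(2, 2, 4), (2, 2, 9)], th, score)
    acc = np.mean([p == bool(y) for p, y in zip(preds, [1, 0])])
    print(th, preds, f"ACC = {acc:.2f}")
```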
Fig. 4  Influence of parameter scale and number of iterations on the model
1 CHEN X J, JIA S B, XIANG Y. A review: knowledge reasoning over knowledge graph [J]. Expert Systems with Applications, 2020, 141: 1-21.
2 SHU Shi-tai, LI Song, HAO Xiao-hong, et al. Knowledge graph embedding technology: a review [J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(11): 2048-2062. (in Chinese)
3 LIU Zhi-yuan, SUN Mao-song, LIN Yan-kai, et al. Research progress of knowledge representation learning [J]. Journal of Computer Research and Development, 2016, 53(2): 247-261. (in Chinese)
4 BORDES A, USUNIER N, GARCÍA-DURÁN A, et al. Translating embeddings for modeling multi-relational data [C]// Proceedings of the 27th International Conference on Neural Information Processing Systems. Lake Tahoe: MITP, 2013: 2787-2795.
5 WANG Z, ZHANG J W, FENG J L, et al. Knowledge graph embedding by translating on hyperplanes [C]// Proceedings of the 28th AAAI Conference on Artificial Intelligence. Québec: AAAI, 2014: 1112-1119.
6 LIN Y K, LIU Z Y, SUN M S, et al. Learning entity and relation embeddings for knowledge graph completion [C]// Proceedings of the 29th AAAI Conference on Artificial Intelligence. Austin: AAAI, 2015: 2181-2187.
7 JI G L, HE S Z, XU L H, et al. Knowledge graph embedding via dynamic mapping matrix [C]// Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and 7th International Joint Conference on Natural Language Processing. Beijing: ACL, 2015: 687-696.
8 TransA: an adaptive approach for knowledge graph embedding [EB/OL].[2022-05-02]. http://arxiv.org/abs/1509.05490.
9 XIAO H, HUANG M L, ZHU X Y. TransG: a generative model for knowledge graph embedding [C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Berlin: ACL, 2016: 2316-2325.
10 WANG Z, ZHANG J W, FENG J L, et al. Knowledge graph and text jointly embedding[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Doha: ACL, 2014: 1591-1601.
11 XIE R B, LIU Z Y, JIA J, et al. Representation learning of knowledge graphs with entity descriptions [C]// Proceedings of the 30th AAAI Conference on Artificial Intelligence. Phoenix: AAAI, 2016: 2659-2665.
12 XIE R B, LIU Z Y, SUN M S. Representation learning of knowledge graphs with hierarchical types [C]// Proceedings of the 25th International Joint Conference on Artificial Intelligence. New York: AAAI, 2016: 2965-2971.
13 JI S X, PAN S R, CAMBRIA E, et al. A survey on knowledge graphs: representation, acquisition and applications [J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 33(2): 494-514.
14 ZENG D J, LIU K, CHEN Y B, et al. Distant supervision for relation extraction via piecewise convolutional neural networks [C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Lisbon: ACL, 2015: 1753-1762.
15 JIANG X T, WANG Q, LI P, et al. Relation extraction with multi-instance multi-label convolutional neural networks [C]// Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics. Osaka: COC, 2016: 1471-1480.
16 HAN X, YU P F, LIU Z Y, et al. Hierarchical relation extraction with coarse-to-fine grained attention [C]// Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels: ACL, 2018: 2236-2245.
17 ZHANG N Y, DENG S M, SUN Z L, et al. Long-tail relation extraction via knowledge graph embeddings and graph convolution networks [C]// Proceedings of the 2019 Conference of the NAACL: Human Language Technologies. Minneapolis: ACL, 2019: 3016-3025.
18 TANG X, CHEN L, CUI J, et al. Knowledge representation learning with entity descriptions, hierarchical types, and textual relations [J]. Information Processing and Management, 2019, 56: 809-822. doi: 10.1016/j.ipm.2019.01.005
19 FAN M, ZHOU Q, CHANG E, et al. Transition-based knowledge graph embedding with relational mapping properties [C]// Proceedings of the 28th Pacific Asia Conference on Language, Information and Computation. Phuket: ACL, 2014: 328-337.
20 FENG J, HUANG M L, WANG M D, et al. Knowledge graph embedding by flexible translation [C]// Proceedings of the 15th International Conference on Principles of Knowledge Representation and Reasoning. Cape Town: AAAI, 2016: 557-560.
21 NICKEL M, TRESP V, KRIEGEL H. A three-way model for collective learning on multi-relational data [C]// Proceedings of the 28th International Conference on Machine Learning. Washington: ACM, 2011: 809-816.
22 YANG B S, YIH W, HE X D, et al. Embedding entities and relations for learning and inference in knowledge bases [C]// Proceedings of the 3rd International Conference on Learning Representations. San Diego: [s.n.], 2015: 1-12.
23 TROUILLON T, WELBL J, RIEDEL S, et al. Complex embeddings for simple link prediction [C]// Proceedings of the 33rd International Conference on Machine Learning. New York: IMLS, 2016: 2071-2080.
24 ZHANG Z, ZHUANG F Z, QU M, et al. Knowledge graph embedding with hierarchical relation structure [C]// Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels: ACL, 2018: 3198-3207.
25 FENG J, HUANG M L, YANG Y, et al. GAKE: Graph aware knowledge embedding [C]// Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics. Osaka: COC, 2016: 641-651.
26 XIE R B, LIU Z Y, LUAN H B, et al. Image embodied knowledge representation learning [C]// Proceedings of the 26th International Joint Conference on Artificial Intelligence. Melbourne: AAAI, 2017: 3140-3146.
27 TOUTANOVA K, LIN X V, YIH W T, et al. Compositional learning of embeddings for relation paths in knowledge base and text [C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Berlin: ACL, 2016: 1434-1444.
28 XIA Guang-bing, LI Rui-xuan, GU Xi-wu, et al. Knowledge representation learning based on multi-source information combination [J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(3): 591-597. (in Chinese)
29 DU Wen-qing, LI Bi-cheng, WANG Rui. Representation learning of knowledge graph integrating entity description and entity type [J]. Journal of Chinese Information Processing, 2020, 34(7): 50-59. (in Chinese)
30 WANG P, ZHOU J. JECI++: a modified joint knowledge graph embedding model for concepts and instances [J]. Big Data Research, 2021, 24: 1-10.
31 ZHAO F, XU T, JIN L J Q, et al. Convolutional network embedding of text-enhanced representation for knowledge graph completion [J]. IEEE Internet of Things Journal, 2021, 8(23): 16758-16769. doi: 10.1109/JIOT.2020.3039750
32 MAHDISOLTANI F, BIEGA J, SUCHANEK F. Yago3: A knowledge base from multilingual wikipedias [C]// Proceedings of the 7th Biennial Conference on Innovative Data Systems Research. Asilomar: [s. n.], 2015: 1-11.