Journal of Zhejiang University (Engineering Science)  2024, Vol. 58 Issue (3): 449-458    DOI: 10.3785/j.issn.1008-973X.2024.03.002
Computer Technology
SL-tgStore: new temporal knowledge graph storage model
Song LI, Zhe WANG, Liping ZHANG
1. School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
Abstract:

To address the storage problem of temporal knowledge graphs, a storage model named SL-tgStore is proposed that combines the snapshot and log modes. The model consists of several time buckets, and each time bucket consists of a series of time windows. An initial snapshot is introduced in the first time window of a bucket as the basic unit for storing and processing the temporal knowledge graph, and each subsequent time window is stored as an incremental log. A threshold is proposed to decide when a new initial snapshot is generated, that is, when a new time bucket is started, so as to balance the number of initial snapshots against the number of incremental logs. A temporary snapshot generation algorithm is also proposed. The model mitigates both the large memory consumption of the snapshot storage mode and the low query efficiency of the log storage mode. To query SL-tgStore efficiently, four index structures are built on top of the model. Experiments on four real datasets, together with the theoretical analysis, show that the proposed SL-tgStore storage model is efficient.

Key words: temporal knowledge graph; resource description framework (RDF); storage model; log mode; snapshot mode
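
The bucket, window, snapshot and log layout described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `TimeBucket`, `SnapshotLogStore`, `append_window` and `temporary_snapshot` are hypothetical, and the threshold is assumed here to be a cap `theta` on the number of logged changes per bucket, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Set, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)
Change = Tuple[str, Triple]     # ("+", triple) adds it, ("-", triple) removes it


@dataclass
class TimeBucket:
    start: int                      # index of the window holding the initial snapshot
    snapshot: FrozenSet[Triple]     # full graph state at that window
    logs: List[List[Change]] = field(default_factory=list)  # one delta per later window


class SnapshotLogStore:
    """Hypothetical snapshot-plus-log store; theta is an assumed threshold."""

    def __init__(self, theta: int) -> None:
        self.theta = theta
        self.buckets: List[TimeBucket] = []
        self.current: Set[Triple] = set()   # graph state after the latest window
        self.next_window = 0

    def append_window(self, changes: List[Change]) -> None:
        # Apply the window's changes to the running state.
        for op, triple in changes:
            if op == "+":
                self.current.add(triple)
            else:
                self.current.discard(triple)
        bucket = self.buckets[-1] if self.buckets else None
        logged = sum(len(log) for log in bucket.logs) if bucket else 0
        if bucket is None or logged + len(changes) > self.theta:
            # Threshold exceeded (or first window): open a new bucket whose
            # first window is stored as a full snapshot.
            self.buckets.append(TimeBucket(self.next_window, frozenset(self.current)))
        else:
            # Otherwise store this window as an incremental log.
            bucket.logs.append(list(changes))
        self.next_window += 1

    def temporary_snapshot(self, window: int) -> FrozenSet[Triple]:
        # Rebuild the graph at `window` from the nearest earlier initial
        # snapshot plus the logs of the windows in between.
        if not 0 <= window < self.next_window:
            raise IndexError("unknown time window")
        bucket = next(b for b in reversed(self.buckets) if b.start <= window)
        state = set(bucket.snapshot)
        for log in bucket.logs[: window - bucket.start]:
            for op, triple in log:
                if op == "+":
                    state.add(triple)
                else:
                    state.discard(triple)
        return frozenset(state)
```

In this sketch a smaller `theta` produces more snapshots (more memory, less log replay per query), while a larger `theta` produces longer logs (less memory, more replay), which is the trade-off the threshold is meant to balance.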
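Received: 2023-01-31    Published: 2024-03-05
CLC:  TP 391
Funding: National Natural Science Foundation of China (62072136); Natural Science Foundation of Heilongjiang Province (LH2023F031); National Key Research and Development Program of China (2020YFB1710200).
About the author: LI Song (b. 1977), male, professor, Ph.D.; research interests: data query and knowledge graphs. orcid.org/0000-0002-3239-0504. E-mail: lisongbeifen@163.com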

Cite this article:

Song LI, Zhe WANG, Liping ZHANG. SL-tgStore: new temporal knowledge graph storage model[J]. Journal of Zhejiang University (Engineering Science), 2024, 58(3): 449-458.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2024.03.002
https://www.zjujournals.com/eng/CN/Y2024/V58/I3/449

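Fig. 1  Schematic diagram of a temporal knowledge graph
Fig. 2  SL-tgStore storage structure
Fig. 3  RDF snapshot storage
Fig. 4  RDF log storage
Fig. 5  Schematic diagram of the Ttg-hash index
Fig. 6  Schematic diagram of the Vtg-tree index
Fig. 7  Schematic diagram of the Ptg-hash index
Fig. 8  Schematic diagram of the Ltg-tree index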
Tab. 1  Information of experimental datasets

Dataset     |V|/M    |E|/M    |T|
GDELT       2.3      32.6     15 min
ICEWS       1.2      25.2     daily
Wikidata    1.1      7.8      yearly
YAGO        2.9      44.5     yearly
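Fig. 9  Comparison of memory overhead and query time under different θ
Fig. 10  Effect of indexes on query time
Fig. 11  Comparison of memory overhead and query time across models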