Journal of Zhejiang University (Engineering Science)  2025, Vol. 59 Issue (1): 70-78    DOI: 10.3785/j.issn.1008-973X.2025.01.007
Computer and Control Engineering
Efficient graph stream summarization technology for periodic edge queries
Zhuo LI (1, 2, 3, 4), Shuaijun LIU (1, 4), Kaihua LIU (4, 5)
1. School of Microelectronics, Tianjin University, Tianjin 300072, China
2. Pengcheng Laboratory, Shenzhen 518000, China
3. Tianjin Microelectronics Technology Key Laboratory of Imaging and Perception, Tianjin 300072, China
4. Tianjin Digital Information Technology Research Center, Tianjin 300072, China
5. School of Information and Intelligent Engineering, Tianjin Ren’ai College, Tianjin 301636, China
Abstract:

A graph stream summarization technique for periodic edge queries, named the periodic interaction matrix (PIM), was proposed because current graph stream summarization techniques cannot measure graph streams efficiently and accurately with small memory and cannot answer periodic edge queries. PIM is a hybrid structure: a two-dimensional adjacency matrix stores heavy edges and a three-dimensional adjacency matrix stores light edges, which improves memory efficiency. The two-dimensional adjacency matrix retains the identifier, weight, and timestamp of each heavy edge, so that a variety of queries, including periodic edge queries, can be answered in real time. A replacement strategy based on weight and time was designed, and shared hashing was used to improve query accuracy and the efficiency of insertion and query. Experimental results show that PIM completes a variety of graph stream query tasks efficiently and in real time with small memory, and accurately recalls all heavy hitter edges, heavy hitter nodes, and periodic edges. Compared with current graph stream summarization techniques, PIM reduces the average relative error of query tasks by 91.41% to 99.54%.

Key words: graph stream; graph stream summarization; periodic edge measurement; real-time query; adjacency matrix
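
The hybrid layout described in the abstract can be illustrated with a small sketch. This is only one possible reading of the abstract, not the PIM implementation from the paper: the class name `HybridSketch`, the `max_age` parameter, the bit-shift scheme used to share one hash between both matrices, the concrete replacement rule (evict when the resident edge is stale or lighter than the newcomer's estimate), and the "repeated identical gap" test for periodicity are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


def node_hash(node: str) -> int:
    """Hash a node id once; both matrices derive their indices from this value."""
    digest = hashlib.blake2b(node.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


@dataclass
class HeavyCell:
    src: str
    dst: str
    weight: int
    last_ts: int
    interval: int = -1  # most recent gap between two arrivals (-1: unknown)
    repeats: int = 0    # number of consecutive gaps equal to `interval`


class HybridSketch:
    """Illustrative heavy/light summary (assumed design, not the paper's PIM)."""

    def __init__(self, heavy_width=64, light_width=32, light_depth=3, max_age=1000):
        self.hw, self.lw, self.ld = heavy_width, light_width, light_depth
        self.max_age = max_age
        # 2D matrix: one identified heavy edge per cell.
        self.heavy = [[None] * heavy_width for _ in range(heavy_width)]
        # 3D matrix: light_depth layers of anonymous counters.
        self.light = [[[0] * light_width for _ in range(light_width)]
                      for _ in range(light_depth)]

    def _light_cells(self, hs, hd):
        # Each layer reuses the shared hashes, reading a different bit range.
        for k in range(self.ld):
            shift = 8 * (k + 1)
            yield k, (hs >> shift) % self.lw, (hd >> shift) % self.lw

    def _light_add(self, hs, hd, w):
        for k, r, c in self._light_cells(hs, hd):
            self.light[k][r][c] += w

    def _light_query(self, hs, hd):
        return min(self.light[k][r][c] for k, r, c in self._light_cells(hs, hd))

    def insert(self, src, dst, w, ts):
        hs, hd = node_hash(src), node_hash(dst)
        r, c = hs % self.hw, hd % self.hw
        cell = self.heavy[r][c]
        if cell is None:
            self.heavy[r][c] = HeavyCell(src, dst, w, ts)
        elif (cell.src, cell.dst) == (src, dst):
            gap = ts - cell.last_ts
            if gap == cell.interval:
                cell.repeats += 1
            else:
                cell.interval, cell.repeats = gap, 1
            cell.weight += w
            cell.last_ts = ts
        else:
            # Collision: count the newcomer as a light edge first.
            self._light_add(hs, hd, w)
            est = self._light_query(hs, hd)
            stale = ts - cell.last_ts > self.max_age
            if stale or est > cell.weight:
                # Move the newcomer's estimate out of the light part,
                # push the resident edge down, and promote the newcomer.
                self._light_add(hs, hd, -est)
                self._light_add(node_hash(cell.src), node_hash(cell.dst), cell.weight)
                self.heavy[r][c] = HeavyCell(src, dst, est, ts)

    def edge_weight(self, src, dst):
        hs, hd = node_hash(src), node_hash(dst)
        cell = self.heavy[hs % self.hw][hd % self.hw]
        if cell is not None and (cell.src, cell.dst) == (src, dst):
            return cell.weight
        return self._light_query(hs, hd)

    def periodic_edges(self, min_repeats=3):
        return [(c.src, c.dst, c.interval) for row in self.heavy for c in row
                if c is not None and c.repeats >= min_repeats]
```

In this sketch, only edges resident in the two-dimensional matrix carry identifiers and timestamps, so only they can be reported as periodic; light edges contribute approximate weights through the counter layers.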
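Received: 2024-01-17    Published: 2025-01-18
CLC:  TP 393.0
Funding: National Key Research and Development Program of China (2022YFB2901100, 2022ZD0115303); Major Key Project of Pengcheng Laboratory on Computing Power Network (PCL2023A06).
About the author: LI Zhuo (born 1984), male, associate professor, Ph.D., works on network traffic engineering. orcid.org/0000-0002-5535-5920. E-mail: zli@tju.edu.cn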

Cite this article:

Zhuo LI, Shuaijun LIU, Kaihua LIU. Efficient graph stream summarization technology for periodic edge queries. Journal of Zhejiang University (Engineering Science), 2025, 59(1): 70-78.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2025.01.007
https://www.zjujournals.com/eng/CN/Y2025/V59/I1/70

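Fig. 1  Data structure of the periodic interaction matrix
Fig. 2  Experiment on the ratio of three-dimensional adjacency matrix memory to total memory
Fig. 3  Average relative error of node weight queries for different graph stream summarization techniques on three datasets
Fig. 4  Average relative error of heavy hitter edge queries for different graph stream summarization techniques on three datasets
Fig. 5  Average relative error of heavy hitter node queries for different graph stream summarization techniques on three datasets
Fig. 6  Average relative error of periodic edge queries for different graph stream summarization techniques on three datasets
Fig. 7  Harmonic mean of heavy hitter edge, heavy hitter node, and periodic edge queries for different graph stream summarization techniques on three datasets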
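
The two metrics named in the figure captions can be computed as in the short sketch below. The page itself does not define them, so this assumes the usual definitions: average relative error as the mean of |estimate - truth| / truth over the queried items, and the harmonic mean as the harmonic mean of precision and recall (the F1 score) for a reported set against the true set. The function names are hypothetical.

```python
def average_relative_error(truth: dict, estimate: dict) -> float:
    """Mean of |estimate - truth| / truth over keys whose true value is nonzero."""
    keys = [k for k, v in truth.items() if v != 0]
    if not keys:
        raise ValueError("no keys with a nonzero true value")
    total = sum(abs(estimate.get(k, 0) - truth[k]) / truth[k] for k in keys)
    return total / len(keys)


def f1_score(reported: set, actual: set) -> float:
    """Harmonic mean of precision and recall (assumed meaning of 'harmonic mean')."""
    hits = len(reported & actual)
    if hits == 0:
        return 0.0
    precision = hits / len(reported)
    recall = hits / len(actual)
    return 2 * precision * recall / (precision + recall)
```

For example, true edge weights `{"e1": 10, "e2": 20}` with estimates `{"e1": 12, "e2": 20}` give relative errors of 0.2 and 0, so the average relative error is 0.1.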
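Average time t/s
Scheme     Insertion    Node weight query   Heavy hitter edge query   Heavy hitter node query   Periodic edge query
PIM        1.51×10^-7   2.41×10^-5          3.42×10^-3                6.07                      3.47×10^-3
PDMatrix   2.34×10^-7   4.81×10^-5          1.10×10^-2                4.44                      2.88
PTCM       2.01×10^-7   2.08×10^-5          3.77                      21.86                     3.19
PCuckoo    5.67×10^-6   9.12×10^-6          3.49                      13.19                     4.99
Periodic   4.08×10^-8   -                   -                         -                         4.15×10^-3
Table 1  Average time cost of insertion and query operations for different graph stream summarization techniques on the Network-Flow dataset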