基于多重过滤草图的网络超点测量算法

doi:10.3785/j.issn.1008-973X.2025.02.007

浙江大学学报(工学版)

2025, Vol. 59

Issue (2): 289-299 DOI: 10.3785/j.issn.1008-973X.2025.02.007

计算机技术

基于多重过滤草图的网络超点测量算法

李卓1,2,3,4(

),孟晋虎1,4,林盛彦1,高源1,谷大伟1,尤辰至1,刘开华4,5

1. 天津大学微电子学院，天津 300072
2. 鹏城国家实验室，广东深圳 518000
3. 天津市成像与感知微电子技术重点实验室，天津 300072
4. 天津市数字信息技术研究中心，天津 300072
5. 天津仁爱学院信息与智能工程学院，天津 301636

Network superspreader measurement algorithm based on multiple filter sketch

Zhuo LI1,2,3,4(

),Jinhu MENG1,4,Shengyan LIN1,Yuan GAO1,Dawei GU1,Chenzhi YOU1,Kaihua LIU4,5

1. School of Microelectronics, Tianjin University, Tianjin 300072, China
2. The Peng Cheng Laboratory, Shenzhen 518000, China
3. Tianjin Microelectronics Technology Key Laboratory of Imaging and Perception, Tianjin 300072, China
4. Tianjin Digital Information Technology Research Center, Tianjin 300072, China
5. School of Information and Intelligent Engineering, Tianjin Ren’ai College, Tianjin 301636, China

全文: PDF(1716 KB) HTML

摘要：

针对典型草图无法直接应用到超点测量的问题，提出新型的多重过滤草图 (MFS)，给出理论误差分析. MFS以前置分流过滤器来去除重复数据包，在兼顾流频率测量任务的同时完成超点测量. 后级统计过滤器划分为3级子结构，分别用来捕捉不同大小的流，以适应现实中偏态的网络流量分布特征，提升草图的内存利用率及测量精度. 提出非对称的插入和查询算法 (AIQ)，该算法利用流量的偏态分布特征，快速完成插入和查询操作，满足链路转发的速度要求. 实验结果表明，与当下的先进方案相比，所提的多重过滤草图的流频率测量精度提升了2~48倍，大象流测量精度提升了3~53倍，超点测量精度提升了2.0~3.0倍，测量吞吐量与经典草图相当.

关键词： 网络测量; 超点测量; 草图

Abstract:

A novel multiple filter sketch (MFS) was proposed and analyzed for theoretical errors to address the problem that the classic sketches cannot be directly applied to superspreader measurement. The MFS removed repetitive packets by a front-end streaming filter, so that the superspreader measurement can be finished while considering the flow size estimation. The post-stage statistical filter was divided into three levels of substructures to capture flows with different sizes so as to adapt to the skewed network traffic distribution characteristics in reality. The memory utilization and the measurement accuracy of the sketch were improved. An asymmetric insertion and query algorithm (AIQ) was proposed to quickly finish the insertion and query operations by using the skewed distribution characteristics of the traffic, which can meet the requirement of link forwarding speed. The experimental results show that the proposed multiple filter sketch improves the measurement accuracy of flow size by approximately 2 to 48 times, the measurement accuracy of heavy hitters by approximately 3 to 53 times, and the measurement accuracy of superspreader by about 2.0 to 3.0 times, compared with the current state-of-the-art schemes. The measurement throughput of MFS can be comparable to that of the classical sketch.

Key words: network measurement superspreader measurement sketch

收稿日期: 2023-12-03 出版日期: 2025-02-11

CLC:

TP 393

基金资助: 国家重点研发计划资助项目(2022YFB2901100).

作者简介: 李卓（1984—），男，副教授，博士，从事网络流量工程的研究. orcid.org/0000-0002-5535-5920. E-mail：zli@tju.edu.cn

	服务
	把本文推荐给朋友
	加入引用管理器
	E-mail Alert
	作者相关文章
	李卓
	孟晋虎
	林盛彦
	高源
	谷大伟
	尤辰至
	刘开华

引用本文:

李卓,孟晋虎,林盛彦,高源,谷大伟,尤辰至,刘开华. 基于多重过滤草图的网络超点测量算法[J]. 浙江大学学报(工学版), 2025, 59(2): 289-299.

Zhuo LI,Jinhu MENG,Shengyan LIN,Yuan GAO,Dawei GU,Chenzhi YOU,Kaihua LIU. Network superspreader measurement algorithm based on multiple filter sketch. Journal of ZheJiang University (Engineering Science), 2025, 59(2): 289-299.

链接本文:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2025.02.007 或 https://www.zjujournals.com/eng/CN/Y2025/V59/I2/289

图 1 统计过滤器的结构

算法 1　AIQ插入算法
Input: 流标识为$ f $的数据包
1: Procedure INSERT($ f $,1) 2:　for i=1 to $ d $ 3:　　 $ {r_i} \leftarrow {h_i}(f) $ 4: 　　if L₁[$ {r_i} $]. key =$ f $ then 5:　　　if L₁[$ {r_i} $].$ {v_e} $<$ {\tau _1} - 1 $ then 6: 　　　L₁[$ {r_i} $].$ {v_e} $ $ \leftarrow $ L₁[$ {r_i} $].$ {v_e} $+1 7: 　　else if L₁[$ {r_i} $].$ {v_e} $=$ {\tau _1} - 1 $ then 8：　　　　L₁[$ {r_i} $].$ {v_e} $ $ \leftarrow $ L₁[$ {r_i} $].$ {v_e} $+1 9:　　　　将$ (f,{\tau _1}) $插入L₂ 10:　　　else //此时为插入前计数器溢出 11:　　　　将$ (f,1) $插入L₂ 12:　if $ d $个桶中均无匹配流标识 13:　　将$ f $插入CU 14: function INSERT_L₂($ f $,$ v $) //插入L₂ 15: 　 if $ v $=$ {\tau _1} $ then 16:　　利用$ {r_1} $和遍历的方式在L₂查找空桶 17:　　if L₂中存在空桶 18:　　　 $ {\mathrm{key}} \leftarrow f,{v_e} \leftarrow {\tau _1} $ 19:　　else 20:　　　放弃在L₂中的插入 21:　 else 22:　　利用$ {r_1} $和遍历的方式在L₂查找匹配$ f $的桶 23:　　if L₂中存在匹配$ f $的桶 24:　　　 ${v_e}\leftarrow {v_e}+1 $ 25:　　else 26:　　　放弃在L₂中的插入 27: function INSERT_CU($ f $) //插入CU 28:　 $ {\mathrm{cu}} $$ \leftarrow $min (CU[$ {r_i} $]) //$ i $从1到$ d $, min用于求解最小值 29:　 if $ {\mathrm{cu}} \lt {\tau _0} - 1 $ then 30:　　CU[j] $ \leftarrow $ CU[j]+1 //j指$ d $个计数器中最小的几个计数器 31:　 else if $ {\mathrm{cu}} $=$ {\tau _0} - 1 $ then 32:　　CU[j] $ \leftarrow $ CU[j]+1 33:　　将$ (f,{\tau _0}) $插入L₁ 34:　 else //插入前溢出，因此确定$ f $在L₁的$ {v_c} $部分 35:　将$ (f,1) $再次插入L₁ 36: function INSERT_L₁($ f $,$ v $) //插入L₁ 37:　 if $ v $=$ {\tau _0} $ then 38:　　if L₁[$ {r_i} $]存在空桶k then 39:　　　L₁[$ k $].$ {\text{key}} $$ \leftarrow $$ f $, L₁[$ k $].$ {v_e} $$ \leftarrow $ 1 40:　　else 41:　　　L₁[$ j $].$ {v_c} $$ \leftarrow $ L₁[$ j $].$ {v_c} $+1 //$ j $指$ d $个$ {v_c} $ 中最小的几个$ {v_c} $ 42:　 else 43:　　 $ l \leftarrow $与min (L₁[$ {r_i} $].$ {v_e} $)对应桶索引 44:　　 $ m \leftarrow $与min (L₁[$ {r_i} $].$ {v_c} $)对应桶索引 45:　　L₁[$ m $].$ {v_c} $← L₁[$ m $].$ {v_c} $+1 46:　　 $ {X_{\min }} \leftarrow \min $ (L₁[$ {r_i} $].$ {v_e} $) 47:　　 $ {Y_{\min }} \leftarrow \min $ (L₁[$ {r_i} $].$ {v_c} $) 48:　　if $ {Y_{\min }} \gt \lambda {X_{\min }} $ then 49:　　　桶$ l \leftarrow (f,{Y_{\min }}) $ 50:　　　 $ {f_m} $的值$ {X_{\min }} $插入另外$ d $个$ {v_c} $

图 2 多重过滤草图的整体框架

图 3 Web数据集下$ \lambda $的参数实验

图 4 流频率的测量准确性

表 1 Web数据集下流频率测量的平均相对误差

表 2 偏度数据集下流频率测量的平均相对误差

图 5 大象流的测量准确性

表 3 Web数据集下大象流测量的平均相对误差

图 6 超点的测量准确性

表 4 CAIDA数据集下超点测量的平均相对误差

图 7 统计过滤器的插入吞吐量

图 8 统计过滤器的查询吞吐量

图 9 超点测量中的插入吞吐量

1	WANG R, DU H, SHEN Z, et al DAP-sketch: an accurate and effective network measurement sketch with deterministic admission policy[J]. Computer Networks, 2021, 194: 108155 doi: 10.1016/j.comnet.2021.108155
2	GENG N, XU M, YANG Y, et al. Adaptive and low-cost traffic engineering based on traffic matrix classification [C]// International Conference on Computer Communications and Networks. Honolulu: IEEE, 2020: 1-9.
3	YANG T, JIANG J, LIU P, et al Adaptive measurements using one elastic sketch[J]. IEEE/ACM Transactions on Networking, 2019, 27 (6): 2236- 2251 doi: 10.1109/TNET.2019.2943939
4	YANG K, LI Y, LIU Z, et al. SketchINT: empowering int with towersketch for per-flow per-switch measurement [C] // International Conference on Network Protocols. Dallas: IEEE, 2021: 1-12.
5	DING R, YANG S, CHEN X, et al. Bitsense: universal and nearly zero-error optimization for sketch counters with compressive sensing [C]// Proceedings of the ACM SIGCOMM 2023 Conference . New York: ACM, 2023: 220-238.
6	LU J, CHEN H, SUN P, et al OrderSketch: an unbiased and fast sketch for frequency estimation of data streams[J]. Computer Networks, 2021, 201: 108563 doi: 10.1016/j.comnet.2021.108563
7	ZHENG H, TIAN C, YANG T, et al. Flymon: enabling on-the-fly task reconfiguration for network measurement [C]// Proceedings of the ACM SIGCOMM 2022 Conference. Honolulu: ACM, 2022: 486-502.
8	DAI Y, WANG A, GUO Y, et al Elastically augmenting the control-path throughput in SDN to deal with internet DDoS attacks[J]. ACM Transactions on Internet Technology, 2023, 23 (1): 1- 25
9	JONATH K, LIU Z. Analysis of cyber security rebuts and its rising aims on current technologies [C]// International Conference on Computer Network Security and Software Engineering. Zhuhai: IEEE, 2022: 134-141.
10	SONG G, HE L, ZHAO T, et al. Which doors are open: reinforcement learning-based internet-wide port scanning [C]// International Symposium on Quality of Service. Orlando: IEEE, 2023: 1-10.
11	LIU L, DING T, FENG H, et al Tree sketch: an accurate and memory efficient sketch for network-wide measurement[J]. Computer Communications, 2022, 194: 148- 155 doi: 10.1016/j.comcom.2022.07.009
12	TANG L, XIAO Y, HUANG Q, et al A high-performance invertible sketch for network-wide superspreader detection[J]. IEEE/ACM Transactions on Networking, 2023, 31 (2): 724- 737 doi: 10.1109/TNET.2022.3198738
13	ZHOU Y, JIN H, LIU P, et al. Accurate per-flow measurement with bloom sketch [C] // IEEE Conference on Computer Communications Workshops. Honolulu: IEEE, 2018: 1-2.
14	LI Y, YU X, YANG Y, et al Pyramid family: generic frameworks for accurate and fast flow size measurement[J]. IEEE/ACM Transactions on Networking, 2021, 30 (2): 586- 600
15	CORMODE G, MUTHUKRISHNAN S An improved data stream summary: the count-min sketch and its applications[J]. Journal of Algorithms, 2005, 55 (1): 58- 75 doi: 10.1016/j.jalgor.2003.12.001
16	ESTAN C, VARGHESE G New directions in traffic measurement and accounting: focusing on the elephants, ignoring the mice[J]. ACM Transactions on Computer Systems, 2004, 21 (3): 44
17	CHARIKAR M, CHEN K, FARACHCOLTON M Finding frequent items in data streams[J]. Theoretical Computer Science, 2004, 312 (1): 3- 15 doi: 10.1016/S0304-3975(03)00400-6
18	YANG T, GAO S, SUN Z, et al Diamond sketch: accurate per-flow measurement for big streaming data[J]. IEEE Transactions on Parallel and Distributed Systems, 2019, 30 (12): 2650- 2662 doi: 10.1109/TPDS.2019.2923772
19	LI J, LI Z, XU Y, et al. WavingSketch: an unbiased and generic sketch for finding top-k items in data streams [C] // 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Virtual Event. California: ACM, 2020: 1574-1584.
20	ESTAN C, VARGHESE G, FISK M. Bitmap algorithms for counting active flows on high speed links [C]// Proceedings of the 3rd ACM SIGCOMM Conference on Internet Measurement . San Francisco: ACM, 2003: 153-166.
21	COHEN S, MATIAS Y. Spectral bloom filters [C]// Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data . San Francisco: AC-M, 2003: 241-252.
22	EINZIGER G, FRIEDMAN R. A formal analysis of conservative update based approximate counting [C]// International Conference on Computing, Networking and Communications. Anaheim: IEEE, 2015: 255-259.
23	CLAUDIO L, SALVATORE O, RAFFAELE P, et al. Webdocs [EB/OL]. (2018-07-16)[2023-11-29]. http://fimi.uantwerpen.be/data/.
24	CAIDA. The CAIDA anonymized Internet traces 2018 dataset [EB/OL]. (2018-09-19)[2023-11-29]. https://www.caida.org/catalog/datasets/passive_dataset/.

[1]	张伊璇,龚俭. 基于DNS流量的多层多域名检测与测量[J]. 浙江大学学报(工学版), 2020, 54(12): 2423-2429.
[2]	刘征, 鲁娜, 吴剑锋. 基于设计认知的草图研究综述[J]. J4, 2010, 44(12): 2376-2382.
[3]	黄琦孙守迁. 基于意象认知模型的汽车草图设计技术研究[J]. J4, 2006, 40(4): 553-559.

Viewed

Full text

Abstract

Cited

Shared

Discussed