Please wait a minute...
Journal of Zhejiang University (Science Edition)  2022, Vol. 49 Issue (3): 271-279    DOI: 10.3785/j.issn.1008-9497.2022.03.002
Mathematics and Computer Science     
Representation learning driven multiple graph sampling
Ruiqi YU1(),Yuhua LIU1(),Xilong SHEN2,Ruyu ZHAI1,Xiang ZHANG2,Zhiguang ZHOU1
1.School of Media and Design,Hangzhou Dianzi University,Hangzhou 310018,China
2.School of Information Management and Artificial Intelligence,Zhejiang University of Finance and Economics,Hangzhou 310018,China
Download: HTML( 1 )   PDF(2051KB)
Export: BibTeX | EndNote (RIS)      

Abstract  

The existing graph sampling techniques pay attention mainly to single graph sampling, focusing on how to preserve the specific topological features of a graph by sampling, such as node degree, cluster coef?cient, connectivity. With the improvement of data acquisition capability, multiple graphs, namely a set of nodes exhibits different relationships in different scenarios, have become quietly ubiquitous in the real world. To address this problem, a multiple graph sampling driven by representation learning is proposed. First, a graph representation learning method is designed to fuse the structural features of multiple graphs, through which the nodes are projected into a two-dimensional representation learning space. Then, considering node density and network connectivity, an improved the adaptive blue noise sampling algorithm is employed to select nodes from the representation learning space meanwhile simultaneously preserving the contextual structure features of multiple graphs. Furthermore, an interactive visual analytics system is developed allowing users to explore and analyze multiple graph sampling, and visually compare results with various sampling strategies. Case studies and the experimental results based on two real-world datasets have demonstrated the effectiveness of the proposed method in sampling multiple graphs.



Key wordsmultiple network graphs      graph sampling      visual analysis      evaluation     
Received: 30 January 2022      Published: 24 May 2022
CLC:  TP 391.41  
Corresponding Authors: Yuhua LIU     E-mail: yrq0121@hdu.edu.cn;liuyuhua@hdu.edu.cn
Cite this article:

Ruiqi YU,Yuhua LIU,Xilong SHEN,Ruyu ZHAI,Xiang ZHANG,Zhiguang ZHOU. Representation learning driven multiple graph sampling. Journal of Zhejiang University (Science Edition), 2022, 49(3): 271-279.

URL:

https://www.zjujournals.com/sci/EN/Y2022/V49/I3/271


表征学习驱动的多重网络图采样

已有的图采样方法侧重于单图采样,关注如何在一张图上通过采样保留其特定的拓扑结构特征。随着数据采集能力的提升,多重网络图在实际应用中越来越普遍,即相同的节点集在不同场景中具有不同的网络关系。针对传统图采样方法无法兼顾多重网络图结构特征的问题,提出了表征学习驱动的多重网络图采样算法。首先,设计融合多重网络图结构特征的图表征学习方法,将节点投影至二维的表征学习空间;其次,利用改进的自适应蓝噪声采样算法,考虑节点密度和网络连通性,从表征学习空间筛选节点,以保持其多重网络结构特征及图上下文结构特征。进而开发了一套多重网络图采样可视分析系统,支持用户交互式地探索多重网络图采样,并与已有采样算法进行对比。案例分析和评估实验证明了本文算法在多重网络图采样中的有效性。


关键词: 多重网络图,  图采样,  可视分析,  评估 
Fig.1 Flow chart of the multiple graph sampling
Fig.2 Optimization of the Poisson disks
Fig.3 Traversal across different Poisson disks for the retention of multiple graph connection.
Fig.4 Visual analytics system of multiple network sampling
分类指标说明
节点平均中介度(ABD)4衡量桥节点保留情况,值越大被保留的桥节点越多,表现越好
平均中心性(ACC)4衡量节点中心性保持情况,与原始图结果越一致越好
连接性平均最短路径长度(APL)1采样图和原始图的平均最短路径长度,越一致越好
连通性(CC)4统计连通分量的数量,值越大连通性越差
社区结构社区相似性(SC)1度量采样前后社区结构直方图的分布变化,值越大越好
社区结构稳定性(SSC)1使用互信息度量采样前后社区结构的变化,值越小越好
聚类局部聚类系数(LCC)1采样图和原始图的局部聚类系数,越一致越好
全局聚类系数(GCC)1采样图和原始图的全局聚类系数,越一致越好
Table 1 Evaluation metrics
采样算法ABD↑△ACC↓△APL↓CC↓SC↑SSC↓△LCC↓△GCC↓
Our0.046 9610.010 2212.297 1593.380 4710.777 9011.429 3520.005 1960.001 791
RN0.005 3180.285 9303.272 23035.159 0320.452 8751.517 4110.035 8910.003 874
RE0.037 7840.039 3122.441 6875.512 2650.684 8011.420 5470.019 4820.000 556
TIE0.036 4190.051 7112.493 2305.341 2040.682 7881.407 5310.035 1300.001 108
SRW0.030 4040.057 2992.601 0518.974 0270.626 0091.416 4230.018 5090.001 172
RJ0.007 1050.270 3223.252 55833.216 7030.443 5921.511 9810.035 8910.003 874
ISRW0.032 8460.044 7542.360 1065.980 3260.653 2821.405 5180.035 6560.000 194
Table 2 Evaluation results of different sampling algorithm on MOSS dataset
采样算法ABD↑△ACC↓△APL↓CC↓SC↑SSC↓△LCC↓△GCC↓
Our0.043 3510.083 7694.159 63510.173 9040.808 1341.428 0930.162 4310.022 517
RN0.016 1030.204 6424.346 19118.462 0140.675 3091.470 7930.212 9500.090 168
RE0.027 1930.158 6574.168 92013.559 3120.710 2731.453 1030.163 0880.050 793
TIE0.025 4150.176 5524.183 46714.560 2310.707 7071.454 8000.174 6520.012 149
SRW0.023 1100.163 1374.126 94713.170 0320.725 7361.452 1830.151 3350.026 816
RJ0.014 9130.201 2434.333 99118.131 7960.687 0251.471 1070.214 3030.065 831
ISRW0.025 3110.137 6404.149 39412.129 0120.684 7571.440 6370.095 2680.085 155
Table 3 Evaluation results of different sampling algorithm on CELE dataset
Fig.5 Comparison of histogram distributions for node closeness centrality (MOSS)
Fig.6 Comparison of sampling results of MOSS dataset
Fig.7 Comparison of sampling results of CELE dataset
[1]   ZHOU Z, SHI C, SHEN X, et al. Context-aware sampling of large networks via graph representation learning[J]. IEEE Transactions on Visualization and Computer Graphics, 2020, 27(2): 1709-1719. DOI:10.1109/TVCG.2020.3030440
doi: 10.1109/TVCG.2020.3030440
[2]   CHENG K. Sampling from large graphs with a reservoir[C]// Proceedings of the 2014 17th International Conference on Network-Based Information Systems. Los Alamitos: IEEE Computer Society Press, 2014: 347-354. DOI:10. 1109/NBiS.2014.25
doi: 10. 1109/NBiS.2014.25
[3]   MAGNANI M, ROSSI L. Formation of multiple networks[C]// International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction. Berlin/Heidelberg: Springer, 2013: 257-264. DOI:10.1007/978-3-642-37210-0_28
doi: 10.1007/978-3-642-37210-0_28
[4]   SUN R, ZHANG L, CHEN Z. A sample-based approximate ranking method for large graphs[C]// 2018 6th International Conference on Advanced Cloud and Big Data(CBD). Los Alamitos: IEEE Computer Society Press, 2018: 112-117. DOI:10. 1109/CBD.2018.00029
doi: 10. 1109/CBD.2018.00029
[5]   TÜRKOGLU D, TURK A. Edge-based wedge sampling to estimate triangle counts in very large graphs[C]// 2017 IEEE International Conference on Data Mining(ICDM). Los Alamitos: IEEE Computer Society Press, 2017: 455-464. DOI:10. 1109/ICDM.2017.55
doi: 10. 1109/ICDM.2017.55
[6]   YOON S H, KIM K N, HONG J, et al. A community-based sampling method using DPL for online social networks[J]. Information Sciences, 2015, 306: 53-69. DOI:10.1016/j.ins. 2015.02.014
doi: 10.1016/j.ins. 2015.02.014
[7]   HU J, HONG S H, CHEN J, et al. Connectivity-based spectral sampling for big complex network visualization[C]// International Conference on Complex Networks and Their Applications. Cham: Springer, 2020: 237-248. DOI:10.1007/978-3-030-65347-7_20
doi: 10.1007/978-3-030-65347-7_20
[8]   ZHOU Z, MENG L, TANG C, et al. Visual abstraction of large scale geospatial origin-destination movement data[J]. IEEE Transactions on Visualization and Computer Graphics, 2018, 25(1): 43-53. DOI:10.1109/TVCG.2018.2864503
doi: 10.1109/TVCG.2018.2864503
[9]   GUO D, ZHU X. Origin-destination flow data smoothing and mapping[J]. IEEE Transactions on Visualization and Computer Graphics, 2014, 20(12): 2043-2052. DOI:10.1109/TVCG.2014. 2346271
doi: 10.1109/TVCG.2014. 2346271
[10]   ZHAO Y, JIANG H, CHEN Q, et al. Preserving minority structures in graph sampling[J]. IEEE Transactions on Visualization and Computer Graphics, 2020, 27(2): 1698-1708. DOI:10.1109/TVCG.2020.3030428
doi: 10.1109/TVCG.2020.3030428
[11]   MAIYA A S, BERGER-WOLF T Y. Benefits of bias: Towards better characterization of network sampling[C]// Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM Press, 2011: 105-113. DOI:10.1145/2020408. 2020431
doi: 10.1145/2020408. 2020431
[12]   NGUYEN Q H, HONG S H, EADES P, et al. Proxy graph: Visual quality metrics of big graph sampling[J]. IEEE Transactions on Visualization and Computer Graphics, 2017, 23(6): 1600-1611. DOI:10.1109/TVCG.2017.2674999
doi: 10.1109/TVCG.2017.2674999
[13]   WU Y, CAO N, ARCHAMBAULT D, et al. Evaluation of graph sampling: A visualization perspective[J]. IEEE Transactions on Visualization and Computer Graphics, 2016, 23(1): 401-410. DOI:10.1109/TVCG.2016.2598867
doi: 10.1109/TVCG.2016.2598867
[14]   ZHANG F Y, ZHANG S, WONG P C, et al. A visual evaluation study of graph sampling techniques[J]. Electronic Imaging, 2017, 2017(1): 110-117. DOI:10.2352/ISSN.2470-1173.2017.1.VDA-394
doi: 10.2352/ISSN.2470-1173.2017.1.VDA-394
[15]   ZHANG D, YIN J, ZHU X, et al. Network representation learning: A survey[J]. IEEE Transactions on Big Data, 2020, 6(1): 3-28. DOI:10.1109/TBDATA.2018.2850013 .
doi: 10.1109/TBDATA.2018.2850013
[16]   PEROZZI B, AL-RFOU R, SKIENA S. Deepwalk: online learning of social representations[C]// Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM Press, 2014: 701-710. DOI:10.1145/2623330.2623732
doi: 10.1145/2623330.2623732
[17]   GROVER A, LESKOVEC J. Node2vec: Scalable feature learning for networks[C]// Proceedings of the 22th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM Press, 2016: 855-864. DOI:10.1145/2939672.2939754
doi: 10.1145/2939672.2939754
[18]   WANG Y, CHEN L, JO J, et al. Joint t-SNE for comparable projections of multiple high-dimensional datasets[J]. IEEE Transactions on Visualization and Computer Graphics, 2022, 28(1): 623-632. DOI:10.1109/TVCG.2021.3114765 .
doi: 10.1109/TVCG.2021.3114765
[1] Yufan WANG,Qiang BAI,Hu SUN. Spatial-temporal evolution and driving factors of ecological vulnerability in Fenhe River Basin from 2000 to 2020[J]. Journal of Zhejiang University (Science Edition), 2023, 50(5): 607-618.
[2] Xiliang CHEN,Gang LI,Yue YU,Feng XU,Ying HE,Jiachen YANG. Perception of destination visual image and the ground-sky transition mechanism based on the perspective of unmanned aerial vehicle (UAV) in Tibet[J]. Journal of Zhejiang University (Science Edition), 2023, 50(2): 249-260.
[3] Ling ZHANG,Lifei ZHU. A study on the impacts of purchase restriction policy on Hangzhou housing prices: Based on regression discontinuity analysis[J]. Journal of Zhejiang University (Science Edition), 2022, 49(5): 623-632.
[4] . [J]. Journal of Zhejiang University (Science Edition), 2022, 49(4): 398-407.
[5] Dan XU,Jian ZHU,Yu CHEN,Jie GU,Jianfeng MA,Xiaojun ZHANG. Study on quality characteristics of crab meat products of Ovalipes punctatus by intelligent sensory analysis combined with traditional sensory evaluation[J]. Journal of Zhejiang University (Science Edition), 2022, 49(3): 336-343.
[6] WANG Feng, HUANG Xiaoli, WAN Long. Probabilistic linguistic multi-attribute group decision making method and its application to evaluation of citizens′ sense of gain in new smart city[J]. Journal of Zhejiang University (Science Edition), 2021, 48(5): 557-564.
[7] FENG Cuicui, YE Guanqiong, ZENG Jiangning, LIN Ying, ZHU Jing. Research of performance evaluation of marine ecological civilization construction in Zhejiang province[J]. Journal of Zhejiang University (Science Edition), 2021, 48(5): 584-591.
[8] HE Jianwei, DA Ling, YANG Jianchun. Study on the value evaluation of rural homestay operators′ affective labor[J]. Journal of Zhejiang University (Science Edition), 2021, 48(5): 629-641.
[9] CHEN Jing. The application of fuzzy mathematics method to the evaluation of tourist attractiveness[J]. Journal of Zhejiang University (Science Edition), 2021, 48(1): 118-123.
[10] ZHAO Zhe, ZHANG Tianye, HUANG Yanhao, ZHENG Wenting, CHEN Wei. Simulation-based visual analysis of power grid operation mode[J]. Journal of Zhejiang University (Science Edition), 2020, 47(1): 36-44.
[11] LIU Sheng, LI Wangming, FANG Yuan. A study on the evaluation system of metropolitan village pension value based on niche theory[J]. Journal of Zhejiang University (Science Edition), 2019, 46(4): 503-510.
[12] GE Dandong, XU Wei, GAO Ning. The renewal strategies and mechanisms of ancient town based on subject and object authenticity:Taking Qianyuan town of Zhejiang as an example[J]. Journal of Zhejiang University (Science Edition), 2018, 45(2): 251-260.
[13] ZHANG Faming, WANG Weiming. Multi-stage dynamic interactive group evaluation method based on multi-granularity uncertain linguistic information[J]. Journal of Zhejiang University (Science Edition), 2017, 44(6): 724-734.
[14] DONG Wenli, ZHU Lixiong, LI Wangming. Research on the assessment of historic city conservation planning implementation based on fuzzy Delphi method[J]. Journal of Zhejiang University (Science Edition), 2017, 44(6): 749-756.
[15] CHEN Yang, YUE Wenze, MA Renfeng. Review and prospect of research on coastal land in China[J]. Journal of Zhejiang University (Science Edition), 2017, 44(4): 385-396,428.