Please wait a minute...
浙江大学学报(工学版)  2019, Vol. 53 Issue (11): 2163-2167    DOI: 10.3785/j.issn.1008-973X.2019.11.014
计算机技术与控制工程     
基于非均匀邻居节点采样的聚合式图嵌入方法
陈思(),蔡晓东*(),侯珍珍,李波
桂林电子科技大学 信息与通信学院,广西 桂林 541004
Aggregate graph embedding method based on non-uniform neighbor nodes sampling
Si CHEN(),Xiao-dong CAI*(),Zhen-zhen HOU,Bo LI
School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China
 全文: PDF(562 KB)   HTML
摘要:

针对已有聚合式图嵌入方法多采用均匀采样函数为图中节点构建邻域,即仅随机采样邻居节点,而忽略各邻居节点自身性质的差异的问题,提出基于度值的非均匀邻居节点采样方法. 针对目标节点,优先采样其度值较大的邻居节点;隐藏一批度值较小的邻居节点,使它们在采样过程中不出现;在邻居节点集中随机采样剩余的节点以保留一定的采样随机性,这些随机采样的节点与优先采样的节点组成目标节点的邻域. 将所提出的非均匀邻居节点采样方法应用于图嵌入过程,在Reddit数据集上的图嵌入分类F1分数为91.7%,该结果优于几个知名的图嵌入方法的结果. 在重叠社团数据集PPI上的实验证实提出方法能够为图数据生成更高质量的嵌入.

关键词: 图嵌入网络嵌入非均匀采样图卷积网络邻域聚合    
Abstract:

Aiming at the problem that the uniform sampling function is widely used in aggregate graph embedding methods to construct neighborhoods for nodes in a graph, i.e. neighbor nodes are sampled randomly, and the differences of their properties are neglected, a non-uniform neighbor nodes sampling method was proposed. The neighbor nodes of the target node with larger degrees are sampled preferentially. Some neighbor nodes with lower degrees are hidden so that they do not appear during the sampling process. The remaining nodes in the neighbor node set are randomly sampled to preserve sampling randomness, then these randomly sampled nodes and the preferentially sampled nodes form the neighborhood of the target node. The proposed non-uniform neighbor nodes sampling method was applied to the graph embedding process. Experimental results showed that the proposed method can be used to improve the classification F1 score of graph embedding to 91.7% on the Reddit dataset, and the results was superior than that of several known graph embedding methods. Experiments on the overlapping community dataset PPI also confirmed that the proposed method can be used to generate embedding with higher quality for graph data.

Key words: graph embedding    network embedding    non-uniform sampling    graph convolutional network    neighborhood aggregation
收稿日期: 2018-09-19 出版日期: 2019-11-21
CLC:  TP 183  
基金资助: 新疆自治区重点研发计划资助项目(2018B03022-1,2018B03022-2)
通讯作者: 蔡晓东     E-mail: 646299680@qq.com;caixiaodong@guet.edu.cn
作者简介: 陈思(1995—),女,硕士生,从事数据挖掘研究. orcid.org/0000-0002-4323-9519. E-mail: 646299680@qq.com
服务  
把本文推荐给朋友
加入引用管理器
E-mail Alert
作者相关文章  
陈思
蔡晓东
侯珍珍
李波

引用本文:

陈思,蔡晓东,侯珍珍,李波. 基于非均匀邻居节点采样的聚合式图嵌入方法[J]. 浙江大学学报(工学版), 2019, 53(11): 2163-2167.

Si CHEN,Xiao-dong CAI,Zhen-zhen HOU,Bo LI. Aggregate graph embedding method based on non-uniform neighbor nodes sampling. Journal of ZheJiang University (Engineering Science), 2019, 53(11): 2163-2167.

链接本文:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2019.11.014        http://www.zjujournals.com/eng/CN/Y2019/V53/I11/2163

图 1  邻居节点采样及邻域聚合
图 2  图嵌入流程框图
方法 F1分数 方法 F1分数
Random 0.043 GraphSAGE 0.897
DeepWalk 0.324 NonUniform 0.917
RawFeatures 0.585
表 1  Reddit数据集的F1分数实验结果
方法 F1分数 方法 F1分数
Random 0.396 RawFeatures 0.422
GraphSAGE 0.447 NonUniform 0.458
表 2  PPI数据集的F1分数实验结果
数据集 F1分数
NonUniform NonUniform-1
Reddit 0.917 0.911
PPI 0.458 0.453
表 3  NonUniform与NonUniform-1方法的F1分数差别
1 HAMILTON W L, YING R, LESKOVEC J Representation learning on graphs: methods and applications[J]. IEEE Data Engineering Bulletin, 2017, 40 (3): 52- 74
2 GOYAL P, FERRARA E Graph embedding techniques, applications, and performance: a survey[J]. Knowledge Based Systems, 2017, 151: 78- 94
3 BELKIN M, NIYOGI P. Laplacian eigenmaps and spectral techniques for embedding and clustering [C] // Advances in Neural Information Processing Systems 14. Vancouver: NIPS, 2001: 585-591.
4 ROWEIS S T, SAUL L K Nonlinear dimensionality reduction by locally linear embedding[J]. Science, 2000, 290 (5500): 2323- 2326
doi: 10.1126/science.290.5500.2323
5 CAO S, LU W, XU Q. GraRep: learning graph representations with global structural information [C]// ACM International Conference on Information and Knowledge Management. Melbourne: ACM, 2015: 891-900.
6 PEROZZI B, AL-RFOU R, SKIENA S. DeepWalk: online learning of social representations [C]// ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2014: 701-710.
7 GROVER A, LESKOVEC J. Node2vec: scalable feature learning for networks [C]// ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco: ACM, 2016: 855-864.
8 KIPFT N, WELLING M. Semi-supervised classification with graph convolutional networks [C]// International Conference on Learning Representations. San Juan: ICLR, 2016.
9 DEFFERRARD M, BRESSON X, VANDERGHEYNST P. Convolutional neural networks on graphs with fast localized spectral filtering [C]// Advances in Neural Information Processing Systems 29. Barcelona: NIPS, 2016: 3837-3845.
10 WANG D, CUI P, ZHU W. Structural deep network embedding [C]// ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco: ACM, 2016: 1225-1234.
11 HAMILTON W L, YING R, LESKOVEC J. Inductive representation learning on large graphs [C]// Advances in Neural Information Processing Systems 30. Long Beach: NIPS, 2017.
[1] 陈世达,刘强,韩亮. 降低分布式训练通信的梯度稀疏压缩方法[J]. 浙江大学学报(工学版), 2021, 55(2): 386-394.
[2] 刘创,梁军. 基于注意力机制的车辆运动轨迹预测[J]. 浙江大学学报(工学版), 2020, 54(6): 1156-1163.
[3] 杨萍,王丹,康子健,李童,付利华,余悦任. 基于模式识别和集成CNN-LSTM的阵发性房颤预测模型[J]. 浙江大学学报(工学版), 2020, 54(5): 1039-1048.
[4] 李浩,赵文杰,韩波. 基于滤波器裁剪的卷积神经网络加速算法[J]. 浙江大学学报(工学版), 2019, 53(10): 1994-2002.
[5] 崔黎黎, 张化光,罗艳红. 控制方向未知的非线性系统的自适应评价设计[J]. J4, 2012, 46(5): 853-857.
[6] 朱北斗,龚国芳,周如林,刘国斌. 基于盾构掘进参数的BP神经网络地层识别[J]. J4, 2011, 45(5): 851-857.
[7] 马玉良, 徐文良, 孟明, 罗志增, 杨家强. 基于神经网络的智能下肢假肢自适应控制[J]. J4, 2010, 44(7): 1373-1376.
[8] 赵胜颖, 高广春. 基于蚁群算法的选择性神经网络集成方法[J]. J4, 2009, 43(09): 1568-1573.
[9] 张建海, 张森林, 刘妹琴. 离散时滞标准神经网络模型的鲁棒稳定性分析[J]. J4, 2009, 43(8): 1383-1388.