Please wait a minute...
浙江大学学报(工学版)  2019, Vol. 53 Issue (3): 541-547    DOI: 10.3785/j.issn.1008-973X.2019.03.015
计算机技术     
基于重要性抽样的图卷积社团发现方法
蔡晓东(),王萌,梁晓曦,陈昀
桂林电子科技大学 信息与通信学院,广西 桂林 541004
Community detection method based on graph convolutional network via importance sampling
Xiao-dong CAI(),Meng WANG,Xiao-xi LIANG,Yun CHEN
School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China
 全文: PDF(793 KB)   HTML
摘要:

根据图论将复杂网络转化为图结构数据,使卷积神经网络能够高效方便地处理;通过将图像上的卷积操作延伸到图结构数据上来定义卷积核,并通过卷积层对图的粗粒化和池化操作,提取不规则数据复杂网络的特征. 在采用随机梯度下降法训练网络时,设计一种重要性抽样方法改变样本的分布来缩减方差,从而节省梯度计算时间. 实验结果表明,与现有的图卷积网络相比,该方法在社会网络、引文网络、知识图谱数据集中,均能够以较低的计算复杂度获得较好的社团发现准确率;而且能够减少计算时的内存占用,可扩展到更大规模的复杂网络中使用.

关键词: 图卷积粗粒化社团发现重要性抽样随机梯度下降    
Abstract:

The complex networks were transformed into graph structure data according to the graph theory, so that the convolutional neural networks could be processed efficiently and conveniently. The future of complex networks was extracted by coarsening and pooling for the convolutional layer. Kernels were defined by extending the convolutional operation to images on graph structure data when training the network with stochastic gradient descent. In addition, importance sampling method was designed to change distribution of samples to reduce variance when networks were trained. The experimental results show that, compared with the existing graph convolutional networks, the proposed techniques can obtain better accuracy with lower computational complexity for community detection on the data sets of social networks, citation networks and knowledge graphs. The memory occupation for calculation is greatly reduced and the framework also can be applied to larger complex networks.

Key words: graph convolution    coarse graining    community detection    importance sampling    stochastic gradient descent
收稿日期: 2018-04-13 出版日期: 2019-03-04
CLC:  TP 39  
作者简介: 蔡晓东(1971—),男,教授,从事图像处理研究. orcid.org/0000-0001-8505-1007. E-mail: caixiaodong@guet.edu.cn
服务  
把本文推荐给朋友
加入引用管理器
E-mail Alert
作者相关文章  
蔡晓东
王萌
梁晓曦
陈昀

引用本文:

蔡晓东,王萌,梁晓曦,陈昀. 基于重要性抽样的图卷积社团发现方法[J]. 浙江大学学报(工学版), 2019, 53(3): 541-547.

Xiao-dong CAI,Meng WANG,Xiao-xi LIANG,Yun CHEN. Community detection method based on graph convolutional network via importance sampling. Journal of ZheJiang University (Engineering Science), 2019, 53(3): 541-547.

链接本文:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2019.03.015        http://www.zjujournals.com/eng/CN/Y2019/V53/I3/541

图 1  图的粗粒化和池化
图 2  图卷积网络社团发现方法基本框架
数据集 节点 特征 社团
Karate 34 78 0 4
Footballs 115 616 0 12
Cora 2 708 5 429 1 433 7
Citeseer 3 327 4 732 3 703 6
Pubmed 19 717 443 38 500 3
NELL 65 755 266 144 5 414 210
Reddit 232 965 11 606 919 602 41
表 1  本研究采用的数据集统计信息
数据集 训练集 验证集 测试集
Karate 8 8 18
Footballs 65 20 30
Cora 140 500 1 000
Citeseer 120 500 1 000
Pubmed 60 500 1 000
NELL 657 500 1 000
Reddit 152 410 23 699 55 334
表 2  实验训练集、验证集和测试集节点数设置
图 3  不同网络深度社团检测的准确性
%
数据集 文献[14] 文献[15] 本文方法
Karate ? ? 88.9
Footballs ? ? 93.3
Cora 75.7 81.5 81.7
Citeseer 64.7 70.3 73.1
Pubmed 77.2 79.0 82.3
NELL 61.9 66.0 73.6
Reddit ? ? 93.0
表 3  相同网络层数下采用不同方法时不同数据集的准确性对比
NE NT Racc/%
5 35 61.2
10 70 73.2
20 140 81.7
40 280 84.0
60 420 83.7
80 560 84.6
100 700 84.8
表 4  训练样本(节点数)对测试结果准确性的影响
数据集 文献[14] 文献[15] 本文方法
Karate ? ? 0.08
Footballs ? ? 0.06
Cora 13 4 3
Citeseer 26 7 5
Pubmed 25 38 40
NELL 185 48 32
Reddit ? ? 68
表 5  采用不同方法时不同数据集的计算耗时对比
图 4  不同网络数据集训练时的内存占用空间
1 SUN Y, WANG X, TANG X. Deeply learned face representations are sparse, selective, and robust [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 2892–2900.
2 XUE S, YAN Z. Improving latency-controlled BLSTM acoustic models for online speech recognition [C] // Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. New Orleans: IEEE, 2017: 5340–5344.
3 奚雪峰, 周国栋 面向自然语言处理的深度学习研究[J]. 自动化学报, 2016, 42 (10): 1445- 1465
XI Xue-Feng, ZHOU Guo-Dong A survey on deep learning for natural language processing[J]. Acta Automatica Sinica, 2016, 42 (10): 1445- 1465
4 FEDERICO M, DAVIDE B, JONATHAN M, et al. Geometric deep learning on graphs and manifolds using mixture model CNNs [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 5425–5434.
5 MICHAEL M. B, JOAN B, YANN L, et al. Geometric deep learning: going beyond euclidean data[J]. IEEE Signal Processing Magazine, 2017, 34 (4): 18- 42
doi: 10.1109/MSP.2017.2693418
6 彭静, 廖乐健, 翟英, 等 谱聚类在社团发现中的应用[J]. 北京理工大学学报, 2016, 36 (7): 701- 705
PENG Jing, LIAO Le-jian, ZHAI Ying, et al Spectral clustering for community detection[J]. Transactions of Beijing Institute of Technology, 2016, 36 (7): 701- 705
7 SHIGA M, TAKIGAWA I, MAMITSUKA H A spectral approach to clustering numerical vectors as nodes in a networks[J]. Pattern Recognition, 2011, 44 (2): 236- 251
doi: 10.1016/j.patcog.2010.08.010
8 RAGHAVAN U N, ALBERT R, KUMARA S Near linear-time algorithm to detect community structures in large-scale networks[J]. Physical Review E, 2007, 76 (3): 036106
doi: 10.1103/PhysRevE.76.036106
9 WANG M, CAI X, ZENG Y, et al. A Community detection algorithm based on jaccard similarity label propagation [C] // Intelligent Data Engineering and Automated Learning. Guilin: IDEAL, 2017: 45–52.
10 刘大有, 金弟, 何东晓, 等 复杂网络社区挖掘综述[J]. 计算机研究与发展, 2013, 50 (10): 2140- 2154
LIU Da-you, JIN Di, HE Dong-xiao, et al Community ming in complex networks[J]. Journal of Computer Research and Development, 2013, 50 (10): 2140- 2154
doi: 10.7544/issn1000-1239.2013.20120357
11 DUCH J, ARENAS A Community detection in complex networks using extremal optimization[J]. Physical Review E, 2005, 72 (2): 027104
doi: 10.1103/PhysRevE.72.027104
12 JOAN B, LI X. Community Detection with graph neural networks. [27 May 2017]. arXiv: 1705.08415v2 [stat.ML].
13 DUVENAUD D K, MACLAURIN D, IPARRAGUIRRE J , et al. Convolutional networks on graphs for learning molecular fingerprints [C] // Advances in Neural Information Processing Systems. Montreal: NIPS, 2015: 2224–2232.
14 YANG Z, WILLIAM C, RUSLAN S. Revisiting semi-supervised learning with graph embeddings [C] // Proceedings of the 33nd International Conference on Machine Learning. New York: ICML, 2016: 40–48.
15 KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks [C] // 5th International Conference on Learning Representations. Toulon: ICLR, 2016. arXiv: 1609.02907.
16 LI R, WANG S, ZHU F, et al. Adaptive graph convolutional neural networks [C] // The Thirty-Second AAAI Conference on Artificial Intelligence. Louisiana: AAAI, 2018. arXiv: 1801.03226.
17 DHILLON I S, GUAN Y, KULIS B Weighted graph cuts without eigenvectors a multilevel approach[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29 (11): 1944- 57
doi: 10.1109/TPAMI.2007.1115
18 Mark Newman network data [DS]. [2017-12-05]. http://www-personal.umich.edu//~mejn/netdata.
19 HAMILTON W L, YING R, LESKOVEC J. Inductive Representation Learning on Large Graphs [C] // The Thirty-first Annual Conference on Neural Information Processing Systems. Long Beach: NIPS, 2017: 1025–1035.
20 KINGMA D P, BA J. Adam: a method for stochastic optimization [C] // In International Conference on Learning Representations. San Diego: ICLR, 2015.
[1] 闫旭,范晓亮,郑传潘,臧彧,王程,程明,陈龙彪. 基于图卷积神经网络的城市交通态势预测算法[J]. 浙江大学学报(工学版), 2020, 54(6): 1147-1155.
[2] 陈思,蔡晓东,侯珍珍,李波. 基于非均匀邻居节点采样的聚合式图嵌入方法[J]. 浙江大学学报(工学版), 2019, 53(11): 2163-2167.