高阶互信息最大化与伪标签指导的深度聚类

doi:10.3785/j.issn.1008-973X.2023.02.010

浙江大学学报(工学版)

2023, Vol. 57

Issue (2): 299-309 DOI: 10.3785/j.issn.1008-973X.2023.02.010

计算机技术

高阶互信息最大化与伪标签指导的深度聚类

刘超1(

),孔兵1,*(

),杜国王2,周丽华1,陈红梅1,包崇明3

1. 云南大学信息学院，云南昆明 650504
2. 云南大学西南天文研究所，云南昆明 650504
3. 云南大学软件学院，云南昆明 650504

Deep clustering via high-order mutual information maximization and pseudo-label guidance

Chao LIU1(

),Bing KONG1,*(

),Guo-wang DU2,Li-hua ZHOU1,Hong-mei CHEN1,Chong-ming BAO3

1. School of Information Science and Engineering, Yunnan University, Kunming 650504, China
2. South-Western Institute For Astronomy Research, Yunnan University, Kunming 650504, China
3. School of Software, Yunnan University, Kunming 650504, China

全文: PDF(2168 KB) HTML

摘要：

针对现有聚类方法未充分探索图的拓扑结构和节点关系，且无法受益于模型预测的不精确标签的问题，提出一种高阶互信息最大化与伪标签指导的深度聚类模型HMIPDC. 该模型采用高阶互信息最大化策略来最大化图的全局表示、节点表示、节点属性信息之间的互信息. 通过一种结合多跳邻近矩阵的自注意力机制更加合理地提取节点的低维表征. 使用基于深度散度的聚类损失函数（DDC）迭代优化聚类目标，抽取高置信度的预测标签对低维表征的学习进行监督. 在4个基准数据集上的聚类任务、实验时间分析和聚类可视化分析充分表明，HMIPDC的聚类性能始终优于大多数的深度聚类方法. 通过消融研究和参数敏感性分析验证了该模型的有效性和稳定性.

关键词： 自监督学习; 深度聚类; 自注意力机制; 高阶互信息; 伪标签

Abstract:

A high-order mutual information maximization and pseudo-label guided deep clustering model, HMIPDC, was proposed to solve the problem that the existing clustering methods cannot fully explore the topological structure and node relationships of the graph, and cannot benefit from inaccurate labels predicted by the model. The high-order mutual information maximization strategy was adopted to maximize the mutual information among the global representation of the graph, node representation, and node attribute information. Low-dimensional representations of nodes were extracted more reasonably through a self-attention mechanism combined with multi-hop proximity matrices. A deep divergence-based clustering loss function (DDC) was used to iteratively optimize the clustering objective while high confidence predicted labels were utilized to supervise the learning of low-dimensional representations. Experimental results of clustering tasks, experimental time analysis and clustering visualization analysis on four benchmark datasets show that the clustering performance of HMIPDC is always better than that of most deep clustering methods. The effectiveness and the stability of the model were also verified by ablation study and parameter sensitivity analysis.

Key words: self-supervised learning deep clustering self-attention mechanism high-order mutual information pseudo-label

收稿日期: 2022-07-30 出版日期: 2023-02-28

CLC:

TP 391

基金资助: 国家自然科学基金资助项目(62062066, 61762090, 31760152, 61966036, 62266050, 62276227); 2022年云南省基础研究计划重点项目(202201AS070015); 云南省中青年学术和技术带头人后备人才项目(202205AC160033)

通讯作者: 孔兵 E-mail: chaoliu@mail.ynu.edu.cn;kongbing@ynu.edu.cn

作者简介: 刘超（1996 —），男，硕士生，从事数据挖掘、深度聚类研究. orcid.org/0000-0001-5083-6744. E-mail: chaoliu@mail.ynu.edu.cn

	服务
	把本文推荐给朋友
	加入引用管理器
	E-mail Alert
	作者相关文章
	刘超
	孔兵
	杜国王
	周丽华
	陈红梅
	包崇明

引用本文:

刘超,孔兵,杜国王,周丽华,陈红梅,包崇明. 高阶互信息最大化与伪标签指导的深度聚类[J]. 浙江大学学报(工学版), 2023, 57(2): 299-309.

Chao LIU,Bing KONG,Guo-wang DU,Li-hua ZHOU,Hong-mei CHEN,Chong-ming BAO. Deep clustering via high-order mutual information maximization and pseudo-label guidance. Journal of ZheJiang University (Engineering Science), 2023, 57(2): 299-309.

链接本文:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2023.02.010 或 https://www.zjujournals.com/eng/CN/Y2023/V57/I2/299

图 1 HMIPDC 模型示意图

表 1 4个基准数据集的统计信息

%
数据集	方法	ACC	NMI	ARI	F1
ACM	K-means	67.31±0.71	32.44±0.46	30.60±0.69	67.57±0.74
	AE	81.83±0.08	49.30±0.16	59.64±0.16	82.01±0.08
	IDEC	85.12±0.52	56.61±1.16	62.16±1.50	85.11±0.48
	GAE	84.52±1.44	55.38±1.92	59.46±3.10	84.65±1.33
	DAEGC	88.24±0.02	63.01±0.04	68.70±0.04	88.11±0.01
	SDCN	90.45±0.18	68.31±0.25	73.91±0.40	90.42±0.19
	DFCN	90.90±0.20	64.40±0.40	74.90±0.40	90.80±0.20
	DCRN	91.93±0.20	71.56±0.61	77.56±0.52	91.94±0.20
	HMIPDC	92.12±0.15	72.18±0.52	78.04±0.39	92.13±0.14
Citeseer	K-means	39.32±3.17	16.94±3.22	13.43±3.02	36.08±3.53
	AE	57.08±0.13	27.64±0.08	29.31±0.14	53.80±0.11
	IDEC	60.49±1.42	27.17±2.40	25.70±2.65	61.62±1.39
	GAE	61.35±0.80	34.63±0.65	33.55±1.18	57.36±0.82
	DAEGC	64.90±0.07	38.71±0.08	39.21±0.09	59.56±0.06
	SDCN	65.96±0.31	38.71±0.32	40.17±0.43	63.62±0.24
	DFCN	69.50±0.20	43.90±0.20	45.50±0.30	64.30±0.20
	DCRN	70.86±0.18	45.86±0.35	47.64±0.30	65.83±0.21
	HMIPDC	71.93±0.31	46.07±0.23	48.28±0.50	66.96±0.26
DBLP	K-means	38.65±0.65	11.45±0.38	6.97±0.39	31.92±0.27
	AE	51.43±0.35	25.40±0.16	12.21±0.43	52.53±0.36
	IDEC	60.31±0.62	31.17±0.50	25.37±0.60	61.33±0.56
	GAE	61.21±1.22	30.80±0.91	22.02±1.40	61.41±2.23
	DAEGC	67.42±0.38	30.64±0.46	32.79±0.58	66.89±0.37
	SDCN	68.05±1.81	39.50±1.34	39.15±2.01	67.71±1.51
	DFCN	76.00±0.80	43.70±1.00	47.00±1.50	75.70±0.80
	DCRN	79.66±0.25	48.95±0.44	53.60±0.46	79.28±0.26
	HMIPDC	80.34±0.16	49.41±0.34	55.39±0.28	79.76±0.32
AMAP	K-means	27.22±0.76	13.23±1.33	5.50±0.44	23.96±0.51
	AE	48.25±0.08	38.76±0.30	20.80±0.47	47.87±0.20
	IDEC	47.62±0.08	37.83±0.08	19.24±0.07	47.20±0.11
	GAE	71.57±2.48	62.13±2.79	48.82±4.57	68.08±1.76
	DAEGC	75.52±0.01	63.31±0.01	59.98±0.84	70.02±0.01
	SDCN	53.44±0.81	44.85±0.83	31.21±1.23	50.66±1.49
	DFCN	76.88±0.80	69.21±1.00	59.98±0.84	71.58±0.31
	DCRN	79.94±0.13	73.70±0.24	63.69±0.20	73.82±0.12
	HMIPDC	80.83±0.78	69.47±0.46	65.23±1.64	75.87±2.08

表 2 HMIPDC和8种基线方法在4个数据集上的聚类结果

表 3 HMIPDC和3种变种算法在4个数据集上的聚类结果

图 2 不同超参数下4个数据集上的聚类准确率

图 3 不同训练时间下5种方法在ACM数据集上的聚类准确率

图 4 不同方法的低维表征在ACM数据集上的2维可视化结果

1	ZHAN Z H, LI J Y, ZHANG J Evolutionary deep learning: a survey[J]. Neurocomputing, 2022, 483: 42- 58 doi: 10.1016/j.neucom.2022.01.099
2	LIN Y J, GOU Y B, LIU Z T, et al. COMPLETER: incomplete multi-view clustering via contrastive prediction[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. [s. l. ]: CVPR, 2021: 11174-11183.
3	XIE J Y, GIRSHICK R, FARHADI A. Unsupervised deep embedding for clustering analysis[C]// Proceedings of the 33rd International Conference on Machine Learning. New York: ICML, 2016: 478-487.
4	GUO X F, GAO L, LIU X W, et al. Improved deep embedded clustering with local structure preservation[C]// Proceedings of the 26th International Joint Conference on Artificial Intelligence. Melbourne: IJCAI, 2017: 1753-1759.
5	ZHANG S S, LIU J W, ZUO X, et al Online deep learning based on auto-encoder[J]. Applied Intelligence, 2021, 51 (8): 5420- 5439 doi: 10.1007/s10489-020-02058-8
6	WU Z H, PAN S R, CHEN F W, et al A comprehensive survey on graph neural networks[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 32 (1): 4- 24 doi: 10.1109/TNNLS.2020.2978386
7	KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks[C]// 5th International Conference on Learning Representations. Toulon: ICLR, 2017: 1-14.
8	WANG C, PAN S R, HU R Q, et al. Attributed graph clustering: a deep attentional embedding approach[C]// Proceedings of the 28th International Joint Conference on Artificial Intelligence. Macao: IJCAI, 2019: 3670-3676.
9	BO D Y, WANG X, SHI C, et al. Structural deep clustering network[C]// Proceedings of the Web Conference 2020. Taipei: WWW, 2020: 1400-1410.
10	杜国王, 周丽华, 王丽珍, et al 基于两级权重的多视角聚类[J]. 计算机研究与发展, 2022, 59 (4): 907- 921 DU Guo-wang, ZHOU Li-hua, WANG Li-zhen, et al Multi-view clustering based on two-level weights[J]. Journal of Computer Research and Development, 2022, 59 (4): 907- 921 doi: 10.7544/issn1000-1239.20200897
11	KAMPFFMEYER M, LØKSE S, BIANCHI F M, et al Deep divergence-based approach to clustering[J]. Neural Networks, 2019, 113: 91- 101 doi: 10.1016/j.neunet.2019.01.015
12	MOLAEI S, BOUSEJIN N G, ZARE H, et al Deep node clustering based on mutual information maximization[J]. Neurocomputing, 2021, 455: 274- 282 doi: 10.1016/j.neucom.2021.03.020
13	陈亦琦, 钱铁云, 李万理, et al 基于复合关系图卷积的属性网络嵌入方法[J]. 计算机研究与发展, 2020, 57 (8): 1674- 1682 CHEN Yi-qi, QIAN Tie-yun, LI Wan-li, et al Exploiting composite relation graph convolution for attributed network embedding[J]. Journal of Computer Research and Development, 2020, 57 (8): 1674- 1682 doi: 10.7544/issn1000-1239.2020.20200206
14	KIPF T N, WELLING M. Variational graph auto-encoders[C]// Bayesian Deep Learning Workshop on 30th Conference on Neural Information Processing Systems. Barcelona: NIPS, 2016.
15	KOU S W, XIA W, ZHANG X D, et al Self-supervised graph convolutional clustering by preserving latent distribution[J]. Neurocomputing, 2021, 437: 218- 226 doi: 10.1016/j.neucom.2021.01.082
16	TU W X, ZHOU S H, LIU X W, et al. Deep fusion clustering network[C]// 35th AAAI Conference on Artificial Intelligence. [s.l.]: AAAI, 2021: 9978-9987.
17	LIU Y, TU W X, ZHOU S H, et al. Deep graph clustering via dual correlation reduction[C]// 36th Conference on Artificial Intelligence. Vancouver: AAAI, 2022: 7603-7611.
18	BELGHAZI M I, BARATIN A, RAJESWAR S, et al. MINE: mutual information neural estimation[C]// Proceedings of the 35th International Conference on Machine Learning. Stockholm: ICML, 2018: 531-540.
19	HJELM R D, FEDOROV A, LAVOIE-MARCHILDON S, et al. Learning deep representations by mutual information estimation and maximization[C]// 7th International Conference on Learning Representations. New Orleans: ICLR, 2019.
20	VELIČKOVIĆ P, FEDUS W, HAMILTON W L, et al. Deep graph infomax[C]// 7th International Conference on Learning Representations. New Orleans: ICLR, 2019.
21	JING B Y, PARK C Y, TONG H H. HDMI: high-order deep multiplex infomax[C]// Proceedings of the Web Conference 2021. New York: WWW, 2021: 2414-2424.
22	MCGILL W J Multivariate information transmission[J]. Transactions of the IRE Professional Group on Information Theory, 1954, 4 (4): 93- 111 doi: 10.1109/TIT.1954.1057469
23	VELIČKOVIĆ P, CUCURULL G, CASANOVA A, et al. Graph attention networks[C]// 5th International Conference on Learning Representations. Toulon: ICLR, 2017.
24	RIZVE M N, DUARTE K, RAWAT Y S, et al. In defense of pseudo-labeling: an uncertainty-aware pseudo-label selection framework for semi-supervised learning[C]// 9th International Conference on Learning Representations. [s. l. ]: ICLR, 2021.
25	HARTIGAN J A, WONG M A A K-means clustering algorithm[J]. Journal of the Royal Statistical Society Series C Applied Statistics, 1979, 28 (1): 100- 108
26	ZHAO H, YANG X, WANG Z R, et al. Graph debiased contrastive learning with joint representation clustering[C]// Proceedings of the 30th International Joint Conference on Artificial Intelligence. [s.l.]: IJCAI, 2021: 3434-3440.
27	LV J C, KANG Z, LU X, et al Pseudo-supervised deep subspace clustering[J]. IEEE Transactions on Image Processing, 2021, 30: 5252- 5263 doi: 10.1109/TIP.2021.3079800
28	BOUYER A, ROGHANI H LSMD: a fast and robust local community detection starting from low degree nodes in social networks[J]. Future Generation Computer Systems, 2020, 113: 41- 57 doi: 10.1016/j.future.2020.07.011
29	KINGMA D P, BA J. Adam: a method for stochastic optimization[C]// 3rd International Conference for Learning Representations. San Diego: ICLR, 2015.
30	MAATENL V D, HINTON G Visualizing data using t-SNE[J]. Journal of Machine Learning Research, 2008, 9: 2579- 2605
31	周丽华, 王家龙, 王丽珍, 等异质信息网络表征学习综述[J]. 计算机学报, 2022, 45 (1): 160- 189 ZHOU Li-hua, WANG Jia-long, WANG Li-zhen, et al Heterogeneous information network representation learning: a survey[J]. Chinese Journal of Computers, 2022, 45 (1): 160- 189 doi: 10.11897/SP.J.1016.2022.00160

[1]	周天琪,杨艳,张继杰,殷少伟,郭增强. 基于无负样本损失和自适应增强的图对比学习[J]. 浙江大学学报(工学版), 2023, 57(2): 259-266.
[2]	鞠晓臣,赵欣欣,钱胜胜. 基于自注意力机制的桥梁螺栓检测算法[J]. 浙江大学学报(工学版), 2022, 56(5): 901-908.
[3]	刘英莉,吴瑞刚,么长慧,沈韬. 铝硅合金实体关系抽取数据集的构建方法[J]. 浙江大学学报(工学版), 2022, 56(2): 245-253.
[4]	于楠晶,范晓飚,邓天民,冒国韬. 基于多头自注意力的复杂背景船舶检测算法[J]. 浙江大学学报(工学版), 2022, 56(12): 2392-2402.

Viewed

Full text

Abstract

Cited

Shared

Discussed