Journal of ZheJiang University (Engineering Science)  2023, Vol. 57 Issue (2): 299-309    DOI: 10.3785/j.issn.1008-973X.2023.02.010
    
Deep clustering via high-order mutual information maximization and pseudo-label guidance
Chao LIU1, Bing KONG1,*, Guo-wang DU2, Li-hua ZHOU1, Hong-mei CHEN1, Chong-ming BAO3
1. School of Information Science and Engineering, Yunnan University, Kunming 650504, China
2. South-Western Institute For Astronomy Research, Yunnan University, Kunming 650504, China
3. School of Software, Yunnan University, Kunming 650504, China

Abstract  

A deep clustering model guided by high-order mutual information maximization and pseudo-labels, HMIPDC, was proposed to address two limitations of existing clustering methods: they fail to fully explore the topological structure and node relationships of the graph, and they cannot benefit from the imprecise labels predicted by the model itself. A high-order mutual information maximization strategy was adopted to maximize the mutual information among the global representation of the graph, the node representations, and the node attributes. Low-dimensional node representations were extracted more effectively by a self-attention mechanism combined with multi-hop proximity matrices. A deep divergence-based clustering (DDC) loss was used to iteratively optimize the clustering objective, while high-confidence predicted labels supervised the learning of the low-dimensional representations. Clustering experiments, training-time analysis and clustering visualization on four benchmark datasets show that the clustering performance of HMIPDC is consistently better than that of most deep clustering methods. The effectiveness and stability of the model were further verified by an ablation study and parameter sensitivity analysis.
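To make the two auxiliary training signals described above concrete, the following is a minimal PyTorch sketch, not the authors' implementation. The mutual-information term is a DGI-style pairwise bound [20] between a pooled global summary and the node representations, standing in for the paper's high-order objective over the global representation, node representations and node attributes; the confidence threshold of 0.9 and all function and class names are illustrative assumptions.

```python
# Sketch of two of HMIPDC's training signals; NOT the authors' code.
# The MI term is a DGI-style pairwise lower bound, a simplification of
# the paper's high-order objective; 0.9 is an assumed threshold.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BilinearDiscriminator(nn.Module):
    """Scores (global summary, node representation) pairs."""
    def __init__(self, dim: int):
        super().__init__()
        self.W = nn.Parameter(torch.empty(dim, dim))
        nn.init.xavier_uniform_(self.W)

    def forward(self, summary: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        # summary: (dim,), h: (n, dim) -> per-node logits of shape (n,)
        return h @ (self.W @ summary)

def mi_loss(disc: BilinearDiscriminator,
            h_pos: torch.Tensor, h_neg: torch.Tensor) -> torch.Tensor:
    """BCE lower bound on mutual information: true (summary, node) pairs
    are positives; pairs built from corrupted inputs are negatives."""
    summary = torch.sigmoid(h_pos.mean(dim=0))   # mean-pooled graph summary
    logits = torch.cat([disc(summary, h_pos), disc(summary, h_neg)])
    labels = torch.cat([h_pos.new_ones(len(h_pos)),
                        h_neg.new_zeros(len(h_neg))])
    return F.binary_cross_entropy_with_logits(logits, labels)

def pseudo_label_loss(cluster_logits: torch.Tensor,
                      threshold: float = 0.9) -> torch.Tensor:
    """Self-supervision from high-confidence predictions: keep only nodes
    whose soft cluster assignment exceeds the threshold, then apply
    cross-entropy against those predicted labels."""
    probs = cluster_logits.softmax(dim=1)
    conf, labels = probs.max(dim=1)
    mask = conf > threshold
    if not mask.any():
        return cluster_logits.new_zeros(())
    return F.cross_entropy(cluster_logits[mask], labels[mask])

# Usage sketch: h_pos = encoder(X, A); h_neg = encoder(X[perm], A), where
# perm is a random permutation of the node features, as in Deep Graph Infomax.
```

In the full model these two terms would act alongside the DDC clustering loss [11]; the encoder producing h_pos and h_neg (in the paper, a self-attention network over multi-hop proximity matrices) is left abstract here.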



Key words: self-supervised learning; deep clustering; self-attention mechanism; high-order mutual information; pseudo-label
Received: 30 July 2022      Published: 28 February 2023
CLC:  TP 391  
Fund:  National Natural Science Foundation of China (62062066, 61762090, 31760152, 61966036, 62266050, 62276227); Key Project of the 2022 Yunnan Provincial Basic Research Program (202201AS070015); Yunnan Provincial Reserve Talents Program for Young and Middle-aged Academic and Technical Leaders (202205AC160033)
Corresponding Authors: Bing KONG     E-mail: chaoliu@mail.ynu.edu.cn;kongbing@ynu.edu.cn
Cite this article:

Chao LIU,Bing KONG,Guo-wang DU,Li-hua ZHOU,Hong-mei CHEN,Chong-ming BAO. Deep clustering via high-order mutual information maximization and pseudo-label guidance. Journal of ZheJiang University (Engineering Science), 2023, 57(2): 299-309.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2023.02.010     OR     https://www.zjujournals.com/eng/Y2023/V57/I2/299


Fig.1 Schematic diagram of HMIPDC model
Dataset    n      K   d      ξ
ACM        3 025  3   1 870   13 128
Citeseer   3 327  6   3 703    4 552
DBLP       4 057  4     334    3 528
AMAP       7 650  8     745  119 081
Tab.1 Statistics of four benchmark datasets (n: number of nodes; K: number of clusters; d: attribute dimension; ξ: number of edges)
Dataset   Method   ACC          NMI          ARI          F1
ACM       K-means  67.31±0.71   32.44±0.46   30.60±0.69   67.57±0.74
          AE       81.83±0.08   49.30±0.16   59.64±0.16   82.01±0.08
          IDEC     85.12±0.52   56.61±1.16   62.16±1.50   85.11±0.48
          GAE      84.52±1.44   55.38±1.92   59.46±3.10   84.65±1.33
          DAEGC    88.24±0.02   63.01±0.04   68.70±0.04   88.11±0.01
          SDCN     90.45±0.18   68.31±0.25   73.91±0.40   90.42±0.19
          DFCN     90.90±0.20   64.40±0.40   74.90±0.40   90.80±0.20
          DCRN     91.93±0.20   71.56±0.61   77.56±0.52   91.94±0.20
          HMIPDC   92.12±0.15   72.18±0.52   78.04±0.39   92.13±0.14
Citeseer  K-means  39.32±3.17   16.94±3.22   13.43±3.02   36.08±3.53
          AE       57.08±0.13   27.64±0.08   29.31±0.14   53.80±0.11
          IDEC     60.49±1.42   27.17±2.40   25.70±2.65   61.62±1.39
          GAE      61.35±0.80   34.63±0.65   33.55±1.18   57.36±0.82
          DAEGC    64.90±0.07   38.71±0.08   39.21±0.09   59.56±0.06
          SDCN     65.96±0.31   38.71±0.32   40.17±0.43   63.62±0.24
          DFCN     69.50±0.20   43.90±0.20   45.50±0.30   64.30±0.20
          DCRN     70.86±0.18   45.86±0.35   47.64±0.30   65.83±0.21
          HMIPDC   71.93±0.31   46.07±0.23   48.28±0.50   66.96±0.26
DBLP      K-means  38.65±0.65   11.45±0.38    6.97±0.39   31.92±0.27
          AE       51.43±0.35   25.40±0.16   12.21±0.43   52.53±0.36
          IDEC     60.31±0.62   31.17±0.50   25.37±0.60   61.33±0.56
          GAE      61.21±1.22   30.80±0.91   22.02±1.40   61.41±2.23
          DAEGC    67.42±0.38   30.64±0.46   32.79±0.58   66.89±0.37
          SDCN     68.05±1.81   39.50±1.34   39.15±2.01   67.71±1.51
          DFCN     76.00±0.80   43.70±1.00   47.00±1.50   75.70±0.80
          DCRN     79.66±0.25   48.95±0.44   53.60±0.46   79.28±0.26
          HMIPDC   80.34±0.16   49.41±0.34   55.39±0.28   79.76±0.32
AMAP      K-means  27.22±0.76   13.23±1.33    5.50±0.44   23.96±0.51
          AE       48.25±0.08   38.76±0.30   20.80±0.47   47.87±0.20
          IDEC     47.62±0.08   37.83±0.08   19.24±0.07   47.20±0.11
          GAE      71.57±2.48   62.13±2.79   48.82±4.57   68.08±1.76
          DAEGC    75.52±0.01   63.31±0.01   59.98±0.84   70.02±0.01
          SDCN     53.44±0.81   44.85±0.83   31.21±1.23   50.66±1.49
          DFCN     76.88±0.80   69.21±1.00   59.98±0.84   71.58±0.31
          DCRN     79.94±0.13   73.70±0.24   63.69±0.20   73.82±0.12
          HMIPDC   80.83±0.78   69.47±0.46   65.23±1.64   75.87±2.08
Tab.2 Clustering results (%) of HMIPDC and eight baseline methods on four datasets
Dataset   Method     ACC          NMI          ARI          F1
ACM       AD         91.87±0.12   70.92±0.29   77.49±0.32   91.67±0.29
          AD-MI      91.94±0.15   71.50±0.33   77.57±0.36   91.95±0.15
          AD-PL      92.07±0.12   72.08±0.23   77.92±0.28   92.08±0.12
          AD-MI-PL   92.12±0.15   72.18±0.52   78.04±0.39   92.13±0.14
Citeseer  AD         64.80±0.98   39.25±0.92   40.28±0.69   62.86±0.64
          AD-MI      70.65±0.89   44.82±0.50   47.47±0.68   66.56±0.38
          AD-PL      70.67±0.58   43.90±0.73   46.08±0.91   64.57±0.78
          AD-MI-PL   71.93±0.31   46.07±0.23   48.28±0.50   66.96±0.26
DBLP      AD         78.30±0.62   46.94±0.45   52.07±0.75   77.72±0.59
          AD-MI      79.05±0.35   47.83±0.57   52.96±0.61   78.60±0.34
          AD-PL      79.21±0.80   48.66±0.71   54.24±0.53   78.83±0.57
          AD-MI-PL   80.34±0.16   49.41±0.34   55.39±0.28   79.76±0.32
AMAP      AD         72.67±1.19   61.62±1.15   54.17±1.05   66.25±2.35
          AD-MI      76.68±0.98   65.53±0.85   58.84±0.89   73.17±2.47
          AD-PL      74.36±0.91   64.11±0.95   57.23±1.36   71.64±2.16
          AD-MI-PL   80.83±0.78   69.47±0.46   65.23±1.64   75.87±2.08
Tab.3 Clustering results (%) of HMIPDC and three variants on four datasets
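Read together, the Tab.3 variants toggle the two auxiliary signals on top of the base variant AD (presumably the attention-based encoder trained with the DDC loss alone): AD-MI adds the mutual-information term, AD-PL the pseudo-label term, and AD-MI-PL both. This ablation pattern is consistent with a combined objective of the form $\mathcal{L} = \mathcal{L}_{\rm DDC} + \alpha \mathcal{L}_{\rm MI} + \beta \mathcal{L}_{\rm PL}$, where the trade-off weights $\alpha$ and $\beta$ are hypothetical placeholders rather than values reported on this page.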
Fig.2 Clustering accuracy on four datasets with different hyperparameters
Fig.3 Clustering accuracy of five methods with different training time on ACM dataset
Fig.4 2D visualization results of low-dimensional representations of different methods on ACM dataset
[1]   ZHAN Z H, LI J Y, ZHANG J. Evolutionary deep learning: a survey[J]. Neurocomputing, 2022, 483: 42-58. doi: 10.1016/j.neucom.2022.01.099
[2]   LIN Y J, GOU Y B, LIU Z T, et al. COMPLETER: incomplete multi-view clustering via contrastive prediction[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. [s.l.]: CVPR, 2021: 11174-11183.
[3]   XIE J Y, GIRSHICK R, FARHADI A. Unsupervised deep embedding for clustering analysis[C]// Proceedings of the 33rd International Conference on Machine Learning. New York: ICML, 2016: 478-487.
[4]   GUO X F, GAO L, LIU X W, et al. Improved deep embedded clustering with local structure preservation[C]// Proceedings of the 26th International Joint Conference on Artificial Intelligence. Melbourne: IJCAI, 2017: 1753-1759.
[5]   ZHANG S S, LIU J W, ZUO X, et al. Online deep learning based on auto-encoder[J]. Applied Intelligence, 2021, 51(8): 5420-5439. doi: 10.1007/s10489-020-02058-8
[6]   WU Z H, PAN S R, CHEN F W, et al. A comprehensive survey on graph neural networks[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 32(1): 4-24. doi: 10.1109/TNNLS.2020.2978386
[7]   KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks[C]// 5th International Conference on Learning Representations. Toulon: ICLR, 2017: 1-14.
[8]   WANG C, PAN S R, HU R Q, et al. Attributed graph clustering: a deep attentional embedding approach[C]// Proceedings of the 28th International Joint Conference on Artificial Intelligence. Macao: IJCAI, 2019: 3670-3676.
[9]   BO D Y, WANG X, SHI C, et al. Structural deep clustering network[C]// Proceedings of the Web Conference 2020. Taipei: WWW, 2020: 1400-1410.
[10]   DU Guo-wang, ZHOU Li-hua, WANG Li-zhen, et al. Multi-view clustering based on two-level weights[J]. Journal of Computer Research and Development, 2022, 59(4): 907-921. doi: 10.7544/issn1000-1239.20200897
[11]   KAMPFFMEYER M, LØKSE S, BIANCHI F M, et al. Deep divergence-based approach to clustering[J]. Neural Networks, 2019, 113: 91-101. doi: 10.1016/j.neunet.2019.01.015
[12]   MOLAEI S, BOUSEJIN N G, ZARE H, et al. Deep node clustering based on mutual information maximization[J]. Neurocomputing, 2021, 455: 274-282. doi: 10.1016/j.neucom.2021.03.020
[13]   CHEN Yi-qi, QIAN Tie-yun, LI Wan-li, et al. Exploiting composite relation graph convolution for attributed network embedding[J]. Journal of Computer Research and Development, 2020, 57(8): 1674-1682. doi: 10.7544/issn1000-1239.2020.20200206
[14]   KIPF T N, WELLING M. Variational graph auto-encoders[C]// Bayesian Deep Learning Workshop at the 30th Conference on Neural Information Processing Systems. Barcelona: NIPS, 2016.
[15]   KOU S W, XIA W, ZHANG X D, et al. Self-supervised graph convolutional clustering by preserving latent distribution[J]. Neurocomputing, 2021, 437: 218-226. doi: 10.1016/j.neucom.2021.01.082
[16]   TU W X, ZHOU S H, LIU X W, et al. Deep fusion clustering network[C]// 35th AAAI Conference on Artificial Intelligence. [s.l.]: AAAI, 2021: 9978-9987.
[17]   LIU Y, TU W X, ZHOU S H, et al. Deep graph clustering via dual correlation reduction[C]// 36th AAAI Conference on Artificial Intelligence. Vancouver: AAAI, 2022: 7603-7611.
[18]   BELGHAZI M I, BARATIN A, RAJESWAR S, et al. MINE: mutual information neural estimation[C]// Proceedings of the 35th International Conference on Machine Learning. Stockholm: ICML, 2018: 531-540.
[19]   HJELM R D, FEDOROV A, LAVOIE-MARCHILDON S, et al. Learning deep representations by mutual information estimation and maximization[C]// 7th International Conference on Learning Representations. New Orleans: ICLR, 2019.
[20]   VELIČKOVIĆ P, FEDUS W, HAMILTON W L, et al. Deep graph infomax[C]// 7th International Conference on Learning Representations. New Orleans: ICLR, 2019.
[21]   JING B Y, PARK C Y, TONG H H. HDMI: high-order deep multiplex infomax[C]// Proceedings of the Web Conference 2021. New York: WWW, 2021: 2414-2424.
[22]   MCGILL W J. Multivariate information transmission[J]. Transactions of the IRE Professional Group on Information Theory, 1954, 4(4): 93-111. doi: 10.1109/TIT.1954.1057469
[23]   VELIČKOVIĆ P, CUCURULL G, CASANOVA A, et al. Graph attention networks[C]// 6th International Conference on Learning Representations. Vancouver: ICLR, 2018.
[24]   RIZVE M N, DUARTE K, RAWAT Y S, et al. In defense of pseudo-labeling: an uncertainty-aware pseudo-label selection framework for semi-supervised learning[C]// 9th International Conference on Learning Representations. [s.l.]: ICLR, 2021.
[25]   HARTIGAN J A, WONG M A. A K-means clustering algorithm[J]. Journal of the Royal Statistical Society: Series C (Applied Statistics), 1979, 28(1): 100-108.
[26]   ZHAO H, YANG X, WANG Z R, et al. Graph debiased contrastive learning with joint representation clustering[C]// Proceedings of the 30th International Joint Conference on Artificial Intelligence. [s.l.]: IJCAI, 2021: 3434-3440.
[27]   LV J C, KANG Z, LU X, et al. Pseudo-supervised deep subspace clustering[J]. IEEE Transactions on Image Processing, 2021, 30: 5252-5263. doi: 10.1109/TIP.2021.3079800
[28]   BOUYER A, ROGHANI H. LSMD: a fast and robust local community detection starting from low degree nodes in social networks[J]. Future Generation Computer Systems, 2020, 113: 41-57. doi: 10.1016/j.future.2020.07.011
[29]   KINGMA D P, BA J. Adam: a method for stochastic optimization[C]// 3rd International Conference on Learning Representations. San Diego: ICLR, 2015.
[30]   VAN DER MAATEN L, HINTON G. Visualizing data using t-SNE[J]. Journal of Machine Learning Research, 2008, 9: 2579-2605.
[31]   ZHOU Li-hua, WANG Jia-long, WANG Li-zhen, et al. Heterogeneous information network representation learning: a survey[J]. Chinese Journal of Computers, 2022, 45(1): 160-189. doi: 10.11897/SP.J.1016.2022.00160