L2,1范数,一致图分解," /> L2,1范数,一致图分解,"/> L2,1 norm,consensus graph decomposition,"/> 基于稀疏一致图分解的鲁棒多视图聚类算法
Please wait a minute...
浙江大学学报(理学版)  2023, Vol. 50 Issue (5): 569-579    DOI: 10.3785/j.issn.1008-9497.2023.05.008
数学与计算机科学     
基于稀疏一致图分解的鲁棒多视图聚类算法
耿莉(),王长鹏()
长安大学 理学院,陕西 西安 710064
Robust multi-view clustering algorithm based on sparse consensus graph decomposition
Li GENG(),Changpeng WANG()
School of Science,Chang'an University,Xi'an 710064,China
 全文: PDF(1924 KB)   HTML( 2 )
摘要:

由于数据形式日益复杂,陆续涌现了大量多视图聚类算法。但现有方法存在计算复杂度较高、需要额外的后续处理步骤、构造的相似图非最优等缺点。基于此,首先提出一种基于稀疏一致图分解的单视图聚类算法,然后将其扩展为多视图聚类算法,考虑不同视图对最终结果的贡献不同,对每个视图分配适当的权重,同时利用L2.1范数,得到性能更优的一致图,在一致图基础上学习非负表示矩阵,经交替迭代得到聚类结果。最后在多个数据集上进行比较实验,验证了该算法的有效性。

关键词: 多视图聚类L2,1范数')" href="#">L2,1范数一致图分解    
Abstract:

Due to the increasing complexity of data form, multi-view clustering algorithms emerge one after another. The main disadvantages of existing methods include: the computational complexity of these methods is high; the final clustering involves additional processing steps; the similarity graph constructed may not be the optimal graph. In order to solve the above problems, a clustering algorithm based on sparse consensus graph decomposition is proposed. The algorithm is first tested on single-view data, and then extends from single-view data to multi-view data. The algorithm takes into account different contributions of different views to the final result by giving each view appropriate weight, at the same time, makes use of the L2,1 norm to obtain the consensus graph with better performance, learns the non-negative representation matrix on the basis of the consensus graph, and reveals the cluster result directly after alternation iteration. Finally, an update iterative algorithm is proposed and tested on a large number of data sets to verify the effectiveness of the algorithm.

Key words: multi-view clustering    L2,1 norm')" href="#">L2,1 norm    consensus graph decomposition
收稿日期: 2022-04-18 出版日期: 2023-09-16
CLC:  TP 181  
基金资助: 国家自然科学基金青年项目(12001057);长安大学中央高校基本科研业务费专项资金资助项目(300102122101);陕西省重点产业创新链项目(2020ZDLGY09-09);陕西省自然科学基础研究计划项目(2020JQ-346)
通讯作者: 王长鹏     E-mail: 1069159798@qq.com;cpwang@chd.edu.cn
作者简介: 耿莉(1998—),ORCID:https://orcid.org/0000-0002-8051-5236,女,硕士研究生,主要从事机器学习研究,E-mail:1069159798@qq.com.
服务  
把本文推荐给朋友
加入引用管理器
E-mail Alert
RSS
作者相关文章  
耿莉
王长鹏

引用本文:

耿莉,王长鹏. 基于稀疏一致图分解的鲁棒多视图聚类算法[J]. 浙江大学学报(理学版), 2023, 50(5): 569-579.

Li GENG,Changpeng WANG. Robust multi-view clustering algorithm based on sparse consensus graph decomposition. Journal of Zhejiang University (Science Edition), 2023, 50(5): 569-579.

链接本文:

https://www.zjujournals.com/sci/CN/10.3785/j.issn.1008-9497.2023.05.008        https://www.zjujournals.com/sci/CN/Y2023/V50/I5/569

符号说明符号说明
Gvv个视图的图矩阵I单位矩阵
S一致图相似矩阵n数据样本数
D度矩阵nv视图总数
L拉普拉斯矩阵dvv个视图的特征维数
P非负矩阵c聚类数
Q正交矩阵wvv个视图的权重
E辅助变量
表1  符号说明
图1  SCGFm流程
数据集样本数特征数聚类数
Wdbc569302
Abalone2 28284
K1a2 3401 3266
表2  单视图数据集描述
算法ACCPurity
Wdbc数据集Abalone数据集K1a数据集Wdbc数据集Abalone数据集K1a数据集
NMF0.848 20.375 40.308 80.848 20.380 30.620 8
PCA0.854 10.358 90.291 00.854 10.369 40.624 8
LLNMF0.597 90.352 80.270 90.655 40.357 90.616 3
GNMF0.853 20.360 30.304 90.853 20.370 80.619 4
RMNMF0.689 20.319 60.303 00.689 20.325 20.616 2
CFANs0.837 50.310 10.319 60.837 50.325 30.616 1
NMFAN0.854 10.388 10.321 70.854 10.389 40.625 1
SCGFs0.889 30.405 80.492 30.889 30.410 60.603 8
表3  单视图聚类性能比较
图2  SCGFs模型对参数λ的敏感性分析
数据集样本数视图数聚类数v个视图的特征维数dv
d1d2d3d4d5d6
BBCSport544253 1833 203
100leaves1 6003100646464
ORL4003404 0963 3046 750
Yale1653154 0963 3046 750
MSRCV1210671 30248512100256210
表4  多视图数据集描述
算法ACCPurityNMIFPrecisionRecallARI
GFSC0.699 90.711 90.504 70.600 30.491 20.806 50.429 0
AWP0.930 10.930 10.866 90.907 60.884 30.943 40.874 5
MCGC0.957 70.957 70.865 50.910 70.891 10.931 20.881 9
SwMC0.758 40.776 90.683 90.724 10.586 70.958 80.605 4
GSF0.553 30.577 20.433 80.540 60.377 50.951 90.302 3
MCNOGR_10.788 60.862 10.756 90.835 60.827 20.844 20.783 4
CDMGC0.735 30.759 20.693 30.714 40.565 90.968 30.591 3
SCGFm0.976 10.976 10.914 60.955 10.967 70.942 80.941 3
表5  在BBCSport数据集上的聚类性能比较
算法ACCPurityNMIFPrecisionRecallARI
GFSC0.064 30.115 20.334 40.066 80.034 90.798 60.049 7
AWP0.771 30.790 00.885 30.700 80.652 30.757 10.697 8
MCGC0.758 80.779 40.858 60.532 30.412 10.751 30.526 5
SwMC0.793 10.816 10.881 30.530 40.404 20.802 30.524 3
GSF0.895 60.903 10.946 10.817 70.767 90.874 40.815 8
MCNOGR_10.908 80.915 60.948 20.864 00.834 90.895 30.862 7
CDMGC0.886 90.902 00.948 30.794 60.710 90.910 50.792 3
SCGFm0.943 80.948 10.965 00.908 60.897 90.919 70.907 8
表6  不同算法在100leaves数据集上的聚类性能比较
算法ACCPurityNMIFPrecisionRecallARI
GFSC0.420 30.465 40.625 60.293 30.197 80.574 30.268 7
AWP0.632 50.657 50.788 50.524 10.462 90.603 90.511 6
MCGC0.675 00.737 50.821 20.469 20.339 50.758 90.452 1
SwMC0.711 10.772 60.838 70.456 00.322 60.811 80.437 5
GSF0.742 50.795 00.865 90.613 80.503 20.786 70.602 9
MCNOGR_10.815 00.855 00.903 80.741 30.687 70.803 90.734 8
CDMGC0.665 90.739 40.802 40.345 10.219 80.825 10.320 8
SCGFm0.850 00.885 00.922 40.800 70.767 30.837 20.795 9
表7  不同算法在ORL数据集上的聚类性能比较
算法ACCPurityNMIFPrecisionRecallARI
GFSC0.490 90.509 10.534 80.340 60.281 30.436 80.287 5
AWP0.503 00.509 10.591 10.407 50.366 20.459 40.364 4
MCGC0.600 00.618 20.640 20.448 50.403 90.504 20.408 5
SwMC0.651 50.654 50.670 40.452 60.389 30.543 60.410 6
GSF0.612 10.618 20.660 10.480 10.457 20.505 50.444 6
MCNOGR_10.624 20.642 40.656 90.467 30.463 10.471 50.432 3
CDMGC0.683 90.688 50.682 00.461 30.392 40.564 40.419 4
SCGFm0.703 00.703 00.706 60.537 90.514 40.563 60.506 4
表8  不同算法在Yale数据集上的聚类性能比较
算法ACCPurityNMIFPrecisionRecallARI
GFSC0.499 00.511 40.446 80.432 30.341 50.592 10.310 8
AWP0.690 50.695 20.632 10.622 90.512 30.794 40.546 4
MCGC0.781 00.785 70.697 50.689 00.614 80.783 60.631 7
SwMC0.723 80.756 20.709 60.641 60.556 50.770 40.570 6
GSF0.738 10.747 60.694 70.665 90.640 70.693 30.609 6
MCNOGR_10.738 10.795 20.691 00.657 70.632 20.685 40.600 0
CDMGC0.871 00.871 00.786 50.758 20.734 20.783 90.717 8
SCGFm0.881 00.881 00.791 50.783 30.770 00.797 00.747 7
表9  不同算法在MSRCV1数据集上的聚类性能比较
图3  SCGFm模型对参数λ和μ的敏感性分析
图4  SCGFm模型在多视图数据集上的收敛曲线
1 林宙辰, 李欢, 方聪. 机器学习中的加速一阶优化算法[M]. 北京: 机械工业出版社, 2021. doi:10.1007/978-981-15-2910-8
LIN Z C, LI H, FANG C. Accelerated Optimization for Machine Learning First-Order Algorithms[M]. Beijing: China Machine Press, 2021. doi:10.1007/978-981-15-2910-8
doi: 10.1007/978-981-15-2910-8
2 周志华. 机器学习[M]. 北京: 清华大学出版社, 2016.
ZHOU Z H. Machine Learning[M]. Beijing: Tsinghua University Press, 2016.
3 NIE F P, LI J, LI X L. Self-weighted multiview clustering with multiple graphs[C]// Twenty-Sixth International Joint Conference on Artificial Intelligence. Melbourne: AAAI Press, 2017: 2564-2570. doi:10.24963/ijcai.2017/357
doi: 10.24963/ijcai.2017/357
4 WANG H, YANG Y, LIU B, et al. A study of graph-based system for multi-view clustering[J]. Knowledge-Based Systems, 2019, 163: 1009-1019. DOI:10.1016/j.knosys.2018.10.022
doi: 10.1016/j.knosys.2018.10.022
5 ZHAN K, NIU C X, CHEN C L, et al. Graph structure fusion for multiview clustering[J]. IEEE Transactions on Knowledge and Data Engineering, 2018, 31(10): 1984-1993. DOI:10.1109/TKDE. 2018.2872061
doi: 10.1109/TKDE. 2018.2872061
6 ZHAN K, NIE F P, WANG J, et al. Multiview consensus graph clustering[J]. IEEE Transactions on Image Processing, 2018, 28(3): 1261-1270. DOI:10.1109/TIP.2018.2877335
doi: 10.1109/TIP.2018.2877335
7 HUANG S D, TSANG I W, XU Z L, et al. Measuring diversity in graph learning: A unified framework for structured multi-view clustering[J]. IEEE Transactions on Knowledge and Data Engineering, 2022, 34(12): 5869-5883. DOI:10. 1109/tkde.2021.3068461
doi: 10. 1109/tkde.2021.3068461
8 WU D Y, NIE F P, DONG X, et al. Parameter-free consensus embedding learning for multiview graph-based clustering[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(12): 7944-7950. DOI:10.1109/TNNLS.2021.3087162
doi: 10.1109/TNNLS.2021.3087162
9 MA X L, YAN X M, LIU J F, et al. Simultaneous multi-graph learning and clustering for multiview data[J]. Information Sciences, 2022, 593: 472-487. DOI:10.1016/j.ins.2022.02.018
doi: 10.1016/j.ins.2022.02.018
10 WANG Q, LIU R, CHEN M L, et al. Robust rank-constrained sparse learning: A graph-based framework for single view and multiview clustering[J]. IEEE Transactions on Cybernetics, 2022, 52(10): 10228-10239. DOI:10.1109/TCYB.2021.3067137
doi: 10.1109/TCYB.2021.3067137
11 HUANG S D, XU Z L, KANG Z, et al. Regularized nonnegative matrix factorization with adaptive local structure learning[J]. Neurocomputing, 2020, 382: 196-209. DOI:10.1016/j.neucom.2019.11.070
doi: 10.1016/j.neucom.2019.11.070
12 WANG J, TIAN F, YU H C, et al. Diverse non-negative matrix factorization for multiview data representation[J]. IEEE Transactions on Cybernetics, 2017, 48(9): 2620-2632. DOI:10.1109/tcyb.2017. 2747400
doi: 10.1109/tcyb.2017. 2747400
13 LIANG N Y, YANG Z Y, LI Z N, et al. Multi-view clustering by non-negative matrix factorization with co-orthogonal constraints[J]. Knowledge-Based Systems, 2020, 194: 105582. DOI:10.1016/j.knosys.2020.105582
doi: 10.1016/j.knosys.2020.105582
14 YANG Z Y, LIANG N Y, YAN W, et al. Uniform distribution non-negative matrix factorization for multiview clustering[J]. IEEE Transactions on Cybernetics, 2020, 51(6): 3249-3262. DOI:10. 1109/tcyb.2020.2984552
doi: 10. 1109/tcyb.2020.2984552
15 ZHAO L, YANG T, ZHANG J, et al. Co-learning non-negative correlated and uncorrelated features for multi-view data[J]. IEEE Transactions on Neural Networks and Learning Systems, 2020, 32(4): 1486-1496. DOI:10.1109/tnnls.2020.2984810
doi: 10.1109/tnnls.2020.2984810
16 SHI S J, NIE F P, WANG R, et al. Multi-view clustering via nonnegative and orthogonal graph reconstruction[J]. IEEE Transactions on Neural Networks and Learning Systems, 2023, 34(1): 201-214. DOI:10.1109/TNNLS.2021.3093297
doi: 10.1109/TNNLS.2021.3093297
17 POWELL M J D. A method for nonlinear constraints in minimization problems[J]. Optimization, 1969: 283-298.
18 HU Z X, NIE F P, WANG R, et al. Multi-view spectral clustering via integrating nonnegative embedding and spectral embedding[J]. Information Fusion, 2020, 55: 251-259. DOI:10.1016/j.inffus. 2019.09.005
doi: 10.1016/j.inffus. 2019.09.005
19 NIE F P, WANG X Q, JORDAN M, et al. The constrained Laplacian rank algorithm for graph-based clustering[C]// Proceedings of the 30th AAAI Conference on Artificial Intelligence. Phoenix, Arizona: AAAI Press, 2016: 1969-1976. doi:10.1609/aaai.v30i1.10302
doi: 10.1609/aaai.v30i1.10302
20 NIE F P, HUANG H, CAI X, et al. Efficient and robust feature selection via joint ℓ2, 1-norms minimization[C]// Proceedings of the 23th International Conference on Neural Information Processing Systems, New York: Curran Associates Inc, 2010:1813-1821. DOI:10.5555/2997046. 2997098
doi: 10.5555/2997046. 2997098
21 YUAN M, LIN Y. Model selection and estimation in regression with grouped variables[J]. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2006, 68(1): 49-67. DOI:10.1111/j.1467-9868.2005.00532.x
doi: 10.1111/j.1467-9868.2005.00532.x
22 LI Y Q, NIE F P, HUANG H, et al. Large-scale multi-view spectral clustering via bipartite graph[C]// Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence. Austin: AAAI Press, 2015: 2750-2756. doi:10.1609/aaai.v29i1.9598
doi: 10.1609/aaai.v29i1.9598
23 WANG Q, CHEN M L, NIE F P, et al. Detecting coherent groups in crowd scenes by multiview clustering[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 42(1): 46-58. DOI:10.1109/tpami.2018.2875002
doi: 10.1109/tpami.2018.2875002
24 HUANG J, NIE F P, HUANG H. A new simplex sparse learning model to measure data similarity for clustering[C]// Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence. Buenos Aires: AAAI Press, 2015: 3569-3575.
25 ZHU H B, ZHOU M C. Efficient role transfer based on Kuhn-Munkres algorithm[J]. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, 2011, 42(2): 491-496. DOI:10.1109/TSMCA.2011.2159587
doi: 10.1109/TSMCA.2011.2159587
26 SEUNG D, LEE L. Algorithms for non-negative matrix factorization[J]. Advances in Neural Information Processing Systems, 2001, 13: 556-562.
27 ABDI H, WILLIAMS L J. Principal component analysis[J]. Wiley Interdisciplinary Reviews: Computational Statistics, 2010, 2(4): 433-459. doi:10.1002/wics.101
doi: 10.1002/wics.101
28 GU Q Q, ZHOU J. Local learning regularized nonnegative matrix factorization[C]// Proceedings of the Twenty-First International Joint Conference on Artificial Intelligence. San Francisco: Morgan Kaufmann Publishers Inc, 2009: 1046-1051.
29 CAI D, HE X F, HAN J W, et al. Graph regularized nonnegative matrix factorization for data representation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(8): 1548-1560. DOI:10.1109/TPAMI.2010.231
doi: 10.1109/TPAMI.2010.231
30 HUANG J, NIE F P, HUANG H, et al. Robust manifold nonnegative matrix factorization[J]. ACM Transactions on Knowledge Discovery from Data, 2014, 8(3): 1-21. DOI:10.1145/2601434
doi: 10.1145/2601434
31 PEI X B, CHEN C B, GONG W H. Concept factorization with adaptive neighbors for document clustering[J]. IEEE Transactions on Neural Networks and Learning Systems, 2016, 29(2): 343-352. DOI:10.1109/TNNLS.2016.2626311
doi: 10.1109/TNNLS.2016.2626311
32 KANG Z, SHI G X, HUANG S D, et al. Multi-graph fusion for multi-view spectral clustering[J]. Knowledge-Based Systems, 2020, 189: 105102. DOI:10.1016/j.knosys.2019.105102
doi: 10.1016/j.knosys.2019.105102
[1] 胡东滨, 冯婧瑜, 杨艺, 易国栋. 考虑处置效果的苯系物泄露应急方案生成方法[J]. 浙江大学学报(理学版), 2022, 49(4): 457-466.
[2] 刘华玲, 恽文婧, 林蓓, 丁宇杰. 网络广告点击率预估的特征学习及技术研究进展[J]. 浙江大学学报(理学版), 2019, 46(5): 565-573.