Please wait a minute...
Front. Inform. Technol. Electron. Eng.  2018, Vol. 19 Issue (11): 1385-1396    
    
An anchor-based spectral clustering method
Qin ZHANG, Guo-qiang ZHONG , Jun-yu DONG
Department of Computer Science and Technology, Ocean University of China, Qingdao 266100, China
Science and Information College, Qingdao Agricultural University, Qingdao 266109, China
An anchor-based spectral clustering method
Qin ZHANG, Guo-qiang ZHONG , Jun-yu DONG
Department of Computer Science and Technology, Ocean University of China, Qingdao 266100, China
Science and Information College, Qingdao Agricultural University, Qingdao 266109, China
 全文: PDF 
摘要: Spectral clustering is one of the most popular and important clustering methods in pattern recognition,
machine learning, and data mining.  However, its high computational complexity limits it in applications involving
truly large-scale datasets. For a clustering problem with n samples, it needs to compute the eigenvectors of the graph
Laplacian with O(n
3
) time complexity.  To address this problem, we propose a novel method called anchor-based
spectral clustering (ASC) by employing anchor points of data.  Specifically, m (m  n) anchor points are selected
from  the dataset,  which  can basically  maintain  the intrinsic (manifold)  structure of  the original  data.   Then a
mapping matrix between the original data and the anchors is constructed.   More importantly, it is proved that
this data-anchor mapping matrix essentially preserves the clustering structure of the data.  Based on this mapping
matrix, it is easy to approximate the spectral embedding of the original data.  The proposed method scales linearly
relative to the size of the data but with low degradation of the clustering performance. The proposed method, ASC,
is compared to the classical spectral clustering and two state-of-the-art accelerating methods, i.e., power iteration
clustering and landmark-based spectral clustering,  on 10 real-world applications under three evaluation metrics.
Experimental results show that ASC is consistently faster than the classical spectral clustering with comparable
clustering performance, and at least comparable with or better than the state-of-the-art methods on both effectiveness
and efficiency.
关键词: Clustering Spectral clustering Graph Laplacian Anchors    
Abstract: Spectral clustering is one of the most popular and important clustering methods in pattern recognition,
machine learning, and data mining.  However, its high computational complexity limits it in applications involving
truly large-scale datasets. For a clustering problem with n samples, it needs to compute the eigenvectors of the graph
Laplacian with O(n
3
) time complexity.  To address this problem, we propose a novel method called anchor-based
spectral clustering (ASC) by employing anchor points of data.  Specifically, m (m  n) anchor points are selected
from  the dataset,  which  can basically  maintain  the intrinsic (manifold)  structure of  the original  data.   Then a
mapping matrix between the original data and the anchors is constructed.   More importantly, it is proved that
this data-anchor mapping matrix essentially preserves the clustering structure of the data.  Based on this mapping
matrix, it is easy to approximate the spectral embedding of the original data.  The proposed method scales linearly
relative to the size of the data but with low degradation of the clustering performance. The proposed method, ASC,
is compared to the classical spectral clustering and two state-of-the-art accelerating methods, i.e., power iteration
clustering and landmark-based spectral clustering,  on 10 real-world applications under three evaluation metrics.
Experimental results show that ASC is consistently faster than the classical spectral clustering with comparable
clustering performance, and at least comparable with or better than the state-of-the-art methods on both effectiveness
and efficiency.
Key words: Clustering    Spectral clustering    Graph Laplacian    Anchors
收稿日期: 2017-04-17 出版日期: 2019-06-13
服务  
把本文推荐给朋友
加入引用管理器
E-mail Alert
RSS
作者相关文章  
Qin ZHANG
Guo-qiang ZHONG
Jun-yu DONG

引用本文:

Qin ZHANG, Guo-qiang ZHONG , Jun-yu DONG. An anchor-based spectral clustering method. Front. Inform. Technol. Electron. Eng., 2018, 19(11): 1385-1396.

链接本文:

http://www.zjujournals.com/xueshu/fitee/CN/        http://www.zjujournals.com/xueshu/fitee/CN/Y2018/V19/I11/1385

[1] Rabia IRFAN , Sharifullah KHAN, Kashif RAJPOOT, Ali Mustafa QAMAR. TIE algorithm: a layer over clustering-based taxonomy generation for handling evolving data[J]. Front. Inform. Technol. Electron. Eng., 2018, 19(6): 763-782.
[2] A Ram CHOI, Sung Min KIM, Mee Young SUNG. Controlling the contact levels of details for fast and precise haptic collision detection[J]. Front. Inform. Technol. Electron. Eng., 2017, 18(8): 1117-1130.
[3] Ke-shi GE, Hua-you SU, Dong-sheng LI, Xi-cheng LU. Efficient parallel implementation of a density peaks clustering algorithm on graphics processing unit[J]. Front. Inform. Technol. Electron. Eng., 2017, 18(7): 915-927.
[4] Xin-zheng Xu, Shi-fei Ding, Zhong-zhi Shi, Hong Zhu. Optimizing radial basis function neural network based on rough sets and affinity propagation clustering algorithm[J]. Front. Inform. Technol. Electron. Eng., 2012, 13(2): 131-138.
[5] Jing Fan, Hai-feng Ji, Xin-xin Guan, Ying Tang. A GPU-based multi-resolution algorithm for simulation of seed dispersal[J]. Front. Inform. Technol. Electron. Eng., 2012, 13(11): 816-827.
[6] Suiang-Shyan Lee, Ja-Chen Lin. An accelerated K-means clustering algorithm using selection and erasure rules[J]. Front. Inform. Technol. Electron. Eng., 2012, 13(10): 761-768.
[7] Zhen-gong Cai, Xiao-hu Yang, Xin-yu Wang, Aleksander J. Kavs. A fuzzy formal concept analysis based approach for business component identification[J]. Front. Inform. Technol. Electron. Eng., 2011, 12(9): 707-720.
[8] Ji-ming Li, Yun-tao Qian. Clustering-based hyperspectral band selection using sparse nonnegative matrix factorization[J]. Front. Inform. Technol. Electron. Eng., 2011, 12(7): 542-549.
[9] Yan-xia Jin, Kai Zhang, James T. Kwok, Han-chang Zhou. Fast and accurate kernel density approximation using a divide-and-conquer approach[J]. Front. Inform. Technol. Electron. Eng., 2010, 11(9): 677-689.