Please wait a minute...
Front. Inform. Technol. Electron. Eng.  2012, Vol. 13 Issue (10): 761-768    DOI: 10.1631/jzus.C1200078
    
An accelerated K-means clustering algorithm using selection and erasure rules
Suiang-Shyan Lee, Ja-Chen Lin
Department of Computer Science, National Chiao Tung University, Taiwan 30050, Hsinchu
An accelerated K-means clustering algorithm using selection and erasure rules
Suiang-Shyan Lee, Ja-Chen Lin
Department of Computer Science, National Chiao Tung University, Taiwan 30050, Hsinchu
 全文: PDF 
摘要: The K-means method is a well-known clustering algorithm with an extensive range of applications, such as biological classification, disease analysis, data mining, and image compression. However, the plain K-means method is not fast when the number of clusters or the number of data points becomes large. A modified K-means algorithm was presented by Fahim et al. (2006). The modified algorithm produced clusters whose mean square error was very similar to that of the plain K-means, but the execution time was shorter. In this study, we try to further increase its speed. There are two rules in our method: a selection rule, used to acquire a good candidate as the initial center to be checked, and an erasure rule, used to delete one or many unqualified centers each time a specified condition is satisfied. Our clustering results are identical to those of Fahim et al. (2006). However, our method further cuts computation time when the number of clusters increases. The mathematical reasoning used in our design is included.
关键词: K-means clusteringAccelerationVector quantizationSelectionErasure    
Abstract: The K-means method is a well-known clustering algorithm with an extensive range of applications, such as biological classification, disease analysis, data mining, and image compression. However, the plain K-means method is not fast when the number of clusters or the number of data points becomes large. A modified K-means algorithm was presented by Fahim et al. (2006). The modified algorithm produced clusters whose mean square error was very similar to that of the plain K-means, but the execution time was shorter. In this study, we try to further increase its speed. There are two rules in our method: a selection rule, used to acquire a good candidate as the initial center to be checked, and an erasure rule, used to delete one or many unqualified centers each time a specified condition is satisfied. Our clustering results are identical to those of Fahim et al. (2006). However, our method further cuts computation time when the number of clusters increases. The mathematical reasoning used in our design is included.
Key words: K-means clustering    Acceleration    Vector quantization    Selection    Erasure
收稿日期: 2012-03-22 出版日期: 2012-10-01
CLC:  TP301.6  
服务  
把本文推荐给朋友
加入引用管理器
E-mail Alert
RSS
作者相关文章  
Suiang-Shyan Lee
Ja-Chen Lin

引用本文:

Suiang-Shyan Lee, Ja-Chen Lin. An accelerated K-means clustering algorithm using selection and erasure rules. Front. Inform. Technol. Electron. Eng., 2012, 13(10): 761-768.

链接本文:

http://www.zjujournals.com/xueshu/fitee/CN/10.1631/jzus.C1200078        http://www.zjujournals.com/xueshu/fitee/CN/Y2012/V13/I10/761

[1] Ji-ming Li, Yun-tao Qian. Clustering-based hyperspectral band selection using sparse nonnegative matrix factorization[J]. Front. Inform. Technol. Electron. Eng., 2011, 12(7): 542-549.
[2] Pejman Mowlaee, Abolghasem Sayadian, Hamid Sheikhzadeh. Split vector quantization for sinusoidal amplitude and frequency[J]. Front. Inform. Technol. Electron. Eng., 2011, 12(2): 140-154.
[3] Hong-xia Pang, Wen-de Dong, Zhi-hai Xu, Hua-jun Feng, Qi Li, Yue-ting Chen. Novel linear search for support vector machine parameter selection[J]. Front. Inform. Technol. Electron. Eng., 2011, 12(11): 885-896.
[4] Rui Wang, Wei-feng Chen, Ming-hao Pan, Hu-jun Bao. Harmonic coordinates for real-time image cloning[J]. Front. Inform. Technol. Electron. Eng., 2010, 11(9): 690-698.
[5] Yong-yi Shou, Yi-lun Huang. Combinatorial auction algorithm for project portfolio selection and scheduling to maximize the net present value[J]. Front. Inform. Technol. Electron. Eng., 2010, 11(7): 562-574.
[6] Rui Yin, Yu Zhang, Guan-ding Yu, Zhao-yang Zhang, Jie-tao Zhang. Centralized and distributed resource allocation in OFDM based multi-relay system[J]. Front. Inform. Technol. Electron. Eng., 2010, 11(6): 450-464.
[7] Pejman MOWLAEE, Abolghasem SAYADIYAN, Hamid SHEIKHZADEH. Evaluating single-channel speech separation performance in transform-domain[J]. Front. Inform. Technol. Electron. Eng., 2010, 11(3): 160-174.
[8] Yi Liu, Jian-hua Zhang, Wei Xu, Ze-min Liu. Statistical assessment of selection-based dual-hop semi-blind amplify-and-forward cooperative networks[J]. Front. Inform. Technol. Electron. Eng., 2010, 11(10): 785-792.