Hybrid Fuzzy Data Clustering Algorithm using Different Distance Metrics: A Comparative Study
O. A. Mohamed Jafar¹, R. Sivakumar²
¹O. A. Mohamed Jafar, Research Scholar, P.G. & Research Department of Computer Science, A.V.V.M. Sri Pushpam College (Autonomous), Poondi, Thanjavur, Tamil Nadu, India.
²R. Sivakumar, Associate Professor, P.G. & Research Department of Computer Science, A.V.V.M. Sri Pushpam College (Autonomous), Poondi, Thanjavur, Tamil Nadu, India.
Manuscript received on December 08, 2014. | Revised Manuscript received on December 15, 2014. | Manuscript published on January 05, 2014. | PP: 218-225 | Volume-3 Issue-6, January 2014. | Retrieval Number: F2046013614/2014©BEIESP
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Clustering is the process of grouping a set of objects into a number of clusters. The K-means and Fuzzy c-means (FCM) algorithms have been used extensively in cluster analysis. However, they are sensitive to noise and do not incorporate any information about spatial context. The Penalized Fuzzy c-means (PFCM) algorithm was developed to overcome these drawbacks of FCM. The Euclidean distance measure is commonly used in traditional clustering algorithms. This paper presents a comparative study of a hybrid fuzzy data clustering algorithm under three distance metrics: Euclidean, City Block and Chessboard. The K-means, FCM and hybrid K-PFCM algorithms are tested on five real-world benchmark data sets from the UCI Machine Learning Repository. The experimental results show that the FCM and hybrid K-PFCM algorithms perform well with the Chessboard distance. The hybrid K-PFCM algorithm achieves a better objective function value than the K-means algorithm. The performance of the algorithms is also evaluated using standard cluster validity measures. The hybrid K-PFCM algorithm is effective under the criteria of partition coefficient (PC), partition entropy (PE) and intra-cluster distance.
Keywords: Data clustering, K-means, Fuzzy c-means, Penalized Fuzzy c-means, Hybrid K-PFCM, Distance metrics, Cluster validity measures.
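For readers unfamiliar with the three distance metrics compared in the abstract, the following minimal Python sketch shows their standard textbook definitions. It is an illustration only, not the authors' implementation; the function names and the sample points are hypothetical and do not come from the paper.

```python
import math

# Standard definitions of the three metrics named in the abstract.
# Function names and sample data are illustrative assumptions.

def euclidean(x, y):
    """Square root of the sum of squared coordinate differences (L2 norm)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def city_block(x, y):
    """Sum of absolute coordinate differences (L1, or Manhattan, norm)."""
    return sum(abs(a - b) for a, b in zip(x, y))

def chessboard(x, y):
    """Largest absolute coordinate difference (L-infinity, or Chebyshev, norm)."""
    return max(abs(a - b) for a, b in zip(x, y))

# Hypothetical data point and cluster centre.
x = [1.0, 2.0, 3.0]
y = [4.0, 6.0, 3.0]

print(euclidean(x, y))   # 5.0
print(city_block(x, y))  # 7.0
print(chessboard(x, y))  # 4.0
```

For any pair of points the Chessboard distance is never larger than the Euclidean distance, which in turn is never larger than the City Block distance, so objective function values computed under different metrics are not directly comparable in magnitude.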