
A Fuzzy C-Means Clustering Based on Density Sensitive Distance Metric with a Novel Penalty Term

Journal: International Journal of Science and Research (IJSR) (Vol.3, No. 5)

Publication Date:

Authors : ; ; ;

Page : 347-346

Keywords : Clustering; FCM; PFCM; MPFCM; Dataset;


Abstract

A cluster is a group of objects that are similar to one another and dissimilar to the objects in other clusters. The main objective of clustering is to discover collections of comparable objects based on a similarity metric. The similarity metric is generally specified by the user according to the requirements for obtaining better results. The distance between two objects in a cluster should be well defined by an effective distance measure. Several approaches are available for clustering objects, among them Penalty Fuzzy C-Means, but these techniques are not suitable for all applications or for very large data collections. The proposed approach uses an effective fuzzy clustering technique. Fuzzy Possibilistic C-Means (FPCM) is an effective algorithm for clustering unlabeled data that produces both membership and typicality values during the clustering process. Penalized and compensated terms are embedded in the objective function of the modified fuzzy possibilistic clustering method to construct the Penalized FPCM (PFPCM). To improve clustering accuracy, the third proposed approach uses Improved Penalized Fuzzy C-Means (IPFCM). Its penalty term takes the spatial dependence of the objects into account; it is inspired by the Neighborhood Expectation Maximization (NEM) algorithm and modified according to the FCM criterion. The proposed IPFCM algorithm uses improved penalized constraints, which help in calculating the distance between clusters and in increasing clustering accuracy. The performance of the proposed approaches is evaluated on datasets from the University of California, Irvine (UCI) Machine Learning Repository, such as Iris, Wine, Lung Cancer, and Lymphography. The evaluation criteria are clustering accuracy, Mean Squared Error (MSE), execution time, and convergence behavior.
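
For orientation, the baseline that the penalized variants in the abstract build on can be sketched as follows. This is a minimal pure-Python sketch of standard Fuzzy C-Means with Euclidean distance only. It does not include the paper's penalty term, typicality values, or density sensitive distance metric, none of which are specified in the abstract. The function name `fuzzy_c_means`, its parameters, and the toy data are assumptions made for illustration.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Standard FCM (illustrative sketch): returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Centre update: mean of the points weighted by u_ik ** m.
        centers = []
        for k in range(n_clusters):
            weights = [u[i][k] ** m for i in range(n)]
            w_sum = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / w_sum
                for d in range(dim)
            ])

        # Membership update from Euclidean distances to each centre.
        max_change = 0.0
        for i, p in enumerate(points):
            dists = [math.dist(p, c) for c in centers]
            if min(dists) == 0.0:
                # Point coincides with a centre: full membership there.
                hit = dists.index(0.0)
                new_row = [1.0 if k == hit else 0.0 for k in range(n_clusters)]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(inv)
                new_row = [v / total for v in inv]
            change = max(abs(a - b) for a, b in zip(new_row, u[i]))
            max_change = max(max_change, change)
            u[i] = new_row

        # Stop once no membership moved by more than tol.
        if max_change < tol:
            break

    return centers, u


# Toy data (made up for illustration): two well-separated groups in 2-D.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
centers, memberships = fuzzy_c_means(data, n_clusters=2)
labels = [row.index(max(row)) for row in memberships]
```

Each row of `memberships` gives one point's degree of belonging to every cluster, and `labels` hardens those degrees by taking the largest. A penalized variant of the kind the abstract describes would add a further term to the objective and therefore change the membership update; that change is not shown here.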

Last modified: 2014-07-01 21:17:13