An Efficient Divergence and Distribution Based Similarity Measure for Clustering Of Uncertain Data

Journal: International Journal of Science and Research (IJSR) (Vol.3, No. 3)

Publication Date:

Authors : ; ;

Page : 333-339

Keywords : Clustering; uncertain data; Kernel skew Divergence and distribution;

Abstract

Data mining is the extraction of hidden predictive information from large databases, and clustering is one of its most popular techniques. Clustering uncertain data, one of the essential tasks in mining uncertain data, poses significant challenges both in modeling similarity between uncertain objects and in developing efficient computational methods. Previous methods extend traditional partitioning clustering methods. Such methods cannot handle uncertain objects that are geometrically indistinguishable, such as products with the same mean but very different variances in customer ratings. Surprisingly, probability distributions, which are essential characteristics of uncertain objects, have not been considered in measuring similarity between uncertain objects. The existing method uses the well-known Kullback-Leibler divergence to measure similarity between uncertain objects in both the continuous and discrete cases, and integrates it into partitioning and density-based clustering methods to cluster uncertain objects. However, computing this divergence is very costly or even infeasible. The proposed work introduces kernel skew divergence to measure similarity between uncertain objects in both the continuous and discrete cases. To further speed up the computation, cluster similarity is measured with the Poisson distribution, a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time and/or space.
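The customer-rating example in the abstract can be illustrated with a minimal sketch for the discrete case. The abstract does not define "kernel skew divergence", so the code below assumes one common form of skew divergence, KL(p || alpha*q + (1 - alpha)*p), which may differ from the paper's definition; the function names, the rating distributions, and the value alpha = 0.99 are all hypothetical choices made for illustration.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # terms with p_i = 0 contribute nothing
        if qi == 0.0:
            return math.inf  # p has mass where q has none
        total += pi * math.log(pi / qi)
    return total

def skew_divergence(p, q, alpha=0.99):
    """Assumed form: KL(p || alpha*q + (1 - alpha)*p).

    Mixing a little of p into q keeps the result finite even when
    q assigns zero probability to outcomes that p can produce.
    """
    mixed = [alpha * qi + (1.0 - alpha) * pi for pi, qi in zip(p, q)]
    return kl_divergence(p, mixed)

def mean(p, values):
    """Expected value of a discrete distribution p over the given values."""
    return sum(v * pi for v, pi in zip(values, p))

# Hypothetical rating distributions over 1..5 stars.
ratings = [1, 2, 3, 4, 5]
product_a = [0.0, 0.0, 1.0, 0.0, 0.0]  # every customer gives 3 stars
product_b = [0.5, 0.0, 0.0, 0.0, 0.5]  # customers split between 1 and 5

print("means:", mean(product_a, ratings), mean(product_b, ratings))
print("KL(a || b):  ", kl_divergence(product_a, product_b))
print("skew(a || b):", skew_divergence(product_a, product_b))
```

Both products have the same mean rating, so a distance based only on the mean cannot separate them, while a divergence over the full distributions can. In this sketch the plain Kullback-Leibler divergence is infinite because product_b never receives a 3-star rating, whereas the skewed variant stays finite.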

Last modified: 2014-04-02 01:33:31