Efficient Seed and K-Value Selection in K-Means Clustering using Relative Weight and New Distance Metric
Journal: International Journal of Science and Research (IJSR) (Vol. 6, No. 1)
Publication Date: 2017-01-05
Authors: Premsagar Dandge; Aruna Gupta
Page: 2084-2087
Keywords: k-means clustering; categorical data; dynamic attribute weight; frequency probability; data density
Abstract
The k-means clustering algorithm groups data points that are similar to each other. It is popular because of its simplicity and its tendency to converge. The distance metrics generally used with this algorithm, such as Euclidean and Manhattan distance, are best suited to numeric data such as geometric coordinates. These metrics do not give foolproof results for categorical data. We use a new distance metric to calculate the similarity between categorical data points. The new metric uses dynamic attribute weight and frequency probability to differentiate the data points, which ensures that the categorical properties of the attributes are taken into account during clustering. The k-means algorithm also needs to know the number of clusters in the dataset before cluster analysis begins. We use a different technique, based on the data density distribution, to find the number of clusters. In addition, the initial cluster seeds are usually selected at random, which may increase the number of iterations required to reach a convergent solution. In the proposed method, seeds are selected according to the density distribution, which ensures that the initial seeds are evenly distributed. This reduces the overall number of iterations required for convergence.
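The abstract does not give the formulas for the distance metric or the density measure, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a Gini-style diversity score, computed from value frequencies, as the attribute weight; a weighted mismatch count as the distance; and a greedy neighbour-count rule for picking seeds. The function names and the `radius` parameter are hypothetical.

```python
from collections import Counter


def attribute_weights(rows):
    """Assumed weighting: each attribute's weight is its Gini diversity
    (1 - sum of squared value frequencies), normalised to sum to 1."""
    n = len(rows)
    diversity = []
    for j in range(len(rows[0])):
        counts = Counter(row[j] for row in rows)
        diversity.append(1.0 - sum((c / n) ** 2 for c in counts.values()))
    total = sum(diversity)
    if total == 0:
        # Every attribute is constant: fall back to uniform weights.
        return [1.0 / len(diversity)] * len(diversity)
    return [d / total for d in diversity]


def weighted_mismatch(a, b, weights):
    """Assumed distance: sum of the weights of attributes whose values differ."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def density_seeds(rows, weights, radius):
    """Assumed seeding rule: rank points by how many points lie within
    `radius`, then greedily keep points farther than `radius` from every
    seed already chosen. The number of seeds serves as an estimate of k."""
    density = [
        sum(1 for q in rows if weighted_mismatch(p, q, weights) <= radius)
        for p in rows
    ]
    order = sorted(range(len(rows)), key=lambda i: -density[i])
    seeds = []
    for i in order:
        if all(weighted_mismatch(rows[i], rows[s], weights) > radius for s in seeds):
            seeds.append(i)
    return seeds


rows = [
    ("red", "small"), ("red", "small"), ("red", "medium"),
    ("blue", "large"), ("blue", "large"), ("green", "large"),
]
weights = attribute_weights(rows)
seeds = density_seeds(rows, weights, radius=0.6)
print("estimated k:", len(seeds), "seed rows:", [rows[i] for i in seeds])
```

Under these assumptions the seed count depends strongly on `radius`, and an isolated outlier would become a seed of its own, so a practical version would need a minimum-density threshold.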