An efficient distance estimation and centroid selection based on k-means clustering for small and large dataset
Journal: International Journal of Advanced Technology and Engineering Exploration (IJATEE) (Vol.7, No. 73)
Publication Date: 2020-12-29
Authors : Girdhar Gopal Ladha; Ravi Kumar Singh Pippal
Page : 234-240
Keywords : K-means; Distance estimation; Centroid selection; Distance methods
Abstract
This paper presents an efficient distance estimation and centroid selection approach based on k-means clustering for small and large datasets. The PIMA Indian diabetes dataset was used for the complete study and analysis. Data pre-processing was performed on the dataset first, followed by distance and centroid estimation. Initial centroids were selected at random and then updated until the specified number of iterations (epochs) was reached. The distance measures used are Euclidean distance (Ed), Pearson Coefficient distance (PCd), Chebyshev distance (Csd) and Canberra distance (Cad). The results indicate that all the distance measures performed approximately equally well for clustering, but in terms of time Cad outperformed the other measures.
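The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `epochs` and `seed` parameters, and the toy data are assumptions made for illustration, and the Pearson coefficient distance is assumed here to be one minus the Pearson correlation, which may differ from the paper's exact definition. The centroid update uses the per-cluster mean for every distance measure, which is also an assumption.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance (Ed): square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def chebyshev(a, b):
    """Chebyshev distance (Csd): largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def canberra(a, b):
    """Canberra distance (Cad): sum of |x - y| / (|x| + |y|), skipping 0/0 terms."""
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom > 0:
            total += abs(x - y) / denom
    return total


def pearson_distance(a, b):
    """Pearson coefficient distance (PCd), ASSUMED here to be 1 - correlation."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # correlation undefined; treated as uncorrelated (assumption)
    return 1.0 - cov / math.sqrt(var_a * var_b)


def kmeans(points, k, distance, epochs=10, seed=0):
    """k-means with random initial centroids and a fixed number of epochs."""
    rng = random.Random(seed)
    # Initial selection: pick k distinct data points at random as centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(epochs):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical toy data (not the PIMA dataset): two well-separated groups.
toy_points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
toy_centroids, toy_labels = kmeans(toy_points, k=2, distance=euclidean)
```

Passing `chebyshev`, `canberra`, or `pearson_distance` as the `distance` argument swaps the measure without changing the rest of the loop, which is one simple way to compare the measures on the same data.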