Variability analysis of the hierarchical clustering algorithms and its implication on consensus clustering
Journal: International Journal of Advanced Engineering Research and Science (Vol. 4, No. 5)
Publication Date: 2017-05-08
Authors : Lucia Sousa;
Page : 118-131
Keywords : Data Mining; Cluster analysis; Consensus clustering; Hierarchical clustering algorithm; Validation indices;
Abstract
Clustering is one of the most important unsupervised learning tools when no prior knowledge about the data set is available. Clustering algorithms aim to find the underlying structure of a data set, taking into account the clustering criteria, the properties of the data, and the specific way in which the data are compared. Many clustering algorithms have been proposed in the literature, all with a common goal: given a set of objects, group similar objects in the same cluster and dissimilar objects in different clusters. Hierarchical clustering algorithms are of great importance in data analysis because they provide knowledge about the data structure. Because the resulting partitions are represented graphically through a dendrogram, they may give more information than the clustering obtained by non-hierarchical algorithms. Using different clustering methods on the same data set, or the same clustering method with different initializations (different parameters), can produce different clusterings. Several studies have therefore been concerned with validating the resulting clusterings by analyzing them in terms of stability or variability, and there has also been increasing interest in the problem of determining a consensus clustering. This work empirically analyzes the variability of the clusterings delivered by hierarchical algorithms and also investigates some consensus clustering techniques. Based on the variability of the hierarchical clustering, we select the most suitable consensus clustering technique existing in the literature. Results on a range of synthetic and real data sets reveal significant differences in the variability of hierarchical clustering, as well as different performances of the consensus clustering techniques.
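The point that different hierarchical methods can partition the same data differently can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy one-dimensional data, the naive agglomerative routine, the use of the Rand index as an agreement measure, and the co-association matrix as one common building block of consensus clustering. The names `agglomerative`, `rand_index`, and `co_association` are hypothetical helpers.

```python
import math
from itertools import combinations


def agglomerative(points, k, linkage="single"):
    """Naive agglomerative clustering; returns one label per point (k clusters)."""
    clusters = [[i] for i in range(len(points))]

    def cluster_dist(c1, c2):
        # All pairwise distances between members of the two clusters.
        d = [math.dist(points[a], points[b]) for a in c1 for b in c2]
        if linkage == "single":
            return min(d)
        if linkage == "complete":
            return max(d)
        if linkage == "average":
            return sum(d) / len(d)
        raise ValueError(f"unknown linkage: {linkage}")

    # Repeatedly merge the two closest clusters until k remain.
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_dist(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for m in members:
            labels[m] = label
    return labels


def rand_index(a, b):
    """Fraction of point pairs on which two partitions agree (same/different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def co_association(partitions):
    """Entry [i][j] is the fraction of partitions that put points i and j together."""
    n = len(partitions[0])
    return [
        [sum(p[i] == p[j] for p in partitions) / len(partitions) for j in range(n)]
        for i in range(n)
    ]


# Toy data (assumed): points on a line with steadily growing gaps.
points = [(0.0,), (1.0,), (2.1,), (3.3,), (4.6,), (6.0,)]

partitions = {
    method: agglomerative(points, k=2, linkage=method)
    for method in ("single", "complete", "average")
}
for method, labels in partitions.items():
    print(method, labels)

agreement = rand_index(partitions["single"], partitions["complete"])
print("Rand index, single vs complete:", round(agreement, 3))

co = co_association(list(partitions.values()))
print("co-association of the last two points:", round(co[4][5], 3))
```

On this toy data, single linkage chains the first five points together and isolates the last one, while complete and average linkage split off the last two points as a pair. The co-association matrix summarizes how often each pair is grouped together across the three partitions; a consensus method could then cluster on that matrix, which is one option among several and not necessarily a technique evaluated in the paper.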
Last modified: 2017-07-03 03:16:51