An Efficient Hierarchical Clustering Algorithm for Large Datasets
Journal: Austin Journal of Proteomics, Bioinformatics & Genomics (Vol. 2, No. 1)
Publication Date: 2015-02-23
Authors : Olga Tanaseichuk; Alireza Hadj Khodabakshi; Dimitri Petrov; Jianwei Che; Tao Jiang; Bin Zhou; Andrey Santrosyan; Yingyao Zhou
Page : 1-6
Keywords : Hybrid hierarchical clustering; Hierarchical clustering; k-means clustering; Large datasets
Abstract
Hierarchical clustering is a widely adopted unsupervised learning algorithm for discovering intrinsic groups embedded within a dataset. Standard implementations of the exact algorithm for hierarchical clustering require $O(n^2)$ time and $O(n^2)$ memory and thus are unsuitable for processing datasets containing more than 20,000 objects. In this study, we present a hybrid hierarchical clustering algorithm requiring approximately $O(n\sqrt{n})$ time and $O(n\sqrt{n})$ memory while still preserving the most desirable properties of the exact algorithm. The algorithm was capable of clustering one million compounds within a few hours on a single processor. The clustering program is freely available to the research community.
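To make the memory argument in the abstract concrete, the following back-of-the-envelope sketch compares the storage needed for all pairwise distances against a budget on the order of n√n distances. It is an illustration only, not the authors' program: the function names are hypothetical, one 64-bit float per stored distance is assumed, and the constant factor on the n√n bound is assumed to be 1.

```python
import math

BYTES_PER_DISTANCE = 8  # assumption: one 64-bit float per stored distance


def exact_pairwise_bytes(n: int) -> int:
    """Bytes for a condensed pairwise-distance matrix with n*(n-1)/2 entries."""
    return n * (n - 1) // 2 * BYTES_PER_DISTANCE


def hybrid_bytes(n: int) -> int:
    """Bytes for about n*sqrt(n) stored distances (constant factor assumed to be 1)."""
    return n * math.isqrt(n) * BYTES_PER_DISTANCE


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to gibibytes."""
    return num_bytes / 2**30


for n in (20_000, 1_000_000):
    exact = to_gib(exact_pairwise_bytes(n))
    hybrid = to_gib(hybrid_bytes(n))
    print(f"n={n:>9,}: all pairs ~{exact:,.1f} GiB, n*sqrt(n) ~{hybrid:,.2f} GiB")
```

Under these assumptions, storing all pairs takes roughly 1.5 GiB at 20,000 objects but several terabytes at one million objects, whereas an n√n budget at one million objects stays in the single-digit gigabyte range.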