A REVIEW ON CLUSTERING-BASED FEATURE SUBSET SELECTION ALGORITHM FOR HIGH DIMENSIONAL DATA
Journal: International Journal of Engineering Sciences & Research Technology (IJESRT) (Vol. 4, No. 1)
Publication Date: 2015-01-30
Authors : Madhuri B Patil; Anil Rao;
Page : 199-202
Keywords : Minimum Spanning Tree; Good Feature subset selection; Clustering;
Abstract
In high-dimensional (HD) datasets, feature selection involves identifying a subset of good features, here by using a clustering approach. Feature selection removes irrelevant and redundant features, which is an essential data preprocessing activity for effective data mining. A clustering-based approach to good feature selection is evaluated from both the efficiency and the effectiveness points of view. Efficiency relates to the time required to find a subset of good features, while effectiveness relates to the quality of that subset. The review considers feature selection algorithms for high-dimensional data, which aim to produce results compatible with those obtained from the original full feature set, in terms of search strategies, evaluation criteria, and data mining tasks. It reveals unattempted combinations and provides guidelines for choosing among feature selection algorithms. The FAST algorithm for feature subset selection works in two steps. The first step partitions the features into clusters using graph-theoretic clustering methods. The second step selects the most useful features, those strongly related to the target classes, to form the subset of good features. To increase the efficiency of FAST, we adopted an efficient Minimum Spanning Tree clustering method. Based on some of these criteria, a clustering-based feature selection algorithm for HD data is proposed and experimentally evaluated in this paper.
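The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several choices here are assumptions made only for illustration: absolute Pearson correlation stands in for whatever relevance and redundancy measure the paper uses, the distance between two features is taken as one minus that correlation, clusters are formed by cutting Minimum Spanning Tree edges longer than a hypothetical `cut` threshold, and exactly one representative feature is kept per cluster. The function names `abs_corr`, `minimum_spanning_tree`, and `select_features` are likewise hypothetical.

```python
import math


def abs_corr(x, y):
    """Absolute Pearson correlation (assumed stand-in for the paper's measure)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation information
    return abs(sxy) / math.sqrt(sxx * syy)


def minimum_spanning_tree(n, weight):
    """Prim's algorithm on a complete graph; returns a list of (u, v, w) edges."""
    if n == 0:
        return []
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Pick the cheapest edge leaving the current tree.
        u, v = min(
            ((i, j) for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda e: weight[e[0]][e[1]],
        )
        edges.append((u, v, weight[u][v]))
        in_tree.add(v)
    return edges


def select_features(columns, target, cut=0.5):
    """Step 1: cluster features via an MST. Step 2: keep one feature per cluster."""
    n = len(columns)
    dist = [[1.0 - abs_corr(columns[i], columns[j]) for j in range(n)]
            for i in range(n)]
    mst = minimum_spanning_tree(n, dist)

    # Cutting long MST edges leaves a forest; each tree is one cluster.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for u, v, w in mst:
        if w <= cut:  # keep short edges, i.e. strongly correlated features
            parent[find(u)] = find(v)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)

    # From each cluster keep the feature most correlated with the target.
    return sorted(
        max(members, key=lambda i: abs_corr(columns[i], target))
        for members in clusters.values()
    )


if __name__ == "__main__":
    f0 = [1, 2, 3, 4, 5, 6]
    f1 = [2, 4, 6, 8, 10, 13]   # nearly a rescaled copy of f0 (redundant)
    f2 = [5, 1, 4, 2, 6, 3]     # weakly related to f0 and f1
    labels = [0, 0, 0, 1, 1, 1]
    print(select_features([f0, f1, f2], labels))
```

In this sketch the two near-duplicate columns fall into one cluster and only one of them survives, which mirrors the removal of redundant features. Filtering out irrelevant features, which the abstract also mentions, is not modeled here.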
Last modified: 2015-01-17 21:13:40