An Ensemble Multi-Label Feature Selection Algorithm Based on Information Entropy
Journal: The International Arab Journal of Information Technology (Vol. 11, No. 4)
Publication Date: 2014-07-01
Authors: Shining Li; Zhenhai Zhang; Jiaqi Duan
Pages: 379-386
Keywords: Data mining; ensembles; feature extraction; feature selection; information entropy; multi-label classification
Abstract
In multi-label classification, feature selection removes redundant and irrelevant features, which makes classifiers faster and improves their prediction performance. Currently, most feature selection algorithms for multi-label classification depend on a concrete classifier, which leads to high computational complexity. This paper therefore proposes an Ensemble Multi-label Feature Selection algorithm based on Information Entropy (EMFSIE), which is independent of any concrete classifier. Its core idea is to use information gain to evaluate the correlation between each feature and the label set, so that useful features can be selected more effectively. We calculate the information gain in an ensemble framework and select useful features according to a threshold value determined by the effective factor. We validate EMFSIE on four datasets from two domains using four different multi-label classifiers. The experimental results and their analysis show preliminarily that EMFSIE can remove more than 70% of the original features, which makes the classifiers faster, while keeping their prediction performance as good as before. On three datasets it even improves prediction performance, according to two-tailed paired t-tests at the 0.05 significance level.
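The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea (score each feature by its information gain against the label set, average the scores over an ensemble, keep features above a threshold), not the authors' EMFSIE implementation. Everything specific here is an assumption made for illustration: features and labels are discrete, the ensemble is built from bootstrap resamples, a feature's score is the sum of its information gain over all labels, and the threshold is `factor` times the mean score. The function names and the `factor` parameter are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain(feature, label):
    """IG(label; feature) = H(label) - H(label | feature), discrete inputs."""
    n = len(feature)
    groups = {}
    for f, l in zip(feature, label):
        groups.setdefault(f, []).append(l)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(label) - conditional


def feature_scores(X, Y):
    """Score each feature column by its summed IG over all label columns.

    Summing over labels is an illustrative assumption, not the paper's rule.
    """
    label_cols = [[row[k] for row in Y] for k in range(len(Y[0]))]
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        scores.append(sum(information_gain(col, lab) for lab in label_cols))
    return scores


def ensemble_select(X, Y, n_rounds=10, factor=1.0, seed=0):
    """Average IG scores over bootstrap resamples, then threshold them.

    Assumed threshold rule: keep feature j if its averaged score is at
    least factor * (mean averaged score over all features).
    Returns the indices of the kept features.
    """
    rng = random.Random(seed)
    n = len(X)
    totals = [0.0] * len(X[0])
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        sample_scores = feature_scores([X[i] for i in idx], [Y[i] for i in idx])
        for j, s in enumerate(sample_scores):
            totals[j] += s
    averaged = [t / n_rounds for t in totals]
    threshold = factor * sum(averaged) / len(averaged)
    return [j for j, s in enumerate(averaged) if s >= threshold]
```

Because the scores are computed from the data alone, no classifier is trained during selection, which is the property the abstract emphasizes. Continuous features would need to be discretized before using a sketch like this.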