An Ensemble Multi-Label Feature Selection Algorithm Based on Information Entropy

Journal: The International Arab Journal of Information Technology (Vol.11, No. 4)

Publication Date:

Authors : ; ; ;

Page : 379-386

Keywords : Data mining; ensembles; feature extraction; feature selection; information entropy; multi-label classification

Abstract

In multi-label classification, feature selection removes redundant and irrelevant features, which makes classifiers faster and improves their prediction performance. Currently, most feature selection algorithms for multi-label classification depend on a concrete classifier, which leads to high computational complexity. Hence this paper proposes an Ensemble Multi-label Feature Selection algorithm based on Information Entropy (EMFSIE), which is independent of any concrete classifier. Its core idea is to employ information gain to evaluate the correlation between each feature and the label set, so that useful features can be selected more effectively. We calculate the information gain in an ensemble framework and select useful features according to a threshold value determined by the effective factor. We validate EMFSIE on four datasets from two domains using four different multi-label classifiers. The experimental results and their analysis show preliminarily that EMFSIE not only removes more than 70% of the original features, which makes the classifiers faster, but also keeps the prediction performance of the classifiers as good as before, and even improves it on three datasets under two-tailed paired t-tests at the 0.05 significance level.
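
The abstract gives only the outline of the approach, so the following Python sketch illustrates the general idea rather than the EMFSIE algorithm itself. It scores each discrete feature by its information gain summed over all labels, averages the scores over bootstrap resamples as a stand-in for the ensemble framework, and keeps the features whose averaged score reaches a threshold. The function names, the bootstrap scheme, and the threshold rule (a hypothetical `factor` multiplied by the mean score) are all assumptions made for illustration; the abstract does not specify how the effective factor determines the threshold.

```python
import math
import random
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain(feature, label):
    """IG(label; feature) = H(label) - H(label | feature) for discrete columns."""
    n = len(feature)
    groups = {}
    for f, y in zip(feature, label):
        groups.setdefault(f, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(label) - conditional


def feature_scores(X, Y):
    """Score each feature by its information gain summed over all labels."""
    label_columns = [[row[k] for row in Y] for k in range(len(Y[0]))]
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        scores.append(sum(information_gain(column, lab) for lab in label_columns))
    return scores


def ensemble_select(X, Y, n_rounds=10, factor=1.0, seed=0):
    """Average scores over bootstrap resamples and threshold them.

    Assumption: threshold = factor * mean(averaged scores). This rule is
    hypothetical and is not taken from the paper.
    """
    rng = random.Random(seed)
    n = len(X)
    totals = [0.0] * len(X[0])
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of rows
        scores = feature_scores([X[i] for i in idx], [Y[i] for i in idx])
        totals = [t + s for t, s in zip(totals, scores)]
    averaged = [t / n_rounds for t in totals]
    threshold = factor * sum(averaged) / len(averaged)
    selected = [j for j, s in enumerate(averaged) if s >= threshold]
    return selected, averaged


# Toy data: feature 0 copies the first label, feature 1 is constant,
# feature 2 alternates independently of both labels.
Y = [[a, b] for a in (0, 1) for b in (0, 1) for _ in range(20)]
X = [[y[0], 0, i % 2] for i, y in enumerate(Y)]
selected, averaged = ensemble_select(X, Y)
print(selected, averaged)
```

Because the scoring uses only the features and the labels, no classifier is trained during selection, which is the property the abstract emphasizes. Continuous features would need to be discretized before this sketch applies.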

Last modified: 2019-11-17 21:20:53