DOCUMENT CLASSIFICATION USING SVM COMBINED WITH OPTIMAL FEATURE SELECTION
Journal: JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (JCET) (Vol. 9, No. 3)
Publication Date: 2018-06-28
Authors : DEEPANSHU; RAMESH KAIT;
Page : 250-258
Keywords : Feature Selection Techniques; Information Gain; Naïve Bayes; Support Vector Machine; Text Classification;
Abstract
Working with a well-chosen subset of features is more effective than working with the initial feature set, which can be very large (hundreds or even thousands of features). Feature selection is one of the most important steps in text classification and contributes to both classification accuracy and processing time. The "curse of dimensionality" can degrade classification accuracy; to overcome it, feature reduction is required, which comes in two forms: feature selection and feature extraction. In this paper, we apply feature selection based on information gain (IG) to the mini-newsgroups dataset and then compare the performance of three classifiers, namely Support Vector Machine (SVM), Naïve Bayes (NB), and k-nearest neighbor (kNN), to determine which of the three performs best. We also compare SVM with SVMfs (SVM with feature selection based on information gain) in terms of classification accuracy, precision, recall, and F-measure.
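As an illustration of the information-gain criterion mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes each document is represented as a set of terms and scores a term by how much knowing its presence or absence reduces the entropy of the class labels. The function names (`entropy`, `information_gain`, `select_top_k`) and the toy corpus are hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """IG of the binary feature 'term occurs in the document'.

    IG(t) = H(C) - [P(t) * H(C | t) + P(not t) * H(C | not t)]
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


def select_top_k(docs, labels, k):
    """Return the k terms with the highest information gain."""
    vocab = sorted(set().union(*docs))
    ranked = sorted(vocab,
                    key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:k]


# Toy corpus (hypothetical): each document is a set of terms.
docs = [
    {"goal", "team", "match"},
    {"team", "score", "match"},
    {"cpu", "memory", "team"},
    {"cpu", "disk", "memory"},
]
labels = ["sport", "sport", "tech", "tech"]

selected = select_top_k(docs, labels, 3)
```

In this toy corpus, terms that occur in only one class (such as "match" or "cpu") separate the classes perfectly and receive the maximum gain, while "team", which appears in both classes, scores lower. The reduced vocabulary in `selected` would then be used to build the feature vectors passed to a classifier such as SVM.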
Last modified: 2018-09-15 20:24:33