DOCUMENT CLASSIFICATION USING SVM COMBINED WITH OPTIMAL FEATURE SELECTION

Journal: JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (JCET) (Vol.9, No. 3)

Publication Date:

Authors : ; ;

Page : 250-258

Keywords : Feature Selection Techniques; Information Gain; Naïve Bayes; Support Vector Machine; Text Classification;

Abstract

A properly selected subset of features is more effective than the initial feature set, which is very large (it may reach hundreds or even thousands of features). Feature selection is one of the most important steps in text classification and improves both accuracy and processing time. The “curse of dimensionality” can degrade classification accuracy; overcoming it requires feature reduction, which is of two types: feature selection and feature extraction. In this paper, we apply feature selection based on information gain (IG) to the mini-newsgroups dataset, then compare the performance of three classifiers, namely Support Vector Machine (SVM), Naïve Bayes (NB), and k-nearest neighbor (kNN), to determine which of the three performs best. We also compare SVM with SVMfs (SVM with feature selection based on information gain) in terms of classification accuracy, precision, recall, and F-measure.
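
The abstract does not spell out how information gain is computed or how features are represented, so the following is only a minimal sketch of IG-based feature selection under stated assumptions: each document is treated as a set of terms (binary presence or absence), IG is taken as the reduction in class entropy from knowing whether a term occurs, and the toy corpus, labels, and function names (`entropy`, `information_gain`, `select_top_k`) are hypothetical illustrations rather than the authors' code or data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """IG(term) = H(C) - [P(t) * H(C | t) + P(not t) * H(C | not t)].

    Assumption: a document is a set of terms, so a term is either
    present or absent (term frequency is ignored).
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    total = len(labels)
    conditional = 0.0
    for subset in (with_term, without_term):
        if subset:  # skip an empty split to avoid dividing by zero
            conditional += (len(subset) / total) * entropy(subset)
    return entropy(labels) - conditional


def select_top_k(docs, labels, k):
    """Return the k terms with the highest information gain."""
    vocab = sorted(set().union(*docs))
    ranked = sorted(vocab,
                    key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:k]


# Hypothetical toy corpus: two classes, four documents.
docs = [
    {"goal", "team", "match"},
    {"team", "score", "goal"},
    {"cpu", "memory", "team"},
    {"cpu", "disk", "memory"},
]
labels = ["sport", "sport", "tech", "tech"]

top_terms = select_top_k(docs, labels, 3)
print(top_terms)
```

In this toy corpus, terms such as "goal" and "cpu" separate the two classes perfectly and receive the maximum gain of 1 bit, whereas "team" occurs in both classes and scores much lower. In a pipeline like the one the abstract describes, only the top-ranked terms would be kept as features before training a classifier such as SVM, NB, or kNN.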

Last modified: 2018-09-15 20:24:33