
Predicting Students’ Employability using Support Vector Machine: A SMOTE-Optimized Machine Learning System

Journal: International Journal of Emerging Trends in Engineering Research (IJETER) (Vol.8, No. 5)

Publication Date:

Authors :

Page : 2101-2106

Keywords : Employability prediction system; Decision trees; K-nearest neighbour; Logistic regression; Naïve Bayes; Random Forest; SMOTE; Support Vector Machine;

Source : Download | Find it from : Google Scholar

Abstract

The graduates of every institution reflect the skills and competencies that students acquire through the education the institution offers and that companies look for. The employability of graduates has become one of the performance indicators for higher education institutions (HEIs), so it is important to accentuate graduate employability; this is the motivation for the present research. This study used twenty-seven thousand (27,000) data points consisting of three thousand (3,000) observations and twelve (12) features drawn from students' mock job interview (MJI) evaluation results, on-the-job training (OJT) performance ratings, and general point averages (GPA) of students enrolled in the on-the-job training course from SY 2015 to SY 2018. To address the imbalance in the dataset, where one class forms a minority, the synthetic minority over-sampling technique (SMOTE) was applied. Six learning algorithms were then combined with SMOTE to model how students get employed: Decision Trees (DT), Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Logistic Regression (LR), and Naïve Bayes (NB). The six algorithms were evaluated using accuracy, precision, recall, f1-score, and support measures. In the experiments, the Support Vector Machine (SVM) obtained 91.22% accuracy, which was significantly better than the other learning algorithms (DT 85%, RF 84%). The learning curve produced during the experiment shows the training error lying above the validation error, while the validation curve shows the testing output, with gamma performing best in the range of 10 to 100. This indicates that the SVM model was neither under-fitted nor over-fitted. These promising results motivate the researchers to enhance the process and to validate the produced predictive model in further studies.
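The workflow described in the abstract (SMOTE oversampling followed by an SVM classifier, evaluated with accuracy, precision, recall, f1-score, and support, plus a validation curve over gamma) can be sketched with scikit-learn and imbalanced-learn. This is a minimal illustration, not the authors' code: the synthetic dataset, class weights, and parameter values below are placeholders standing in for the real MJI/OJT/GPA data and the settings used in the study.

```python
# Hedged sketch of a SMOTE-balanced SVM pipeline with the metrics named in the abstract.
# The synthetic data is a placeholder for the actual student records.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, validation_curve
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report, accuracy_score
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline  # ensures SMOTE is applied to training folds only

# Placeholder dataset: 3,000 observations, 12 features, imbalanced classes
X, y = make_classification(n_samples=3000, n_features=12,
                           weights=[0.8, 0.2], random_state=42)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("smote", SMOTE(random_state=42)),    # oversample the minority class
    ("svm", SVC(kernel="rbf", gamma=10)), # gamma value is illustrative only
])

pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))  # precision, recall, f1-score, support

# Validation curve over gamma, analogous to the abstract's check of the 10-100 range
param_range = [0.1, 1, 10, 50, 100]
train_scores, valid_scores = validation_curve(
    pipeline, X_train, y_train,
    param_name="svm__gamma", param_range=param_range, cv=5)
```

Using the imbalanced-learn Pipeline (rather than oversampling the whole dataset up front) keeps SMOTE inside the cross-validation loop, so the validation and test scores are not inflated by synthetic samples leaking into the evaluation folds.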

Last modified: 2020-06-17 13:24:49