Predicting Students’ Employability using Support Vector Machine: A SMOTE-Optimized Machine Learning System
Journal: International Journal of Emerging Trends in Engineering Research (IJETER), Vol. 8, No. 5
Publication Date: 2019-10-15
Authors: Cherry D. Casuat; Enrique D. Festijo; Alvin Sarraga Alon
Pages: 2101-2106
Keywords: Employability prediction system; Decision trees; K-nearest neighbour; Logistic regression; Naïve Bayes; Random Forest; SMOTE; Support Vector Machine
Abstract
The graduates of every institution reflect the skills and competencies that students acquire through the education the institution offers and that companies require. Employability of graduates has become one of the performance indicators for higher educational institutions (HEIs); it is therefore important to accentuate the employability of graduates, which is the reason this research was carried out. This study involved twenty-seven thousand (27,000) data points comprising three thousand (3,000) observations and twelve (12) features drawn from students' mock job interview (MJI) evaluation results, on-the-job training (OJT) performance ratings, and general point averages (GPA) of students enrolled in the on-the-job training course from SY 2015 to SY 2018. To address the issue of the imbalanced dataset, the researchers applied the synthetic minority over-sampling technique (SMOTE). Six learning algorithms were then used with SMOTE, namely Decision Trees (DT), Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Logistic Regression (LR), and Naïve Bayes (NB), to understand how students get employed. The six algorithms were evaluated through a performance matrix of accuracy, precision, recall, F1-score, and support measures. During the experiments, the Support Vector Machine (SVM) obtained 91.22% accuracy, which was significantly better than the other learning algorithms (DT, 85%; RF, 84%). The learning curve produced during the experiment shows the training error above the validation error, while the validation curve shows the testing output, with gamma performing best in the range of 10 to 100. This indicates that the model produced with SVM was neither under-fitted nor over-fitted. These promising results motivate the researchers to enhance the process and to validate the produced predictive model in further studies.
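The workflow described in the abstract (SMOTE oversampling followed by training an SVM and reporting accuracy, precision, recall, F1-score, and support) can be illustrated with a minimal sketch. The snippet below is not the authors' code: it assumes scikit-learn and imbalanced-learn, substitutes a synthetic imbalanced dataset for the 3,000-observation, 12-feature student data, and uses illustrative SVM hyperparameters (RBF kernel, C=10) that are not reported in the abstract.

```python
# Minimal sketch (not the authors' implementation): SMOTE + SVM pipeline
# evaluated with accuracy and a precision/recall/F1/support report.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline

# Hypothetical stand-in for the student dataset: 3,000 samples, 12 features,
# with an imbalanced (~80/20) class distribution.
X, y = make_classification(n_samples=3000, n_features=12,
                           weights=[0.8, 0.2], random_state=42)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# SMOTE sits inside the pipeline so synthetic samples are generated only from
# training data and never leak into the held-out test set.
model = Pipeline([
    ("scale", StandardScaler()),
    ("smote", SMOTE(random_state=42)),
    ("svm", SVC(kernel="rbf", C=10, gamma="scale")),  # illustrative settings
])

model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))  # precision, recall, F1, support
```

The other classifiers mentioned in the abstract (DT, RF, KNN, LR, NB) could be compared under the same setup by swapping the final pipeline step while keeping the SMOTE stage and evaluation metrics fixed.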