
Comparative analysis of classification algorithm evaluations to predict secondary school students’ achievement in core and elective subjects

Journal: International Journal of Advanced Technology and Engineering Exploration (IJATEE) (Vol. 9, No. 89)

Publication Date:

Authors : ; ;

Page : 430-445

Keywords : Classification; Prediction; Educational data mining; Students’ performance predictions


Abstract

Many researchers in educational data mining (EDM) have explored various machine learning techniques to predict students' performance. However, the most daunting challenge in classification modelling is selecting the most effective algorithm, that is, the one with the highest accuracy. This study used datasets from two Malaysian premier secondary schools, Maktab Rendah Sains Mara (MRSM) Kuala Berang and MRSM Kuala Terengganu. It addresses two key questions: first, which algorithm best predicts secondary students' achievement in core and elective subjects; and second, whether the same features and algorithms can predict academic performance based on students' first-semester achievement. To this end, the study analysed the effectiveness of six classification algorithms: naïve Bayes (NB), random forest (RF), k-nearest neighbour (kNN), support vector machine (SVM), sequential minimal optimization (SMO), and logistic regression (LGR). Each model's prediction accuracy was evaluated using 10-fold cross-validation in order to identify the best model. The results showed that the RF model outperformed the other models in terms of accuracy, precision, recall, and F1-measure, with most algorithms achieving significant accuracy levels on both the core and elective subjects' datasets. It is concluded that the prediction of secondary school students' achievement can begin as early as the first semester, using RF for the core subjects and for the elective subjects with the biology dataset. The accuracies obtained were 96.7% for the core subjects and 97.5% for the elective subjects.
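
For readers unfamiliar with the evaluation protocol named in the abstract, the following is a minimal sketch of k-fold cross-validation scored by accuracy, precision, recall, and F1. It is not the authors' implementation: the function names (`k_fold_indices`, `binary_metrics`, `cross_validate`, `threshold_rule`), the toy data, and the restriction to two classes are all assumptions made for illustration, and the stand-in classifier is a fixed rule rather than any of the six algorithms studied.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n_samples-1 into k (train, test) pairs of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more sample
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for a two-class problem (an assumption here)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def cross_validate(
    X: Sequence,
    y: Sequence[int],
    fit_predict: Callable[[list, list, list], list],
    k: int = 10,
) -> Dict[str, float]:
    """Predict every sample exactly once (while it is held out), then score the pooled predictions."""
    y_pred = [None] * len(y)
    for train, test in k_fold_indices(len(y), k):
        preds = fit_predict([X[i] for i in train], [y[i] for i in train], [X[i] for i in test])
        for i, p in zip(test, preds):
            y_pred[i] = p
    return binary_metrics(y, y_pred)


def threshold_rule(X_train: list, y_train: list, X_test: list) -> list:
    """Hypothetical stand-in classifier: ignores the training data and applies a fixed cut-off."""
    return [1 if x >= 10 else 0 for x in X_test]


if __name__ == "__main__":
    # Toy data (invented for this sketch): one numeric feature, label 1 when the value is at least 10.
    X = list(range(20))
    y = [1 if x >= 10 else 0 for x in X]
    print(cross_validate(X, y, threshold_rule, k=10))
```

In practice the folds would normally be shuffled or stratified by class, and a real learner would be passed in place of `threshold_rule`; both are omitted here to keep the sketch short.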

Last modified: 2022-05-30 17:07:20