
Comparative analysis of classification algorithm evaluations to predict secondary school students’ achievement in core and elective subjects

Journal: International Journal of Advanced Technology and Engineering Exploration (IJATEE) (Vol. 9, No. 89)

Publication Date:

Authors:

Pages: 430-445

Keywords: Classification; Prediction; Educational data mining; Students’ performance predictions


Abstract

Many researchers in educational data mining (EDM) have explored machine learning techniques for predicting students' performance. However, the most daunting challenge in classification modelling is selecting the most effective algorithm with the highest accuracy. This study used datasets from two premier Malaysian secondary schools, Maktab Rendah Sains Mara (MRSM) Kuala Berang and Kuala Terengganu. It addresses two key questions: first, which algorithm best predicts secondary students' achievement in core and elective subjects; and second, whether the same features and algorithms can predict academic performance based on students' first-semester achievement. To that end, the study analysed the effectiveness of six classification algorithms: naïve Bayes (NB), random forest (RF), k-nearest neighbour (kNN), support vector machine (SVM), sequential minimal optimization (SMO), and logistic regression (LGR). Each model's prediction accuracy was evaluated using 10-fold cross-validation in order to identify the best model. The results showed that the RF model outperformed the other models in terms of accuracy, precision, recall, and F1-measure, with most algorithms achieving significant accuracy levels on both the core and elective subjects datasets. It is concluded that the prediction of secondary school students' achievement can begin as early as the first semester, using RF for the core subjects and the elective subjects with biology datasets. The accuracy obtained was 96.7% for the core subjects and 97.5% for the elective subjects.
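The evaluation protocol described in the abstract (10-fold cross-validation scored by accuracy, precision, recall, and F1) can be illustrated with a minimal, standard-library-only Python sketch. This is not the authors' implementation: the data are synthetic, the labels "pass" and "fail" are placeholders, macro-averaging of the per-class metrics is an assumption, and a small hand-written kNN stands in for the six algorithms the study compared. All function names are hypothetical.

```python
import random
from collections import Counter


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 (assumed averaging scheme)."""
    labels = sorted(set(y_true))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def cross_validate(X, y, predict, k=10, seed=0):
    """Collect one out-of-fold prediction per sample, then score them all at once."""
    y_pred = [None] * len(X)
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        for i in held_out:
            y_pred[i] = predict(train_X, train_y, X[i])
    return macro_metrics(y, y_pred)


# Synthetic two-class data (NOT the study's data): two Gaussian clusters.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
X += [[rng.gauss(4, 1), rng.gauss(4, 1)] for _ in range(50)]
y = ["fail"] * 50 + ["pass"] * 50

scores = cross_validate(X, y, knn_predict, k=10)
print(scores)
```

To compare several algorithms in the manner the abstract describes, each one would be passed as the `predict` argument with the same `k` and `seed`, so that every model is scored on identical folds.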

Last updated: 2022-05-30 17:07:20