Efficient ensemble machine learning techniques for early prediction of diphtheria diseases based on clinical data

Journal: International Journal of Advanced Technology and Engineering Exploration (IJATEE) (Vol.9, No. 90)

Publication Date:

Authors : ; ;

Page : 583-603

Keywords : Ensemble machine learning; Baseline classifiers; Diphtheria disease; SMOTE+ENN; Multiclass classification

Abstract

Diphtheria is a worldwide concern, particularly in Yemen. Early detection is important for reducing diphtheria deaths, but a proper diagnosis takes time because it requires several clinical examinations. This motivates the development of a new diagnostic system. As machine learning (ML) techniques continue to be proposed, ensemble learning techniques have been introduced into healthcare applications. In this study, efficient ensemble ML techniques (EEMLT) are used to develop prediction models for diphtheria disease. Five ensemble ML models were used: random forest classifier (RFC), gradient boosting classifier (GBC), extra tree classifier (ETC), eXtreme gradient boosting (XGB), and light gradient boosting machine (LightGBM). Five popular baseline classifiers served as benchmarks: logistic regression (LR), k-nearest neighbors (KNN), support vector classifier (SVC), decision tree classifier (DTC), and multilayer perceptron (MLP). All ensemble and baseline classifiers were trained and tested on the dataset using both 10-fold cross-validation (CV) and holdout validation. All models were evaluated on a test set using accuracy, F1-score, recall, precision, and area under the curve (AUC). According to the results of this study, the ETC model achieved high accuracy: 98.92% in holdout validation and 99.2% in 10-fold CV. Finally, the experimental results show that the ensemble classifiers outperformed the baseline classifiers. We believe that the proposed diphtheria prediction system will help doctors accurately predict diphtheria disease.
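The evaluation protocol described in the abstract (10-fold CV plus accuracy, precision, recall, and F1-score on a multiclass problem) can be illustrated with a small sketch. This is not the authors' code: the abstract does not say which toolkit or averaging scheme was used, so the function names `k_fold_indices` and `macro_metrics` are hypothetical, the folds are contiguous and unshuffled, and macro averaging is an assumption. AUC is omitted because it needs predicted probabilities rather than labels.

```python
from collections import Counter


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    Assumption: no shuffling or stratification; a real study on
    imbalanced clinical data would normally want stratified folds.
    """
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Assumption: macro averaging (unweighted mean over classes); the
    abstract does not state which averaging was used.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the sample was not p
            fn[t] += 1  # sample of class t was missed

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n_classes = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1s) / n_classes,
    }
```

In use, each `(train_idx, test_idx)` pair would select the rows for fitting and scoring one model, and the per-fold results from `macro_metrics` would be averaged to give the 10-fold CV figure. A holdout evaluation is the same scoring step applied once to a single train/test split.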

Last modified: 2022-07-01 21:56:47