Towards Achieving Optimal Performance using Stacked Generalization Algorithm: A Case Study of Clinical Diagnosis of Malaria Fever
Journal: The International Arab Journal of Information Technology (Vol. 16, No. 6)
Publication Date: 2019-11-01
Authors : Abiodun Oguntimilehin; Olusola Adetunmbi; Innocent Osho;
Page : 1074-1081
Keywords : Data mining; ensemble learning; stacked generalization; malaria; diagnosis;
Abstract
Data mining has benefited many fields of endeavour, and numerous data mining algorithms are available today. A major problem in mining data is selecting the appropriate algorithm or model for the task at hand, which has led researchers to run many comparison experiments. Stacked Generalization is one method of combining multiple models to obtain better accuracy, and many researchers have found it effective over the years. This study investigates how optimal performance can be achieved using the Stacked Generalization algorithm. Six data mining algorithms (PART, REP Tree, J48, Random Tree, RIDOR and JRIP), arranged in two different orders, were used as base learners for two Meta Learners (Random Forest and NNGE) independently, and the results were compared in terms of classification accuracy. The study shows that the order of arrangement of the base learners and the choice of Meta Learner can affect the accuracy of the Stacked Generalization method. NNGE outperforms Random Forest as a Meta Learner, and its performance is independent of the order of the base learners, unlike that of Random Forest. Malaria fever datasets collected from reputable hospitals in Ado-Ekiti, Ekiti State, Nigeria were purposely used for this study because malaria is one of the major diseases, killing almost a million people yearly in the tropical region of Africa; a more accurate malaria fever diagnosis model is therefore also proposed as a result of this study.
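To make the base-learner and Meta Learner roles concrete, the following is a minimal sketch of stacked generalization in plain Python. It is not the authors' setup: the one-feature "stump" base learners, the lookup-table meta learner, the helper names (`train_stump`, `train_lookup_meta`, `fit_stack`) and the synthetic three-feature data are all assumptions made for illustration, standing in for the six rule and tree learners and the Random Forest and NNGE meta learners named in the abstract.

```python
from collections import Counter
from itertools import product


def train_stump(feature):
    """Return a trainer for a base learner that looks at one feature only
    and predicts the majority label seen for each value of that feature."""
    def trainer(rows, labels):
        by_value = {}
        for row, label in zip(rows, labels):
            by_value.setdefault(row[feature], Counter())[label] += 1
        default = Counter(labels).most_common(1)[0][0]
        table = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        return lambda row: table.get(row[feature], default)
    return trainer


def train_lookup_meta(meta_rows, labels):
    """Meta learner: majority label for each tuple of base predictions."""
    by_key = {}
    for row, label in zip(meta_rows, labels):
        by_key.setdefault(tuple(row), Counter())[label] += 1
    default = Counter(labels).most_common(1)[0][0]
    table = {k: c.most_common(1)[0][0] for k, c in by_key.items()}
    return lambda row: table.get(tuple(row), default)


def fit_stack(rows, labels, base_trainers, meta_trainer, k=3):
    """Train a stacked model.

    The meta learner is trained on out-of-fold base predictions, so it never
    sees a prediction made by a base model that was trained on that same row.
    """
    n = len(rows)
    meta_rows = [None] * n
    for fold in range(k):
        train_idx = [i for i in range(n) if i % k != fold]
        test_idx = [i for i in range(n) if i % k == fold]
        fold_rows = [rows[i] for i in train_idx]
        fold_labels = [labels[i] for i in train_idx]
        models = [trainer(fold_rows, fold_labels) for trainer in base_trainers]
        for i in test_idx:
            meta_rows[i] = [model(rows[i]) for model in models]
    meta = meta_trainer(meta_rows, labels)
    # Refit every base learner on all the data for use at prediction time.
    bases = [trainer(rows, labels) for trainer in base_trainers]
    return lambda row: meta([base(row) for base in bases])


# Synthetic toy data (an assumption, not the malaria dataset):
# the label is "pos" when at least two of the three binary features are 1.
rows = list(product((0, 1), repeat=3)) * 3
labels = ["pos" if sum(row) >= 2 else "neg" for row in rows]

# The order of this list fixes the order of the meta learner's inputs.
base_trainers = [train_stump(0), train_stump(1), train_stump(2)]
predict = fit_stack(rows, labels, base_trainers, train_lookup_meta)

accuracy = sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)
print(accuracy)
```

In this sketch no single stump can represent the two-of-three rule, while the meta learner combining their predictions can. The order of `base_trainers` determines the order of the inputs the meta learner receives, which is the arrangement the study varies; this particular lookup-table meta learner is not expected to be sensitive to that order, so the sketch illustrates the mechanism only, not the reported findings.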
Last modified: 2019-11-11 21:49:25