Comparative Analysis of Machine Learning Models for Data Driven Chronic Kidney Disease Prediction

Journal: International Journal of Trend in Scientific Research and Development (Vol.10, No. 1)

Publication Date: 2026-02-10

Authors : Deepti Rani Pattanaik; Monalisha Pattnaik

Page : 1092-1106

Keywords : Chronic Kidney Disease; Machine Learning; XGBoost; Random Forest; SHAP; Feature Selection; Data Balancing; Predictive Modeling; Healthcare Analytics

Abstract

Timely detection of Chronic Kidney Disease (CKD) is important for improving patient outcomes and reducing the burden of end-stage renal failure. Machine learning (ML) techniques offer promising tools for early and accurate prediction of CKD by leveraging clinical, demographic, and lifestyle data. This research aimed to identify the most relevant clinical, demographic, and lifestyle indicators of CKD, to assess the predictive performance of several machine learning models, and to enhance model interpretability through explainable AI techniques. The study used a dataset balanced with the Random Over Sampling Examples (ROSE) technique, which addressed the inherent class imbalance between CKD and non-CKD cases. Feature selection followed a hybrid approach combining Recursive Feature Elimination (RFE) and Random Forest importance metrics to identify the most influential predictors. Five machine learning models were trained and evaluated: Logistic Regression, Random Forest, Support Vector Machine (SVM), Decision Tree, and XGBoost. Performance was assessed using Accuracy, Sensitivity, Specificity, the Kappa statistic, and the Area Under the Receiver Operating Characteristic Curve (AUC). Model interpretability was further supported by Shapley Additive Explanations (SHAP) analysis. Among the models tested, XGBoost attained the highest testing accuracy (97.79%) and AUC (0.9979), followed closely by Random Forest. SHAP analysis revealed that clinical markers such as Serum Creatinine, Glomerular Filtration Rate (GFR), Protein in Urine, and Fasting Blood Sugar were the most significant contributors to model predictions. Interpretability assessments confirmed that model outputs were consistent with clinical knowledge of CKD risk factors.
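
The evaluation measures named above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the encoding of CKD as label 1 and non-CKD as label 0, and the toy data in the usage note are all assumptions made for illustration. The sketch also assumes both classes are present in the labels, since otherwise some ratios are undefined.

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), assuming label 1 = CKD and label 0 = non-CKD."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Accuracy, sensitivity, specificity, and Cohen's kappa for binary labels."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # share of true CKD cases that were detected
    specificity = tn / (tn + fp)  # share of non-CKD cases correctly ruled out
    # Kappa compares observed agreement with the agreement expected by chance,
    # where chance agreement is derived from the row and column totals.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_expected) / (1 - p_expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random CKD case outscores a random non-CKD case."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Each positive/negative pair counts 1 if ranked correctly and 0.5 if tied.
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

As a usage example on made-up data, `classification_metrics([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0, 0, 1])` gives accuracy, sensitivity, and specificity of 0.75 and a kappa of 0.5. Kappa is lower than accuracy here because half of the agreement would be expected by chance on a balanced sample.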
Machine learning models, particularly ensemble methods such as XGBoost and Random Forest, can reliably predict chronic kidney disease when combined with effective feature selection and data balancing approaches. Incorporating model interpretability techniques such as SHAP values ensures transparency and fosters trust in predictive analytics for clinical applications. To improve early CKD detection and management, future research should integrate these models with clinical decision support systems and perform external validation.

Deepti Rani Pattanaik | Monalisha Pattnaik, "Comparative Analysis of Machine Learning Models for Data-Driven Chronic Kidney Disease Prediction", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-10 | Issue-1, February 2026. URL: https://www.ijtsrd.com/papers/ijtsrd100179.pdf Paper URL: https://www.ijtsrd.com/other-scientific-research-area/other/100179/comparative-analysis-of-machine-learning-models-for-datadriven-chronic-kidney-disease-prediction/deepti-rani-pattanaik
