Study Finds Machine Learning Effective in Evaluating CKD Prognosis


A retrospective study found that 3 machine learning models demonstrated equivalent predictability and greater sensitivity in the prognosis of chronic kidney disease (CKD) compared with the Kidney Failure Risk Equation.

A study published in Scientific Reports found that 3 machine learning models (logistic regression, naïve Bayes, and random forest) performed comparably to the Kidney Failure Risk Equation (KFRE) in evaluating the prognosis of chronic kidney disease (CKD).

The data for this study were from a longitudinal cohort previously enrolled in an observational study. Patients 18 years and older with stable kidney function for at least 3 months were included in this study. Exclusion criteria were a history of kidney replacement therapy, any other existing condition deemed physically unstable, or any preexisting malignancy. Patients were recruited between April 2006 and March 2008.

Demographic information collected included age, gender, education level, marital status, and insurance status. Medical history included smoking status, history of alcohol consumption, and presence of diabetes, cardiovascular disease, and hypertension.

The study included 748 patients, and the mean (SD) follow-up was 6.3 (2.3) years. Most patients had stage 2 (24.5%) or stage 3 (47.1%) CKD at baseline, and end-stage kidney disease (ESKD) was found in 9.4%, all of whom eventually received kidney replacement therapy.

Missing data were replaced by imputed values, and there were no significant differences between the imputed datasets and the original dataset. The random forest algorithm had the best overall performance, although its results overlapped with those of the other models.

Of the 3 machine learning models, the logistic regression model had the highest mean sensitivity (0.79; 95% CI, 0.73-0.85), the naïve Bayes model had the highest mean accuracy (0.86; 95% CI, 0.85-0.87) and specificity (0.87; 95% CI, 0.86-0.89), and the random forest model had the highest mean area under the curve (AUC) score (0.81; 95% CI, 0.78-0.83).
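For readers less familiar with the AUC, it can be read as the probability that a model assigns a higher risk score to a randomly chosen patient who reaches ESKD than to a randomly chosen patient who does not. The following is a minimal sketch of that pairwise definition in Python; the function name, labels, and scores are hypothetical and are not taken from the study.

```python
def auc_from_scores(labels, scores):
    """Estimate AUC as the share of (positive, negative) pairs in which
    the positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = progressed to ESKD, 0 = did not.
example_labels = [1, 1, 1, 0, 0, 0, 0]
example_scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
example_auc = auc_from_scores(example_labels, example_scores)
print(f"AUC: {example_auc:.2f}")
```

An AUC of 0.5 corresponds to ranking patients no better than chance, and 1.0 to ranking every patient who progressed above every patient who did not.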

The KFRE had a comparable AUC score and the highest mean accuracy (0.90; 95% CI, 0.90-0.91), specificity (0.95), and precision (0.50; 95% CI, 0.45-0.55) of all the models. However, it also had the lowest mean sensitivity (0.47; 95% CI, 0.42-0.52).
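A model can combine high accuracy with low sensitivity when the outcome is uncommon, as ESKD was in this cohort. The sketch below computes the 4 reported metrics from confusion-matrix counts. The counts are hypothetical, chosen only to approximate a 748-patient cohort in which about 9.4% reached ESKD; they are not the study's data, and the study reported means across repeated evaluations rather than a single confusion matrix.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return common binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # share of all patients classified correctly
        "sensitivity": tp / (tp + fn),      # share of ESKD cases that were flagged
        "specificity": tn / (tn + fp),      # share of non-ESKD cases that were cleared
        "precision": tp / (tp + fp),        # share of flagged patients who had ESKD
    }


# Hypothetical counts (NOT from the study): 70 ESKD cases, 678 non-cases.
kfre_like = classification_metrics(tp=33, fp=35, tn=643, fn=37)
for name, value in kfre_like.items():
    print(f"{name}: {value:.2f}")
```

With these illustrative counts, accuracy is about 0.90 even though fewer than half of the ESKD cases are detected, because the large non-ESKD group dominates the accuracy calculation.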

There were some limitations to this study. The cohort consisted of fewer than 1000 participants, and ESKD occurred in only a small proportion of them, which may have affected the performance of the models. In addition, the study evaluated ESKD prognosis without urine variables because urine test data were lacking when the cohort was established.

The researchers concluded that machine learning could effectively evaluate the prognosis of CKD, with logistic regression, naïve Bayes, and random forest showing predictability comparable to that of the KFRE. The machine learning models also had greater sensitivity scores, which could help with future patient screenings.


Bai Q, Su C, Tang W, Li Y. Machine learning to predict end stage kidney disease in chronic kidney disease. Sci Rep. Published online May 19, 2022. doi:10.1038/s41598-022-12316-z

© 2023 MJH Life Sciences
All rights reserved.