Machine Learning to Early Prediction of Chronic Kidney Disease: Using Imbalanced and Limited Size Data Sets (Preprint)

2021 ◽  
Author(s):  
Andressa C. M. da Silveira ◽  
Álvaro Sobrinho ◽  
Leandro Dias da Silva ◽  
Evandro de Barros Costa ◽  
Angelo Perkusich ◽  
...  

BACKGROUND Chronic kidney disease (CKD) is a worldwide public health problem, usually diagnosed in the late stages of the disease, increasing public health costs and mortality rates. Late diagnosis is even more critical in low- and middle-income countries due to high poverty levels, many hard-to-reach locations, and sometimes absent or precarious primary care. Therefore, to alleviate these issues, investment in early prediction is necessary. OBJECTIVE The purpose of this study is to assist the early prediction of CKD, addressing problems related to imbalanced and limited-size data sets. METHODS To address our multi-class problem (low risk, medium risk, high risk, and very high risk), we used data from medical records of 60 Brazilians with or without a diagnosis of CKD, containing the following attributes: hypertension, diabetes mellitus, creatinine, urea, albuminuria, age, gender, and glomerular filtration rate. We used two approaches for oversampling: (1) manual augmentation with data validated by an experienced nephrologist and (2) automated augmentation with the synthetic minority oversampling technique (SMOTE), borderline-SMOTE, and borderline-SMOTE support vector machine. We implemented classification models based on these data sets and the following algorithms: decision tree (DT), random forest, and multi-class AdaBoosted DTs. We also applied the overall local accuracy and local class accuracy methods for dynamic classifier selection, and the k-nearest oracles-union, k-nearest oracles-eliminate, and META-DES methods for dynamic ensemble selection. We analyzed the models' performance using hold-out validation, multiple stratified cross-validation (CV), and nested CV. We also computed the importance of features using feature selection methods. RESULTS The best performance was achieved using the DT and multi-class AdaBoosted DTs classification models, oversampled with SMOTE, and validated with the multiple stratified CV and nested CV methods. The DT model presented the highest accuracy score (98.99%) for both multiple stratified CV and nested CV, followed by multi-class AdaBoosted DTs (97.99% and 98.00%, respectively). CONCLUSIONS The SMOTE and multiple stratified CV or nested CV methods provided reliable results for such an imbalanced and limited-size data set. During CKD monitoring, based on the DT model, assuming a previous diabetes mellitus (DM) evaluation, the user only needs to perform two blood tests: creatinine and urea. Thus, the DT model can assist in designing systems for the early prediction of CKD, providing easy interpretation and cost reduction.
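
The core idea behind SMOTE, as named in the abstract above, can be sketched in a few lines: each synthetic sample is placed at a random position on the segment between a minority-class sample and one of its nearest minority-class neighbours. This is only an illustrative sketch, not the authors' implementation; the function name `smote_sketch`, its parameters, and the toy feature values are assumptions, and the borderline variants are not covered.

```python
import random

def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Toy minority class: [creatinine mg/dL, urea mg/dL]. Values are invented
# for illustration and are NOT taken from the study.
minority = [[3.1, 90.0], [2.8, 85.0], [3.5, 110.0], [4.0, 120.0]]
new_points = smote_sketch(minority, n_new=6)
```

Because every synthetic point is a convex combination of two real samples, it stays inside the range already spanned by the minority class. To avoid leaking synthetic neighbours of test samples into training, oversampling of this kind is usually applied to the training folds only.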

PLoS ONE ◽  
2021 ◽  
Vol 16 (10) ◽  
pp. e0258494
Author(s):  
Nipun Shrestha ◽  
Sanju Gautam ◽  
Shiva Raj Mishra ◽  
Salim S. Virani ◽  
Raja Ram Dhungana

Background Chronic kidney disease (CKD) is an emerging public health issue globally. Prevalence estimates of CKD in South Asia are, however, limited. This study aimed to examine the prevalence of CKD among the general and high-risk populations in South Asia. Methods We conducted a systematic review and meta-analysis of population-level prevalence studies in South Asia (Afghanistan, Bangladesh, Bhutan, Maldives, Nepal, India, Pakistan, and Sri Lanka). Three databases, namely PubMed, Scopus, and Web of Science, were systematically searched for published reports of kidney disease in South Asia up to 28 October 2020. A random-effects model was used to compute the pooled prevalence. Results Of the 8749 identified studies, a total of 24 studies were included in the review. The pooled prevalence of CKD was 14% (95% CI 11–18%) among the general population, 15% (95% CI 11–20%) among adult males, and 13% (95% CI 10–17%) among adult females. The prevalence of CKD was 27% (95% CI 20–35%) in adults with hypertension, 31% (95% CI 22–41%) in adults with diabetes, and 14% (95% CI 10–19%) in adults who were overweight/obese. We found substantial heterogeneity across the included studies in the pooled estimates for CKD prevalence in both general and high-risk populations. The prevalence of CKD of unknown origin in the endemic population was 8% (95% CI 3–16%). Conclusion Our study reaffirms previous reports that CKD represents a serious public health challenge in South Asia, with the disease prevalent among 1 in 7 adults in South Asian countries.
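
A random-effects pooled prevalence of the kind reported above can be illustrated with the DerSimonian-Laird estimator. The sketch below pools raw proportions for simplicity; the abstract does not say which estimator or transformation the authors used, so that choice, the function name `pooled_prevalence_dl`, and the toy study counts are all assumptions.

```python
import math

def pooled_prevalence_dl(studies, z=1.96):
    """studies: list of (cases, sample_size) pairs.
    Returns (pooled, ci_low, ci_high, i2_percent) using raw proportions
    and the DerSimonian-Laird between-study variance."""
    p = [cases / n for cases, n in studies]
    var = [pi * (1 - pi) / n for pi, (_, n) in zip(p, studies)]
    w = [1 / v for v in var]                        # fixed-effect weights
    fixed = sum(wi * pi for wi, pi in zip(w, p)) / sum(w)
    q = sum(wi * (pi - fixed) ** 2 for wi, pi in zip(w, p))  # Cochran's Q
    df = len(studies) - 1
    scale = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / scale)               # between-study variance
    w_re = [1 / (v + tau2) for v in var]            # random-effects weights
    pooled = sum(wi * pi for wi, pi in zip(w_re, p)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, pooled - z * se, pooled + z * se, i2

# Invented study counts, NOT the 24 studies included in the review.
toy_studies = [(120, 1000), (90, 500), (200, 1500), (60, 600)]
pooled, ci_low, ci_high, i2 = pooled_prevalence_dl(toy_studies)
```

The I² value returned alongside the pooled estimate is one common way to quantify the kind of between-study heterogeneity the abstract describes as substantial. Studies with a prevalence of exactly 0 or 1 would need a transformation or continuity correction, which this sketch omits.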


2021 ◽  
Vol 20 (1) ◽  
Author(s):  
Demetria Hubbard ◽  
Lisandro D. Colantonio ◽  
Robert S. Rosenson ◽  
Todd M. Brown ◽  
Elizabeth A. Jackson ◽  
...  

Abstract Background Adults who have experienced multiple cardiovascular disease (CVD) events have a very high risk for additional events. Diabetes and chronic kidney disease (CKD) are each associated with an increased risk for recurrent CVD events following a myocardial infarction (MI). Methods We compared the risk for recurrent CVD events among US adults with health insurance who were hospitalized for an MI between 2014 and 2017 and had (1) CVD prior to their MI but were free from diabetes or CKD (prior CVD), and those without CVD prior to their MI who had (2) diabetes only, (3) CKD only and (4) both diabetes and CKD. We followed patients from hospital discharge through December 31, 2018 for recurrent CVD events including coronary, stroke, and peripheral artery events. Results Among 162,730 patients, 55.2% had prior CVD, and 28.3%, 8.3%, and 8.2% had diabetes only, CKD only, and both diabetes and CKD, respectively. The rate for recurrent CVD events per 1000 person-years was 135 among patients with prior CVD and 110, 124 and 171 among those with diabetes only, CKD only and both diabetes and CKD, respectively. Compared to patients with prior CVD, the multivariable-adjusted hazard ratio for recurrent CVD events was 0.92 (95%CI 0.90–0.95), 0.89 (95%CI: 0.85–0.93), and 1.18 (95%CI: 1.14–1.22) among those with diabetes only, CKD only, and both diabetes and CKD, respectively. Conclusion Following MI, adults with both diabetes and CKD had a higher risk for recurrent CVD events compared to those with prior CVD without diabetes or CKD.


Nutrients ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 1205
Author(s):  
Yoshitaka Isaka

Multiple factors, such as anorexia, activation of the renin-angiotensin system, inflammation, and metabolic acidosis, contribute to malnutrition in chronic kidney disease (CKD) patients. Most of these factors worsen as CKD progresses. Protein restriction, used as a treatment for CKD, can reduce the risk of CKD progression but may worsen sarcopenia, a syndrome characterized by a progressive and systemic loss of muscle mass and strength. The rate of concomitant sarcopenia is higher in CKD patients than in the general population. Sarcopenia is also associated with mortality risk in CKD patients. Thus, it is important to determine whether protein restriction should be continued or loosened in CKD patients with sarcopenia. We may prioritize protein restriction in CKD patients with a high risk of end-stage kidney disease (ESKD), classified as stage G4 to G5, but may loosen protein restriction in CKD stage G3 patients at low risk of ESKD, with proteinuria <0.5 g/day and a rate of eGFR decline <3.0 mL/min/1.73 m²/year. However, the effect of increasing protein intake alone without exercise therapy may be limited in CKD patients with sarcopenia. The combination of exercise therapy and increased protein intake is effective in improving muscle mass and strength in CKD patients with sarcopenia. When loosening protein restriction, it is safe to avoid protein intake of more than 1.5 g/kgBW/day. In CKD patients with a high risk of ESKD, 0.8 g/kgBW/day may be a critical point of protein intake.


2021 ◽  
Vol 36 (Supplement_1) ◽  
Author(s):  
Takeshi Hasegawa ◽  
Hiroki Nihiwaki ◽  
Erika Ota ◽  
William Levack ◽  
Hisashi Noma

Abstract Background and Aims Patients with chronic kidney disease (CKD) undergoing dialysis are at a particularly high risk of cardiovascular mortality and morbidity. This systematic review and meta-analysis aimed to evaluate the benefits and harms of aldosterone antagonists, both non-selective (spironolactone) and selective (eplerenone), in comparison to control (placebo or standard care) in patients with CKD requiring haemodialysis or peritoneal dialysis. Method We searched the Cochrane Kidney and Transplant Register of Studies up to 29 July 2019 using search terms relevant to this review. Studies in the Register are identified through searches of CENTRAL, MEDLINE, and EMBASE, conference proceedings, the International Clinical Trials Register Search Portal and ClinicalTrials.gov. We included individual and cluster randomised controlled trials (RCTs), cross-over trials, and quasi-RCTs that compared aldosterone antagonists with placebo or standard care in patients with CKD requiring dialysis. We used a random-effects model meta-analysis to perform a quantitative synthesis of the data. We used the I² statistic to measure heterogeneity among the trials in each analysis. We indicated summary estimates as a risk ratio (RR) for dichotomous outcomes with their 95% confidence interval (CI). We assessed the certainty of the evidence for each of the main outcomes using the GRADE (Grades of Recommendation, Assessment, Development, and Evaluation) approach. Results We included 16 trials (14 parallel RCTs and two cross-over trials) involving a total of 1,446 patients. Among included studies, 13 trials compared spironolactone to placebo or standard care and one trial compared eplerenone to a placebo. Most studies had an unclear or high risk of bias. Compared to control, aldosterone antagonists reduced the risk of all-cause death for patients with CKD requiring dialysis (9 trials, 1,119 patients: RR 0.45, 95% CI 0.30 to 0.67; moderate certainty of evidence). Aldosterone antagonists also decreased the risk of death due to cardiovascular disease (6 trials, 908 patients: RR 0.37, 95% CI 0.22 to 0.64; moderate certainty of evidence) and cardiovascular and cerebrovascular morbidity (3 trials, 328 patients: RR 0.38, 95% CI 0.18 to 0.76; moderate certainty of evidence). While aldosterone antagonists had an apparent increased risk of gynaecomastia compared with control (4 trials, 768 patients: RR 5.95, 95% CI 1.93 to 18.3; moderate certainty of evidence), the elevated risk of hyperkalaemia due to aldosterone antagonists was uncertain (9 trials, 981 patients: RR 1.41, 95% CI 0.72 to 2.78; low certainty of evidence). Conclusion Based on moderate certainty of the evidence, aldosterone antagonists could reduce the risk of all-cause and cardiovascular death and morbidity due to cardiovascular and cerebrovascular disease but increase the risk of gynaecomastia in patients with CKD requiring dialysis.
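
For readers unfamiliar with the summary measure used above, a risk ratio and its 95% confidence interval for a single two-arm trial can be computed on the log scale as sketched below. The function name `risk_ratio` and the event counts are hypothetical; the pooled estimates in the abstract combine several trials under a random-effects model, which this single-trial sketch does not reproduce.

```python
import math

def risk_ratio(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Risk ratio of treatment vs control with a log-scale (Wald) CI."""
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    # Standard error of log(RR) for two independent binomial samples.
    se_log = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    low = math.exp(math.log(rr) - z * se_log)
    high = math.exp(math.log(rr) + z * se_log)
    return rr, low, high

# Invented counts: 20/200 deaths on treatment vs 40/200 on control.
rr, rr_low, rr_high = risk_ratio(20, 200, 40, 200)
```

An interval that lies entirely below 1, as in this toy example, is what supports a statement that treatment reduced the risk; an interval spanning 1, like the hyperkalaemia result in the abstract, leaves the direction of effect uncertain.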


2022 ◽  
pp. ASN.2021040538
Author(s):  
Arthur M. Lee ◽  
Jian Hu ◽  
Yunwen Xu ◽  
Alison G. Abraham ◽  
Rui Xiao ◽  
...  

Background Untargeted plasma metabolomic profiling combined with machine learning (ML) may lead to discovery of metabolic profiles that inform our understanding of pediatric CKD causes. We sought to identify metabolomic signatures in pediatric CKD based on diagnosis: FSGS, obstructive uropathy (OU), aplasia/dysplasia/hypoplasia (A/D/H), and reflux nephropathy (RN). Methods Untargeted metabolomic quantification (GC-MS/LC-MS, Metabolon) was performed on plasma from 702 Chronic Kidney Disease in Children study participants (n: FSGS=63, OU=122, A/D/H=109, and RN=86). Lasso regression was used for feature selection, adjusting for clinical covariates. Four methods were then applied to stratify significance: logistic regression, support vector machine, random forest, and extreme gradient boosting. ML training was performed on 80% total cohort subsets and validated on 20% holdout subsets. Important features were selected based on being significant in at least two of the four modeling approaches. We additionally performed pathway enrichment analysis to identify metabolic subpathways associated with CKD cause. Results ML models were evaluated on holdout subsets with receiver-operator and precision-recall area-under-the-curve, F1 score, and Matthews correlation coefficient. ML models outperformed no-skill prediction. Metabolomic profiles were identified based on cause. FSGS was associated with the sphingomyelin-ceramide axis. FSGS was also associated with individual plasmalogen metabolites and the subpathway. OU was associated with gut microbiome–derived histidine metabolites. Conclusion ML models identified metabolomic signatures based on CKD cause. Using ML techniques in conjunction with traditional biostatistics, we demonstrated that sphingomyelin-ceramide and plasmalogen dysmetabolism are associated with FSGS and that gut microbiome–derived histidine metabolites are associated with OU.
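
Two of the holdout metrics named above, the F1 score and the Matthews correlation coefficient (MCC), follow directly from a binary confusion matrix. The sketch below assumes a one-vs-rest framing (for example, FSGS versus all other diagnoses); that framing, the function name `f1_and_mcc`, and the counts are assumptions for illustration, not results from the study.

```python
import math

def f1_and_mcc(tp, fp, fn, tn):
    """F1 score and Matthews correlation coefficient from binary counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, mcc

# Invented confusion-matrix counts for a single one-vs-rest classifier.
f1, mcc = f1_and_mcc(tp=40, fp=10, fn=10, tn=40)
```

With these toy counts, precision and recall are both 0.8, so F1 is 0.8, while MCC is 0.6. Unlike F1, MCC also uses the true negatives, which makes it informative when classes are imbalanced, as they are across the four diagnosis groups here.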


2018 ◽  
Vol 48 (6) ◽  
pp. 447-455 ◽  
Author(s):  
Nilka Ríos Burrows ◽  
Joseph A. Vassalotti ◽  
Sharon H. Saydah ◽  
Rebecca Stewart ◽  
Monica Gannon ◽  
...  

Background: Most people with chronic kidney disease (CKD) are not aware of their condition. Objectives: To assess screening criteria in identifying a population with or at high risk for CKD and to determine their level of control of CKD risk factors. Method: CKD Health Evaluation Risk Information Sharing (CHERISH), a demonstration project of the Centers for Disease Control and Prevention, hosted screenings at 2 community locations in each of 4 states. People with diabetes, hypertension, or aged ≥50 years were eligible to participate. In addition to CKD, screening included testing and measures of hemoglobin A1C, blood pressure, and lipids. Results: In this targeted population, among 894 people screened, CKD prevalence was 34%. Of participants with diabetes, 61% had A1C < 7%; of those with hypertension, 23% had blood pressure < 130/80 mm Hg; and of those with high cholesterol, 22% had low-density lipoprotein < 100 mg/dL. Conclusions: Using targeted selection criteria and simple clinical measures, CHERISH successfully identified a population with a high CKD prevalence and with poor control of CKD risk factors. CHERISH may prove helpful to state and local programs in implementing CKD detection programs in their communities.


2020 ◽  
Vol 19 (01) ◽  
pp. 2040015
Author(s):  
Ahmad Alaiad ◽  
Hassan Najadat ◽  
Belal Mohsen ◽  
Khaled Balhaf

Background and objective: Chronic kidney disease (CKD) is a deadly disease that can affect many vital organs in the human body, such as the heart, liver, and lungs. Many individuals might be at an early stage of kidney disease and not have any signs, which might lead to sudden death. Previous research showed that early prediction of CKD is very important in the medical field for physicians’ decision-making and patients’ health and life. To this end, constructing an efficient prediction system for CKD, which is the goal of this paper, often reduces medical errors and overall healthcare cost. Methods: Classification and association rule mining techniques were integrated and utilised to construct an efficient system for predicting and diagnosing CKD and its causes using Weka and SPSS as platform environments. In particular, five classification algorithms, namely, naive Bayes, decision tree, support vector machine, K-nearest neighbour, and JRip, were used to achieve the research goal. In addition, the Apriori algorithm was used to discover strong relationship rules between attributes. The experiments were conducted on a real medical dataset collected from hospitals and patient monitoring systems. Results: The experiments achieved a high accuracy of 98.50% for the K-nearest neighbour (KNN) classifier and 96.00% when using the classifier based on association rules (JRip). Conclusions: We conclude by showing that applying an integrative approach that combines classification algorithms and association rule mining can significantly improve classification accuracy and be more useful for CKD prediction. This research also has several theoretical and practical implications for the medical field and healthcare industry.
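
The "strong relationship rules" that Apriori searches for are ranked by two quantities, support and confidence. The sketch below computes both for a single candidate rule over a handful of records; it does not implement the full Apriori candidate-generation loop, and the function name `rule_metrics`, the attribute labels, and the records are invented for illustration.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    # Records containing the antecedent, and those containing both sides.
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / len(transactions)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence

# Invented patient records, each a set of attributes that are present.
records = [
    {"hypertension", "diabetes", "ckd"},
    {"hypertension", "ckd"},
    {"diabetes"},
    {"hypertension", "diabetes", "ckd"},
    {"hypertension"},
]
support, confidence = rule_metrics(records, {"hypertension"}, {"ckd"})
```

Here the rule "hypertension implies ckd" holds in 3 of 5 records (support 0.6) and in 3 of the 4 records that contain hypertension (confidence 0.75). Apriori keeps only rules whose support and confidence exceed user-chosen thresholds.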


2020 ◽  
pp. BJGP.2020.0871
Author(s):  
Clare Elizabeth MacRae ◽  
Stewart Mercer ◽  
Bruce Guthrie

Background: Many drugs should be avoided or require dose adjustment in chronic kidney disease (CKD). Previous estimates of potentially inappropriate prescribing rates have been based on data on a limited number of drugs and mainly in secondary care settings. Aim: To determine the prevalence of contraindicated and potentially inappropriate primary care prescribing in a complete population of people with CKD. Method: Cross-sectional study of prescribing patterns in a complete geographical population of people with CKD defined using laboratory data. Drugs were organised by British National Formulary advice. Contraindicated (CI) drugs: “avoid”. Potentially high risk (PHR) drugs: “avoid if possible”. Dose inappropriate (DI) drugs: dose exceeded recommended maximums. Results: 28,489 people with CKD were included in the analysis, of whom 70.0% had CKD 3a, 22.4% CKD 3b, 5.9% CKD 4, and 1.5% CKD 5. Of people with CKD stages 3a-5, 3.9% (95% CI 3.7-4.1) were prescribed one or more CI drugs, 24.3% (95% CI 23.8-24.8) one or more PHR drugs, and 15.2% (95% CI 14.8-15.62) one or more DI drugs. CI drugs differed in prevalence by CKD stage, and were most commonly prescribed in CKD stage 4 with a prevalence of 36.0% (95% CI 33.7–38.2). PHR drugs were commonly prescribed in all CKD stages, ranging from 19.4% (95% CI 17.6-21.3) in stage 4 to 25.1% (95% CI 24.5–25.7) in stage 3b. DI drugs were most commonly prescribed in stage 4, 26.4% (95% CI 24.3-28.6). Conclusion: Potentially inappropriate prescribing is common at all stages of CKD. Development and evaluation of interventions to improve prescribing safety in this high-risk population are needed.
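
The width of the confidence intervals reported above can be sanity-checked with a normal-approximation (Wald) interval for a proportion. The abstract does not state which interval method the authors used, so the Wald formula is an assumption, as is the function name `wald_ci`; the inputs (3.9% of 28,489 people) come from the abstract.

```python
import math

def wald_ci(proportion, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    se = math.sqrt(proportion * (1 - proportion) / n)
    return proportion - z * se, proportion + z * se

# Contraindicated-drug prevalence reported in the abstract: 3.9% of 28,489.
low, high = wald_ci(0.039, 28489)
low_pct, high_pct = low * 100, high * 100
```

Rounded to one decimal place, the two bounds come out at about 3.7% and 4.1%, consistent with the interval quoted in the abstract. For very small subgroups or proportions near 0 or 1, a Wilson or exact interval would be a safer choice than this approximation.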

