Diagnosis Rule Extraction from Patient Data for Chronic Kidney Disease Using Machine Learning
This study applied a machine learning algorithm to actual patient data to extract decision-making rules for diagnosing chronic kidney disease. The data set comprises 400 patients, 250 of whom are positive for chronic kidney disease, each described by a number of health-related attributes or indicators. The C4.5 decision tree algorithm was applied to these data to formulate a set of diagnosis rules. Using 3-fold cross-validation, C4.5 achieved 98.25% prediction accuracy, correctly classifying 393 of the 400 instances and misclassifying 7. The extracted rule set identified serum creatinine level as the primary indicator for the presence of disease, highlighting the need to monitor it in patients. Secondary indicators were pedal edema, hemoglobin, diabetes mellitus, and specific gravity. The rule set provides a preliminary screening tool that supports conclusive diagnosis of chronic kidney disease by nephrologists, following timely referral by primary care providers or decision-making algorithms.
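The reported figures can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the study's code: it recomputes the accuracy from the stated counts, adds the accuracy of a trivial classifier that labels every patient as positive (a comparison the abstract does not report), and shows how 400 instances would divide into 3 folds under the assumption that the folds are as equal in size as possible. The function names are hypothetical.

```python
def accuracy(correct: int, incorrect: int) -> float:
    """Fraction of instances classified correctly."""
    return correct / (correct + incorrect)


def fold_sizes(n_instances: int, k: int) -> list[int]:
    """Sizes of k folds that are as equal as possible (an assumption;
    the abstract does not say how the folds were formed)."""
    base, extra = divmod(n_instances, k)
    return [base + 1 if i < extra else base for i in range(k)]


# Counts stated in the abstract.
TOTAL_PATIENTS = 400
POSITIVE_PATIENTS = 250
CORRECT, INCORRECT = 393, 7

reported_accuracy = accuracy(CORRECT, INCORRECT)

# Baseline for comparison: always predict "chronic kidney disease".
majority_baseline = POSITIVE_PATIENTS / TOTAL_PATIENTS

sizes = fold_sizes(TOTAL_PATIENTS, k=3)

print(f"reported accuracy: {reported_accuracy:.2%}")
print(f"majority-class baseline: {majority_baseline:.2%}")
print(f"fold sizes: {sizes}")
```

Under these assumptions, each cross-validation round would train on roughly two thirds of the patients and test on the remaining third, and the reported accuracy can be read against the 62.5% obtainable without using any attributes.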