A Decision Tree on Data Mining Framework for Recognition of Chronic Kidney Disease

Author(s):  
Ravindra B. V. ◽  
Sriraam N. ◽  
Geetha M.

The term chronic kidney disease (CKD) refers to the malfunction of the kidney and its failure to remove toxins and other waste products from the blood. Typical symptoms of CKD include color change in urine, swelling due to fluid retention in tissue, itching, flank pain, and fatigue. Early recognition of CKD is essential for timely intervention, as it affects more than 10 million people in India. This chapter proposes a decision tree-based data mining framework to distinguish CKD from non-chronic kidney disease (NCKD). Data sets from the open-source UCI repository were used. Unlike earlier reported work, this chapter applies decision rules to data clustered through k-means. Four cluster groups were identified, and automated rules based on a J48 pruned decision tree were formulated. The performance of the proposed framework was evaluated in terms of sensitivity, specificity, precision, and recall. A new quantitative measure, relative performance, and the Matthews correlation coefficient (MCC) were introduced, which confirm the suitability of the proposed framework for recognition of CKD from NCKD.
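The evaluation measures named in this abstract can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' implementation: the function name `binary_metrics` and the counts are hypothetical, and the chapter's "relative performance" measure is omitted because the abstract does not define it. Recall is the same quantity as sensitivity.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion counts.

    tp/fn count CKD cases predicted as CKD/NCKD;
    tn/fp count NCKD cases predicted as NCKD/CKD.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0   # also called recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "mcc": mcc,
    }

# Hypothetical counts for illustration only (not results from the chapter).
metrics = binary_metrics(tp=45, fp=5, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

MCC is often reported alongside sensitivity and specificity because it uses all four cells of the confusion matrix and stays informative when the classes are imbalanced.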

Author(s):  
Avijit Kumar Chaudhuri ◽  
Deepankar Sinha ◽  
Dilip K. Banerjee ◽  
Anirban Das

2021 ◽  
pp. 1098612X2110012
Author(s):  
Jade Renard ◽  
Mathieu R Faucher ◽  
Anaïs Combes ◽  
Didier Concordet ◽  
Brice S Reynolds

Objectives: The aim of this study was to develop an algorithm capable of predicting short- and medium-term survival in cases of intrinsic acute-on-chronic kidney disease (ACKD) in cats. Methods: The medical record database was searched to identify cats hospitalised for acute clinical signs and azotaemia of at least 48 h duration and diagnosed to have underlying chronic kidney disease based on ultrasonographic renal abnormalities or previously documented azotaemia. Cases with postrenal azotaemia, exposure to nephrotoxicants, feline infectious peritonitis or neoplasia were excluded. Clinical variables were combined in a clinical severity score (CSS). Clinicopathological and ultrasonographic variables were also collected. The following variables were tested as inputs in a machine learning system: age, body weight (BW), CSS, identification of small kidneys or nephroliths by ultrasonography, serum creatinine at 48 h (Crea48), spontaneous feeding at 48 h (SpF48) and aetiology. Outputs were outcomes at 7, 30, 90 and 180 days. The machine-learning system was trained to develop decision tree algorithms capable of predicting outputs from inputs. Finally, the diagnostic performance of the algorithms was calculated. Results: Crea48 was the best predictor of survival at 7 days (threshold 1043 µmol/l, sensitivity 0.96, specificity 0.53), 30 days (threshold 566 µmol/l, sensitivity 0.70, specificity 0.89) and 90 days (threshold 566 µmol/l, sensitivity 0.76, specificity 0.80), with fewer cats still alive when their Crea48 was above these thresholds. A short decision tree, including age and Crea48, predicted the 180-day outcome best. When Crea48 was excluded from the analysis, the generated decision trees included CSS, age, BW, SpF48 and identification of small kidneys with an overall diagnostic performance similar to that using Crea48. Conclusions and relevance: Crea48 helps predict short- and medium-term survival in cats with ACKD. Secondary variables that helped predict outcomes were age, CSS, BW, SpF48 and identification of small kidneys.
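A single-threshold rule on Crea48 is the simplest form of decision tree (one split). The sketch below shows how the sensitivity and specificity of such a rule could be computed. It rests on assumptions: the function name `threshold_performance` and the data are invented for illustration, and treating death as the positive class is a choice made here, since the abstract does not state which outcome counts as positive. Only the 566 µmol/l threshold comes from the abstract.

```python
def threshold_performance(crea48_values, died, threshold):
    """Sensitivity and specificity of the rule 'Crea48 > threshold predicts death'.

    crea48_values: serum creatinine at 48 h, in µmol/l.
    died: booleans, True if the cat had died by the time point of interest.
    Assumption: death is treated as the positive class.
    """
    tp = fp = tn = fn = 0
    for value, outcome_died in zip(crea48_values, died):
        predicted_died = value > threshold
        if predicted_died and outcome_died:
            tp += 1
        elif predicted_died and not outcome_died:
            fp += 1
        elif not predicted_died and outcome_died:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity

# Made-up values for illustration only; these are not data from the study.
crea48 = [300, 450, 600, 700, 900, 1100, 1200, 520]
died_by_30_days = [False, False, False, True, True, True, True, True]

sens, spec = threshold_performance(crea48, died_by_30_days, threshold=566)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Raising the threshold generally trades sensitivity for specificity under this convention, which is why a study may report different thresholds for different time points.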


2021 ◽  
pp. 1826-1839
Author(s):  
Sandeep Adhikari ◽  
Dr. Sunita Chaudhary

The exponential growth in the use of computers over networks, as well as the proliferation of applications that operate on different platforms, has drawn attention to network security. This paradigm takes advantage of security flaws in all operating systems that are both technically difficult and costly to fix. As a result, intrusion is used as a key to compromise a computer resource's credibility, availability, and confidentiality. The Intrusion Detection System (IDS) is critical in detecting network anomalies and attacks. In this paper, the data mining principle is combined with IDS to efficiently and quickly identify important, secret data of interest to the user. The proposed algorithm addresses four issues: data classification, high levels of human interaction, lack of labeled data, and the effectiveness of distributed denial-of-service attacks. We also develop a decision tree classifier with a variety of parameters. The previous algorithm achieved a classification accuracy of up to 90% and was not appropriate for large data sets. Our proposed algorithm was designed to accurately classify large data sets. In addition, we quantify several further decision tree classifier parameters.


2012 ◽  
pp. 163-186
Author(s):  
Jirí Krupka ◽  
Miloslava Kašparová ◽  
Pavel Jirava ◽  
Jan Mandys

The chapter presents the problem of quality of life modeling in the Czech Republic based on classification methods. It compares two methodological approaches: the first is the approach of the Institute of Sociology of the Academy of Sciences of the Czech Republic, and the second comes from a project of the civic association Team Initiative for Local Sustainable Development. On the basis of real data sets from the institute and the team initiative, the authors synthesized and analyzed quality of life classification models. They used decision tree classification algorithms to generate transparent decision rules and compared the classification results of the decision trees. Classifier models based on the C5.0, CHAID, C&RT and C5.0 boosting algorithms were proposed and analyzed. The classification models were created in Clementine.


2021 ◽  
pp. 947-957
Author(s):  
Hasin Shahed Shad ◽  
Zeeshan Jamal ◽  
S. M. Foysal Ahmed ◽  
Sifat Momen ◽  
Nafees Mansoor

2020 ◽  
Vol 9 (2) ◽  
pp. 403 ◽  
Author(s):  
Cheng-Sheng Yu ◽  
Chang-Hsien Lin ◽  
Yu-Jiun Lin ◽  
Shiyng-Yu Lin ◽  
Sen-Te Wang ◽  
...  

Background: Preventive medicine and primary health care are essential for patients with chronic kidney disease (CKD) because the symptoms of CKD may not appear until the renal function is severely compromised. Early identification of the risk factors of CKD is critical for preventing kidney damage and adverse outcomes. Early recognition of rapid progression to advanced CKD in certain high-risk populations is vital. Methods: This is a retrospective cohort study, the population screened and the site where the study has been performed. Multivariate statistical analysis was used to assess the prediction of CKD as many potential risk factors are involved. The clustering heatmap and random forest provide an interactive visualization for the classification of patients with different CKD stages. Results: Uric acid, blood urea nitrogen, waist circumference, serum glutamic oxaloacetic transaminase, and hemoglobin A1c (HbA1c) were significantly associated with CKD. CKD was highly associated with obesity, hyperglycemia, and liver function. Hypertension and HbA1c were in the same cluster with a similar pattern, whereas high-density lipoprotein cholesterol had an opposite pattern, which was also verified using the heatmap. Early-stage CKD patients who are grouped into the same cluster as advanced-stage CKD patients could be at high risk for rapid decline of kidney function and should be closely monitored. Conclusions: The clustering heatmap provided a new predictive model of health care management for patients at high risk of rapid CKD progression. This model could help physicians make an accurate diagnosis of this progressive and complex disease.


2016 ◽  
Vol 6 (2) ◽  
pp. 83-87 ◽  
Author(s):  
Shahram Tahmasebian ◽  
Marjan Ghazisaeedi ◽  
Mostafa Langarizadeh ◽  
Mehrshad Mokhtaran ◽  
Mitra Mahdavi-Mazdeh ◽  
...  

2021 ◽  
Vol 10 (3) ◽  
pp. 121-127
Author(s):  
Bareen Haval ◽  
Karwan Jameel Abdulrahman ◽  
Araz Rajab

This article presents the results of applying educational data mining techniques to the academic performance of students. Three classification models (Decision Tree, Random Forest and Deep Learning) were developed to analyze data sets and predict the performance of students. The predictive performance of the three classifiers was calculated and compared. The academic history and data of the students from the Office of the Registrar were used to train the models. Our analysis aims to evaluate student results using various variables such as the student's grade. Data from 221 students with 9 different attributes were used. The results of this study provide a better understanding of student success assessments and stress the importance of data mining in education. The main purpose of this study is to show how student success can be forecast using data mining techniques in order to improve academic programs. The results indicate that the Decision Tree classifier outperforms the other two classifiers, achieving a total prediction accuracy of 97%.


2021 ◽  
Vol 22 (2) ◽  
pp. 119-134
Author(s):  
Ahad Shamseen ◽  
Morteza Mohammadi Zanjireh ◽  
Mahdi Bahaghighat ◽  
Qin Xin

Data mining is the extraction of information and its roles from a vast amount of data. It is one of the most important topics today, as massive amounts of data are generated and stored each day. This data holds useful information in different fields and attracts the attention of programmers and engineers. One of the primary data mining classification algorithms is the decision tree. Decision tree techniques have several advantages but also drawbacks. One of the main drawbacks is the need to keep the data in main memory. SPRINT is a decision tree classifier that has proposed a fix for this problem. In this paper, we developed a new parallel decision tree classifier by building on the results of SPRINT. Our experimental results show considerable improvements in runtime and memory requirements compared to the SPRINT classifier. Our proposed classifier algorithm can be implemented in serial and parallel environments and can deal with big data.
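SPRINT avoids holding the whole training set in memory by keeping one presorted list per attribute and scanning it sequentially while maintaining class histograms, scoring each candidate split with the Gini index. The sketch below illustrates that single-pass scan for one numeric attribute. It is a serial illustration only, not the parallel classifier proposed in the paper; the function names and the sample records are hypothetical.

```python
from collections import Counter

def gini(counts, total):
    """Gini impurity of a partition given its class histogram."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

def best_split(attribute_list):
    """Find the best binary split on one numeric attribute in a single pass.

    attribute_list: (value, class_label) pairs already sorted by value.
    Returns (threshold, weighted_gini); threshold is None if no split exists.
    """
    n = len(attribute_list)
    below = Counter()                                    # classes seen so far
    above = Counter(label for _, label in attribute_list)  # classes remaining
    best_threshold, best_score = None, float("inf")
    for i in range(n - 1):
        value, label = attribute_list[i]
        below[label] += 1
        above[label] -= 1
        next_value = attribute_list[i + 1][0]
        if value == next_value:
            continue  # cannot split between identical values
        n_below, n_above = i + 1, n - (i + 1)
        score = (n_below / n) * gini(below, n_below) \
            + (n_above / n) * gini(above, n_above)
        if score < best_score:
            best_threshold, best_score = (value + next_value) / 2, score
    return best_threshold, best_score

# Hypothetical records: (attribute value, class label).
records = sorted([(23, "high"), (17, "high"), (43, "high"),
                  (68, "low"), (32, "low"), (20, "high")])
threshold, score = best_split(records)
print(threshold, round(score, 4))
```

Because the scan only needs the two histograms and the current record, the attribute list itself can be streamed from disk, which is what makes this style of split search suitable for data sets larger than main memory.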

