Comparison of Machine Learning With Logistic Regression for Prediction of Chronic Kidney Disease in the Thai Adult Population

Ratchainant Thammasudjarit; Punnathorn Ingsathit; Sigit Ari Saputro; Atiporn Ingsathit; Ammarin Thakkinstian

doi:10.33165/rmj.2021.44.4.250334

Chronic Kidney Disease Prediction using Machine Learning Algorithms

International Journal of Preventive Medicine and Health ◽

10.35940/ijpmh.c1010.071321 ◽

2021 ◽

Vol 1 (3) ◽

pp. 1-4

Author(s):

Kallu Samatha ◽

Muppidi Rohitha Reddy ◽

Pattan Faizal Khan ◽

Rayapati Akhil Chowdary ◽

P.V.R.D Prasada Rao

Keyword(s):

Machine Learning ◽

Chronic Kidney Disease ◽

Logistic Regression ◽

Random Forest ◽

Kidney Disease ◽

Decision Tree ◽

Kidney Diseases ◽

Disease Prediction ◽

Decision Tree Classifier ◽

Tree Classifier

Kidney diseases are becoming more common and are now a major health issue around the world. Poor dietary habits and insufficient water intake are among the major factors that contribute to this condition. It has therefore become necessary to build a system that predicts chronic kidney disease accurately. Here, we propose an approach for real-time kidney disease prediction. Our aim is to find the most effective and efficient machine learning (ML) technique for recognizing and predicting chronic kidney disease. We used data from the UCI Machine Learning Repository. Five machine learning classification techniques were considered for predicting chronic kidney disease: KNN, Logistic Regression, Random Forest Classifier, SVM, and Decision Tree Classifier. The data were divided into two parts: a training set used to fit the models and a test set used to evaluate them. The results show that the Decision Tree Classifier and Logistic Regression achieved the highest performance among the classifiers, with an accuracy of 98.75%, followed by Random Forest at 97.5%.
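
The hold-out evaluation described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: the abstract does not state the split ratio, so the 80/20 split, the seed, and the helper names `train_test_split` and `accuracy` are assumptions. The UCI chronic kidney disease dataset is commonly listed with 400 records, so a 20% test set would hold 80 of them; under that assumption the reported 98.75% and 97.5% would correspond to one and two misclassified test records.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and split it into (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical stand-in for the 400 dataset records (ids only, no features).
records = list(range(400))
train, test = train_test_split(records, test_fraction=0.2)

# Assumed 80-record test set: one error gives 79/80, two errors give 78/80.
acc_one_error = accuracy([1] * 80, [1] * 79 + [0])
acc_two_errors = accuracy([1] * 80, [1] * 78 + [0, 0])

print(len(train), len(test))          # 320 80
print(acc_one_error, acc_two_errors)  # 0.9875 0.975
```

With a test set this small, each misclassified record moves accuracy by 1.25 percentage points, which is worth keeping in mind when ranking the classifiers.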

Chronic Kidney Disease Prediction using Machine Learning Algorithms

International Journal of Preventive Medicine and Health ◽

10.54105/ijpmh.c1010.071321 ◽

2021 ◽

pp. 1-4

Author(s):

Ms. Kallu Samatha ◽

Ms. Muppidi Rohitha Reddy ◽

Mr. Pattan Faizal Khan ◽

Mr. Rayapati Akhil Chowdary ◽

...

Keyword(s):

Machine Learning ◽

Chronic Kidney Disease ◽

Logistic Regression ◽

Random Forest ◽

Kidney Disease ◽

Decision Tree ◽

Kidney Diseases ◽

Disease Prediction ◽

Decision Tree Classifier ◽

Tree Classifier

Kidney diseases are becoming more common and are now a major health issue around the world. Poor dietary habits and insufficient water intake are among the major factors that contribute to this condition. It has therefore become necessary to build a system that predicts chronic kidney disease accurately. Here, we propose an approach for real-time kidney disease prediction. Our aim is to find the most effective and efficient machine learning (ML) technique for recognizing and predicting chronic kidney disease. We used data from the UCI Machine Learning Repository. Five machine learning classification techniques were considered for predicting chronic kidney disease: KNN, Logistic Regression, Random Forest Classifier, SVM, and Decision Tree Classifier. The data were divided into two parts: a training set used to fit the models and a test set used to evaluate them. The results show that the Decision Tree Classifier and Logistic Regression achieved the highest performance among the classifiers, with an accuracy of 98.75%, followed by Random Forest at 97.5%.

Machine Learning Framework to Predict Chronic Kidney Disease using Ensemble Algorithm

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.d9107.069520 ◽

2020 ◽

Vol 9 (5) ◽

pp. 1-6

Keyword(s):

Machine Learning ◽

Chronic Kidney Disease ◽

Random Forest ◽

Kidney Disease ◽

Decision Tree ◽

Performance Metrics ◽

Weighted Average ◽

Gradient Boosting ◽

Support Vector ◽

The Individual

Chronic Kidney Disease (CKD) is a worldwide concern that affects roughly 10% of the adult population. For most people, early diagnosis of CKD is often not possible. Modern computer-aided methods are therefore important for making conventional CKD diagnosis more effective and accurate. In this project, six machine learning techniques, namely Multilayer Perceptron Neural Network, Support Vector Machine, Naïve Bayes, K-Nearest Neighbor, Decision Tree, and Logistic Regression, were used; then, to enhance model performance, ensemble algorithms such as AdaBoost, Gradient Boosting, Random Forest, Majority Voting, Bagging, and Weighted Average were applied to the Chronic Kidney Disease dataset from the UCI Repository. The models were tuned to find the best hyperparameters for training. Performance was evaluated using Accuracy, Precision, Recall, F1-score, Matthews Correlation Coefficient, and ROC-AUC. The experiment was performed first on the individual classifiers and then on the ensemble classifiers. Ensemble classifiers such as Random Forest and AdaBoost performed better, with 100% Accuracy, Precision, and Recall, compared with the individual classifiers, among which the Decision Tree algorithm obtained 99.16% accuracy, 98.8% precision, and 100% recall.
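
The metrics named in this abstract can all be derived from the four cells of a binary confusion matrix. The sketch below shows the standard formulas; the function name `classification_metrics` and the counts are hypothetical. The counts were picked only so that the values land near the decision-tree figures quoted above and are not taken from the paper.

```python
import math

def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, F1 and MCC from confusion-matrix counts.

    Assumes every denominator is non-zero (each class occurs and is predicted).
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }

# Hypothetical test set of 120 records with a single false positive.
metrics = classification_metrics(tp=82, fp=1, fn=0, tn=37)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

A single false positive lowers precision and accuracy but leaves recall at 1.0, the same pattern as the decision-tree figures in the abstract.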

Application of Data Mining Algorithms for Dementia in People with HIV/AIDS

Computational and Mathematical Methods in Medicine ◽

10.1155/2021/4602465 ◽

2021 ◽

Vol 2021 ◽

pp. 1-8

Author(s):

Luana Ibiapina Cordeiro Calíope Pinheiro ◽

Maria Lúcia Duarte Pereira ◽

Marcial Porto Fernandez ◽

Francisco Mardônio Vieira Filho ◽

Wilson Jorge Correia Pinto de Abreu ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Data Mining ◽

Logistic Regression ◽

Random Forest ◽

Decision Tree ◽

Learning Algorithms ◽

Principal Component ◽

Machine Learning Algorithms ◽

Hiv Aids

Dementia interferes with an individual’s motor, behavioural, and intellectual functions, leaving them unable to perform instrumental activities of daily living. This study aimed to identify, through data mining, the best-performing algorithm and the most relevant characteristics for categorising individuals with HIV/AIDS at high risk of dementia. Principal component analysis (PCA) was applied, and the following machine learning algorithms were compared: logistic regression, decision tree, neural network, KNN, and random forest. The database used for this study was built from data collected on 270 individuals infected with HIV/AIDS and followed up at the outpatient clinic of a reference hospital for infectious and parasitic diseases in the State of Ceará, Brazil, from January to April 2019. The performance of the algorithms was first analysed using the 104 characteristics available in the database; dimensionality reduction then improved the quality of the machine learning algorithms in the tests, even though about 30% of the variation was lost. When only 23 characteristics were considered, the precision of the algorithms was 86% for random forest, 56% for logistic regression, 68% for decision tree, 60% for KNN, and 59% for neural network. The random forest algorithm proved more effective than the others, obtaining 84% precision and 86% accuracy.
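
The trade-off mentioned here, fewer dimensions at the cost of about 30% of the variation, is usually expressed through the cumulative explained-variance ratio of the principal components. The sketch below only illustrates that bookkeeping. The eigenvalues are hypothetical, the helper name `components_for_variance` is an assumption, and the abstract does not say how the reduced set of 23 characteristics was derived.

```python
def components_for_variance(eigenvalues, target=0.70):
    """Smallest number of leading components whose variance share reaches target.

    Returns (number_of_components, retained_variance_share).
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= target:
            return k, cumulative / total
    return len(ordered), 1.0

# Hypothetical covariance-matrix eigenvalues (variance along each component).
eigenvalues = [4.0, 2.0, 1.0, 1.0, 1.0, 0.5, 0.5]
n_components, retained = components_for_variance(eigenvalues, target=0.70)
print(n_components, retained)  # keeps 70% of the variance, discards 30%
```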

Comparative analysis of machine learning algorithms in water extraction

Journal of Physics Conference Series ◽

10.1088/1742-6596/2076/1/012045 ◽

2021 ◽

Vol 2076 (1) ◽

pp. 012045

Author(s):

Aimin Li ◽

Meng Fan ◽

Guangduo Qin

Keyword(s):

Neural Network ◽

Machine Learning ◽

Logistic Regression ◽

Comparative Analysis ◽

Random Forest ◽

Decision Tree ◽

Water Body ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector

There are many traditional methods for extracting water bodies from remote sensing images, such as the normalised difference water index (NDWI), modified NDWI (MNDWI), and the multi-band spectrum method, but their accuracy is limited. In recent years, machine learning algorithms have developed rapidly and been applied widely. In the present research, machine learning models including decision tree, logistic regression, random forest, neural network, support vector machine (SVM), and XGBoost were applied to Landsat-8 images. Parameters for each model were determined through cross-validation and grid search. Moreover, the merits and demerits of the models in water body extraction were discussed, and a comparative analysis was performed against three methods for determining thresholds in the traditional NDWI. The results show that the neural network performs excellently and is a stable model, followed by the SVM and the logistic regression algorithm. Furthermore, the ensemble algorithms, including random forest and XGBoost, were affected by sample distribution, and the decision tree returned the poorest performance.
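
For reference, the two traditional indices named in this abstract are simple band ratios: NDWI = (Green - NIR) / (Green + NIR) and MNDWI = (Green - SWIR) / (Green + SWIR). On Landsat-8 OLI these correspond to band 3 (green), band 5 (NIR), and band 6 (SWIR 1). The sketch below applies them to single pixels. The reflectance values are hypothetical, and the zero threshold is only a common default, not one of the three thresholding methods the paper compares.

```python
def ndwi(green, nir):
    """Normalised difference water index: (Green - NIR) / (Green + NIR)."""
    return (green - nir) / (green + nir)

def mndwi(green, swir):
    """Modified NDWI: (Green - SWIR) / (Green + SWIR)."""
    return (green - swir) / (green + swir)

def is_water(index_value, threshold=0.0):
    """Classify a pixel as water when the index exceeds the threshold."""
    return index_value > threshold

# Hypothetical surface reflectances for two pixels.
water_pixel = {"green": 0.08, "nir": 0.02, "swir": 0.01}
land_pixel = {"green": 0.10, "nir": 0.30, "swir": 0.25}

for name, px in (("water", water_pixel), ("land", land_pixel)):
    value = ndwi(px["green"], px["nir"])
    print(name, round(value, 3), is_water(value))
```

Water absorbs strongly in the near and shortwave infrared, so both indices are positive over open water and negative over most vegetation and soil; the choice of threshold is where the traditional methods differ.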

A Goal Programming-Based Methodology for Machine Learning Model Selection Decisions: A Predictive Maintenance Application

Mathematics ◽

10.3390/math9192405 ◽

2021 ◽

Vol 9 (19) ◽

pp. 2405

Author(s):

Ioannis Mallidis ◽

Volha Yakavenka ◽

Anastasios Konstantinidis ◽

Nikolaos Sariannidis

Keyword(s):

Neural Network ◽

Machine Learning ◽

Random Forest ◽

Decision Tree ◽

Goal Programming ◽

Regression Models ◽

Support Vector ◽

Threshold Values ◽

Time Efficiency ◽

The Neural Network

The paper develops a goal programming-based multi-criteria methodology for assessing different machine learning (ML) regression models under accuracy and time efficiency criteria. The methodology gives users high flexibility in assessing the models, as it allows a fast and computationally efficient sensitivity analysis of the accuracy and time significance weights as well as the accuracy and time significance threshold values. Four regression models were assessed, namely the decision tree, random forest, support vector, and neural network models. The methodology was employed to forecast the time to failure of NASA turbofans. The results reveal that decision tree regression (DTR) seems to be preferred for low accuracy weights (up to 30%) and low accuracy and time efficiency threshold values. As the accuracy weights increase, and for higher accuracy and time efficiency threshold values, random forest regression (RFR) seems to be the best choice. The preference for the RFR model, however, seems to shift towards the neural network for accuracy weights equal to or higher than 90%.
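
The sensitivity analysis described here can be pictured with a much-simplified scoring rule in the spirit of weighted goal programming: each model is penalised by its weighted, normalised shortfall from an accuracy goal and a time goal, and the model with the smallest penalty is preferred. This is not the paper's formulation. The error and runtime figures, the goals, the scales, and the names `weighted_deviation` and `preferred_model` are all hypothetical.

```python
# Hypothetical prediction error and training time (seconds) for each model.
MODELS = {
    "DTR": {"error": 25.0, "time": 0.5},
    "RFR": {"error": 20.0, "time": 5.0},
    "SVR": {"error": 24.0, "time": 8.0},
    "NN": {"error": 18.0, "time": 60.0},
}
GOALS = {"error": 18.0, "time": 0.5}    # assumed target (threshold) values
SCALES = {"error": 7.0, "time": 59.5}   # assumed ranges used to normalise

def weighted_deviation(model, accuracy_weight):
    """Weighted sum of normalised overshoots beyond the error and time goals."""
    error_dev = max(0.0, model["error"] - GOALS["error"]) / SCALES["error"]
    time_dev = max(0.0, model["time"] - GOALS["time"]) / SCALES["time"]
    return accuracy_weight * error_dev + (1.0 - accuracy_weight) * time_dev

def preferred_model(accuracy_weight):
    """Name of the model with the smallest weighted deviation."""
    return min(MODELS, key=lambda name: weighted_deviation(MODELS[name], accuracy_weight))

# Sweep the accuracy weight to see where the preferred model changes.
for weight in (0.05, 0.5, 0.95):
    print(weight, preferred_model(weight))
```

With these made-up numbers the preference moves from the fastest model to the most accurate one as the accuracy weight grows, which is the kind of switch the abstract reports; the weights at which the switches occur here are artefacts of the invented data.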

SUPERVISED MACHINE LEARNING ALGORITHMS FOR DETECTING CREDIT CARD FRAUD

EPRA International Journal of Research & Development (IJRD) ◽

10.36713/epra7636 ◽

2021 ◽

pp. 131-134

Author(s):

G.Bhargav Chowdari

Keyword(s):

Neural Network ◽

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Decision Tree ◽

Supervised Learning ◽

Credit Card ◽

Learning Algorithms ◽

Learning Methods ◽

Credit Card Fraud

One of the most serious ethical challenges in the credit card industry is fraud. The major goal of our paper is to identify credit card theft and offer a reasonable solution to the problem. Credit card fraud has cost customers and banks billions of dollars around the world. Fraudsters constantly attempt to come up with new ways to commit fraud, despite the several measures in place to prevent it. Fraud detection is extremely important in the banking and finance industries. For detection purposes, we will use an artificial neural network. To help prevent fraud, we will develop a system that not only detects fraud but also detects it before it occurs. In order to detect new scams, our system will learn from previous frauds. Mining algorithms were previously used to detect fraud, but they performed poorly. In our paper, we use machine learning methods to detect fraud in credit card transactions. The research employs supervised learning methods applied to a Kaggle dataset that is severely skewed and imbalanced. We used a robust scaler to balance the set, resulting in 51 percent non-fraud cases and 49 percent fraud cases. Logistic regression, random forest, decision tree, and KNN have all been implemented, with learning curves showing which algorithm performs best. Accuracy, specificity, precision, and sensitivity are the evaluation criteria, and a comparative chart presents the analysis of the supervised learning algorithms. KEYWORDS: KNN, Neural network, Logistic regression, Random forest, Decision tree
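
Assuming the "robust scaler" mentioned here refers to the usual median-and-interquartile-range scaling (the approach taken by scikit-learn's `RobustScaler`), the transformation can be sketched with the standard library alone. The transaction amounts and the function name `robust_scale` are hypothetical. Note that scaling of this kind rescales feature values and does not by itself change the ratio of fraud to non-fraud cases, so the balancing step presumably involved resampling as well; that is an inference, not something the abstract states.

```python
import statistics

def robust_scale(values):
    """Centre on the median and divide by the interquartile range (IQR)."""
    median = statistics.median(values)
    # The "inclusive" method interpolates linearly between data points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return [(v - median) / iqr for v in values]

# Hypothetical transaction amounts with one extreme outlier.
amounts = [1.0, 2.0, 3.0, 4.0, 5.0, 1000.0]
scaled = robust_scale(amounts)
print(scaled)
```

Because the median and IQR ignore the tails, the outlier stays far from the rest after scaling but does not compress the typical values towards zero, as it would with mean-and-standard-deviation scaling.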

Comparison of Neural Network and Machine Learning Approaches in Prediction of Chronic Kidney Disease

Journal of Student Research ◽

10.47611/jsrhs.v10i3.1570 ◽

2021 ◽

Vol 10 (3) ◽

Author(s):

Shreya Nag ◽

Nimitha Jammula

Keyword(s):

Neural Network ◽

Machine Learning ◽

Chronic Kidney Disease ◽

Kidney Disease ◽

Healthcare Providers ◽

Accurate Diagnosis ◽

Supervised Machine Learning ◽

Learning Approaches ◽

Data Set ◽

The Neural Network

Diagnosing a disease to determine a specific condition is crucial in caring for patients and furthering medical research. A timely and accurate diagnosis can have important implications for both patients and healthcare providers. An earlier diagnosis allows doctors to consider more methods of treatment, giving them greater flexibility in tailoring their decisions and ultimately improving the patient’s health. Additionally, timely detection gives patients greater control over their health and their decisions, allowing them to plan ahead. As computer science and technology continue to advance, they can play a major role in aiding healthcare providers with medical issues. The emergence of artificial intelligence and machine learning can help address the challenge of reaching a timely and accurate diagnosis. The goal of this research work is to design a system that utilizes machine learning and neural network techniques to diagnose chronic kidney disease with more than 90% accuracy based on a clinical data set, and to compare the performance of the neural network with supervised machine learning approaches. Based on the results, all the algorithms performed well in predicting chronic kidney disease (CKD), with more than 90% accuracy. The neural network system provided the best performance (accuracy = 100%) in predicting chronic kidney disease, compared with the supervised Random Forest algorithm (accuracy = 99%) and the supervised Decision Tree algorithm (accuracy = 97%).

Machine Learning in Aging: An Example of Developing Prediction Models for Serious Fall Injury in Older Adults

Innovation in Aging ◽

10.1093/geroni/igaa057.859 ◽

2020 ◽

Vol 4 (Supplement_1) ◽

pp. 268-269

Author(s):

Jaime Speiser ◽

Kathryn Callahan ◽

Jason Fanning ◽

Thomas Gill ◽

Anne Newman ◽

...

Keyword(s):

Machine Learning ◽

Older Adults ◽

Random Forest ◽

Decision Tree ◽

Prediction Models ◽

Receiver Operating Curve ◽

Learning Methods ◽

Life Study ◽

Fall Injury ◽

Machine Learning Methods

Advances in computational algorithms and the availability of large datasets with clinically relevant characteristics provide an opportunity to develop machine learning prediction models to aid in diagnosis, prognosis, and treatment of older adults. Some studies have employed machine learning methods for prediction modeling, but skepticism of these methods remains due to lack of reproducibility and difficulty understanding the complex algorithms behind models. We aim to provide an overview of two common machine learning methods: decision tree and random forest. We focus on these methods because they provide a high degree of interpretability. We discuss the underlying algorithms of decision tree and random forest methods and present a tutorial for developing prediction models for serious fall injury using data from the Lifestyle Interventions and Independence for Elders (LIFE) study. Decision tree is a machine learning method that produces a model resembling a flow chart. Random forest consists of a collection of many decision trees whose results are aggregated. In the tutorial example, we discuss evaluation metrics and interpretation for these models. Illustrated in data from the LIFE study, prediction models for serious fall injury were moderate at best (area under the receiver operating curve of 0.54 for decision tree and 0.66 for random forest). Machine learning methods may offer improved performance compared to traditional models for modeling outcomes in aging, but their use should be justified and output should be carefully described. Models should be assessed by clinical experts to ensure compatibility with clinical practice.
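
The two descriptions above, a tree as a flow chart and a forest as an aggregate of trees, can be made concrete with a toy sketch. The trees below are written by hand purely for illustration; a real random forest learns each tree from a bootstrap sample with random feature subsets. The feature names (`age`, `gait_speed`, `prior_falls`) and cut-offs are hypothetical and are not taken from the LIFE study.

```python
from collections import Counter

def tree_a(person):
    """Flow chart: check fall history first, then gait speed."""
    if person["prior_falls"] >= 2:
        return 1
    return 1 if person["gait_speed"] < 0.8 else 0

def tree_b(person):
    """Flow chart: check age first, then gait speed."""
    if person["age"] >= 80:
        return 1 if person["gait_speed"] < 1.0 else 0
    return 0

def tree_c(person):
    """Flow chart: a single combined rule on fall history and age."""
    return 1 if person["prior_falls"] >= 1 and person["age"] >= 75 else 0

def forest_predict(trees, person):
    """Aggregate the individual tree predictions by majority vote."""
    votes = Counter(tree(person) for tree in trees)
    return votes.most_common(1)[0][0]

forest = [tree_a, tree_b, tree_c]
higher_risk = {"age": 82, "gait_speed": 0.9, "prior_falls": 1}
lower_risk = {"age": 72, "gait_speed": 1.2, "prior_falls": 0}
print(forest_predict(forest, higher_risk), forest_predict(forest, lower_risk))
```

Each tree can be read top to bottom as a sequence of yes/no questions, which is the interpretability the abstract refers to; the forest trades some of that transparency for the stability of a vote.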

Machine-learning algorithm as a prognostic tool in non-obstructive acute-on-chronic kidney disease in the cat

Journal of Feline Medicine and Surgery ◽

10.1177/1098612x211001273 ◽

2021 ◽

pp. 1098612X2110012

Author(s):

Jade Renard ◽

Mathieu R Faucher ◽

Anaïs Combes ◽

Didier Concordet ◽

Brice S Reynolds

Keyword(s):

Machine Learning ◽

Chronic Kidney Disease ◽

Kidney Disease ◽

Decision Tree ◽

Diagnostic Performance ◽

Clinical Signs ◽

Learning System ◽

Clinical Severity ◽

Medium Term ◽

Term Survival

Objectives: The aim of this study was to develop an algorithm capable of predicting short- and medium-term survival in cases of intrinsic acute-on-chronic kidney disease (ACKD) in cats. Methods: The medical record database was searched to identify cats hospitalised for acute clinical signs and azotaemia of at least 48 h duration and diagnosed to have underlying chronic kidney disease based on ultrasonographic renal abnormalities or previously documented azotaemia. Cases with postrenal azotaemia, exposure to nephrotoxicants, feline infectious peritonitis or neoplasia were excluded. Clinical variables were combined in a clinical severity score (CSS). Clinicopathological and ultrasonographic variables were also collected. The following variables were tested as inputs in a machine learning system: age, body weight (BW), CSS, identification of small kidneys or nephroliths by ultrasonography, serum creatinine at 48 h (Crea48), spontaneous feeding at 48 h (SpF48) and aetiology. Outputs were outcomes at 7, 30, 90 and 180 days. The machine-learning system was trained to develop decision tree algorithms capable of predicting outputs from inputs. Finally, the diagnostic performance of the algorithms was calculated. Results: Crea48 was the best predictor of survival at 7 days (threshold 1043 µmol/l, sensitivity 0.96, specificity 0.53), 30 days (threshold 566 µmol/l, sensitivity 0.70, specificity 0.89) and 90 days (threshold 566 µmol/l, sensitivity 0.76, specificity 0.80), with fewer cats still alive when their Crea48 was above these thresholds. A short decision tree, including age and Crea48, predicted the 180-day outcome best. When Crea48 was excluded from the analysis, the generated decision trees included CSS, age, BW, SpF48 and identification of small kidneys with an overall diagnostic performance similar to that using Crea48. Conclusions and relevance: Crea48 helps predict short- and medium-term survival in cats with ACKD. Secondary variables that helped predict outcomes were age, CSS, BW, SpF48 and identification of small kidneys.
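
A single-threshold rule such as the Crea48 cut-offs reported above is scored by its sensitivity and specificity. The sketch below assumes that "positive" means survival and that a cat is predicted to survive when its Crea48 is at or below the threshold. That reading matches the direction of the reported figures but is an interpretation rather than something stated in the abstract. The creatinine values, the outcomes, and the function name `sensitivity_specificity` are hypothetical.

```python
def sensitivity_specificity(values, survived, threshold):
    """Score the rule 'predict survival when value <= threshold'.

    Sensitivity: share of survivors correctly predicted to survive.
    Specificity: share of non-survivors correctly predicted not to survive.
    """
    pairs = list(zip(values, survived))
    tp = sum(1 for v, s in pairs if v <= threshold and s)
    fn = sum(1 for v, s in pairs if v > threshold and s)
    tn = sum(1 for v, s in pairs if v > threshold and not s)
    fp = sum(1 for v, s in pairs if v <= threshold and not s)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical serum creatinine at 48 h (µmol/l) and survival outcomes.
crea48 = [300, 450, 520, 580, 600, 700, 900, 1100]
survived = [True, True, True, True, False, True, False, False]

low = sensitivity_specificity(crea48, survived, threshold=566)
high = sensitivity_specificity(crea48, survived, threshold=1043)
print(low)   # lower threshold: fewer survivors captured, fewer false alarms
print(high)  # higher threshold: more survivors captured, more false alarms
```

Raising the threshold increases sensitivity and lowers specificity, which mirrors the difference between the 7-day cut-off (1043 µmol/l) and the 30- and 90-day cut-off (566 µmol/l) in the results.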
