HEART DISEASE PREDICTION WITH LOGISTIC REGRESSION AND RANDOM FOREST MODEL

Author(s):  
D. Tang
Author(s):  
Prof. R. A. Jamadar ◽  
Aarati Garje ◽  
Tejasvi Bhorde ◽  
Vaishnavi Jadhav

Heart disease is one of the leading causes of death today. Predicting heart disease is difficult, time-consuming and expensive, and we try to overcome these obstacles. This research aims to help people: predicting heart disease is a major challenge, and the tests are so expensive that most people cannot afford them and fall behind as a result. To make such testing available at low cost, we are developing a heart disease prediction system using machine learning. Several systems have been designed for automated heart failure detection, but they have drawbacks such as overfitting, which we try to overcome in our system while achieving good performance and higher accuracy than other systems. The experiment is performed using an online clinical heart failure dataset. The proposed method is less complex and reports high accuracy. The contributions of the study are as follows: 1. An intelligent learning system, RSA-RF, is proposed for the automated detection of heart failure. The RSA-RF model is proposed and developed for the first time for heart failure detection. Previously, RSA algorithms have been applied successfully to searching for the best hyperparameters of a model; this paper presents their application to searching for the best subset of features. 2. The developed learning system improves the heart failure prediction of a conventional random forest model by 3.3% and shows better performance than eleven recently proposed methods and other state-of-the-art machine learning models for heart failure detection. Moreover, the proposed method has lower time complexity because it reduces the number of features [1].


2021 ◽  
Author(s):  
Jaishri Pandhari Wankhede ◽  
Palaniappan S ◽  
Magesh Kumar S

The objective of the paper is to shed light on a few existing heart disease prediction approaches and to propose a Hybrid Random Forest Model Integrated with Linear Model (HRFMILM) for predicting and identifying HDs at an early stage. Although the linear model has a simple estimation procedure, it is very sensitive to outliers and may lead to overfitting. On the other hand, averaging in the Random Forest Model (RFM) improves the overall accuracy and reduces the possibility of overfitting. The dataset is collected from the standard UCI repository. Experimental results show that integrating the Linear Model with the RFM keeps the estimation procedure simple while improving overall accuracy over the respective individual models. Further, the proposed method is compared with a few existing approaches in terms of precision, recall and F1-score.


Author(s):  
Aamir Khan ◽  
Dr. Sanjay Jain

Data mining (DM) is a process that extracts valuable information from raw data. Prediction analysis (PA) is used to predict future possibilities on the basis of current information. This research work focuses on predicting heart disease. The coronary disorder can be forecast in several phases: the data are pre-processed, attributes are extracted and classification is performed. A hybrid method is introduced on the basis of RF and LR. Random Forest classification is adopted to extract the attributes, and the classification itself is carried out using logistic regression. The performance of the introduced system is analyzed with regard to accuracy, precision and recall. The results indicate that the introduced system achieves an accuracy above approximately 90% when predicting heart disease.
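The hybrid idea in this abstract (Random Forest to pick attributes, logistic regression to classify) can be sketched as follows. This is a minimal illustration that assumes scikit-learn is available; the synthetic data, the 13-attribute table, the median importance threshold and all parameter values are assumptions for illustration and are not taken from the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a heart disease table (sizes are assumptions).
X, y = make_classification(
    n_samples=600, n_features=13, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

hybrid = make_pipeline(
    StandardScaler(),  # pre-processing step
    # Attribute extraction: keep features whose RF importance is at or
    # above the median importance (the threshold is an assumption).
    SelectFromModel(
        RandomForestClassifier(n_estimators=200, random_state=0),
        threshold="median",
    ),
    # Classification on the selected attributes only.
    LogisticRegression(max_iter=1000),
)
hybrid.fit(X_train, y_train)
y_pred = hybrid.predict(X_test)

n_selected = int(hybrid.named_steps["selectfrommodel"].get_support().sum())
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
print(f"kept {n_selected} of {X.shape[1]} attributes")
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

Wrapping selection and classification in one pipeline means the attribute selection is refit on the training data only, so the held-out scores are not inflated by information from the test split.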


2020 ◽  
Vol 35 (Supplement_3) ◽  
Author(s):  
Manuel Benítez Sánchez ◽  
Guillermo Martín ◽  
Luis Gil Sacaluga ◽  
Maria Jose Garcia Cortes ◽  
Sergio García Marcos ◽  
...  

Abstract Background and Aims Random Forest (RF) is an analytical technique of Artificial Intelligence (AI) that consists of an ensemble of trees built by bootstrapping (resampling with replacement). In each node a subset of predictor variables is selected and the best cut point is determined for them, so each split of the tree is based on a random sample of the predictors. The trees are grown as large as possible. In the construction of each RF tree a part of the observations is not used (37% approx.). This part is called the out-of-bag (OOB) sample and is used to obtain an honest estimate of the predictive capacity of the model, so no separate validation is required. In each analysis, a few hundred regression or classification trees are built, depending on whether the response variable is numerical or qualitative respectively. The result is an average of the repeated predictions of the model (bagging). RF makes it possible to calculate the importance of the predictor variables, which can later be included in a multivariate regression model. Method We analyzed 14750 records between 2011 and 2014 contained in the Information System of the Autonomous Transplant Coordination of Andalusia (SICATA), a system that includes clinical-epidemiological variables on anemia, bone metabolism, adequacy of dialysis and vascular access. 1911 patients presented the event of interest (exitus). Three predictive and explanatory models of survival were developed: 1. RF. 2. Multivariate Logistic Regression. 3. Multivariate Logistic Regression that includes the important variables of the previous RF model. We compare them in terms of accuracy (AUC of the ROC curve). Results AUC of the ROC curve of the multivariate model without prior RF: 0.75. AUC of the ROC curve of the multivariate model with previous RF: 0.81. AUC of the ROC curve of the Random Forest model: 0.98. Conclusion The Random Forest model has a 98% discrimination for mortality in patients on hemodialysis, far superior to the classic multivariate analyses. The Multivariate Logistic Regression performed with the important RF variables improves on the AUC of the previous model (0.81 vs. 0.75).
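The "37% approx." out-of-bag share quoted in this abstract follows from bootstrapping itself: when n observations are drawn with replacement n times, the chance that a given observation is never drawn is (1 - 1/n)^n, which approaches e^-1 ≈ 0.368 as n grows. The short check below is a sketch with hypothetical sizes (1000 observations, 50 trees), chosen only for illustration.

```python
import math
import random


def oob_fraction(n_observations: int, n_trees: int, seed: int = 0) -> float:
    """Average share of observations left out of each bootstrap sample."""
    rng = random.Random(seed)
    left_out = 0
    for _ in range(n_trees):
        # One bootstrap sample: n draws with replacement; keep distinct indices.
        in_bag = {rng.randrange(n_observations) for _ in range(n_observations)}
        left_out += n_observations - len(in_bag)
    return left_out / (n_trees * n_observations)


theoretical = (1 - 1 / 1000) ** 1000  # exact probability for n = 1000
limit = math.exp(-1)                  # large-n limit of that probability
simulated = oob_fraction(n_observations=1000, n_trees=50)

print(f"theoretical={theoretical:.4f} limit={limit:.4f} simulated={simulated:.4f}")
```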


Author(s):  
Soo-Kyoung Lee ◽  
Juh Hyun Shin ◽  
Jinhyun Ahn ◽  
Ji Yeon Lee ◽  
Dong Eun Jang

Background: Machine learning (ML) can keep improving predictions and generating automated knowledge via data-driven predictors or decisions. Objective: The purpose of this study was to compare different ML methods, including random forest, logistic regression, linear support vector machine (SVM), polynomial SVM, radial SVM, and sigmoid SVM, in terms of their accuracy, sensitivity, specificity, negative predictive values, and positive predictive values by validating real datasets to predict factors for pressure ulcers (PUs). Methods: We applied representative ML algorithms (random forest, logistic regression, linear SVM, polynomial SVM, radial SVM, and sigmoid SVM) to develop a prediction model (N = 60). Results: The random forest model showed the greatest accuracy (0.814), followed by logistic regression (0.782), polynomial SVM (0.779), radial SVM (0.770), linear SVM (0.767), and sigmoid SVM (0.674). Conclusions: The random forest model showed the greatest accuracy for predicting PUs in nursing homes (NHs). Diverse factors that predict PUs in NHs, including NH characteristics and residents’ characteristics, were identified according to diverse ML methods. These factors should be considered to decrease PUs in NH residents.
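The five measures compared in this abstract all derive from the same 2x2 confusion matrix. The sketch below shows one way to compute them with the standard definitions; the function name and the ten example labels are hypothetical and are not data from the study.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, PPV and NPV for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # also called recall
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),          # also called precision
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical labels: 1 = pressure ulcer present, 0 = absent.
y_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Accuracy alone can look favorable when one class dominates, which is one reason studies like this report sensitivity, specificity and the predictive values alongside it.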


2021 ◽  
Vol 15 (Supplement_1) ◽  
pp. S214-S214
Author(s):  
A Levartovsky ◽  
Y Barash ◽  
S Ben-Horin ◽  
B Ungar ◽  
E Klang ◽  
...  

Abstract Background Intra-abdominal abscess is an important clinical complication of Crohn’s disease (CD), which can be diagnosed using computed tomography (CT) or magnetic resonance imaging (MRI). However, a high index of clinical suspicion is needed to diagnose an abscess, as abdominal imaging is not routinely used during hospital admission. This study aimed to identify clinical predictors of an intra-abdominal abscess among hospitalized patients with CD. Methods We created an electronic data repository of all patients with CD who visited the emergency department (ED) of our tertiary medical center between 2012 and 2018. Data included tabular demographic and clinical variables, as well as CT and MRI imaging outcomes. We searched the data repository for the presence of an abscess on abdominal imaging within seven days from the ED visit. Machine learning models were trained to predict the presence of an abscess. A logistic regression model was compared to a random forest model. The area under the receiver operating characteristic curve (AUC) was used as a metric. To establish statistical significance, bootstrapping of 100 experiments with random 80/20 training/testing splits was performed. We included only patients who were hospitalized due to complaints that can be attributed to CD exacerbation. Patients presenting within 30 days from an abdominal surgery were excluded. Results Overall, 1556 patients with CD visited the ED, of whom 555 presented with a CD exacerbation. Of them, 339 patients were hospitalized and underwent abdominal imaging within 7 days from the ED visit. Forty-two patients (12.1%) were diagnosed with an abscess on abdominal imaging. The average length of the abscess was 32 mm (IQR 21.5, 43.5), mainly in the mesentery adjacent to the small bowel (38.1%). On multivariate analysis, high CRP values (64.97 mg/L, aOR 14.42 [95% CI 4.93–42.13]), high platelet count (322.5 K/microL, aOR 4.01 [95% CI 1.97–8.15]), leukocytosis (10.55 K/microL, aOR 3.83 [95% CI 1.71–8.56]) and higher heart rate (over 87.5 beats per minute, aOR 2.58 [95% CI 1.22–5.46]) were independently associated with an intra-abdominal abscess. Overall, random forest and logistic regression showed similar performance. The random forest model showed an AUC of 0.824±0.065 with eight features (CRP, hemoglobin, WBC, age, current biologic medical treatment, BUN, current immunomodulatory medical treatment, gender). Conclusion In our large tertiary center cohort, the machine-learning model identified features associated with the presentation of an intra-abdominal abscess. Such a decision support tool may assist in triaging CD patients for imaging to exclude this potentially life-threatening complication.
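An AUC reported as mean ± spread over 100 random 80/20 splits, as in this abstract, can be produced by a loop like the one sketched below. This is an illustration under stated assumptions, not the authors' code: the data are synthetic, and a simple class-mean-difference scorer (a hypothetical stand-in) replaces the logistic regression and random forest models so that the example needs only the standard library.

```python
import random
import statistics


def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_mean_difference(X_train, y_train):
    """Stand-in model: score = x . (mean of positives - mean of negatives)."""
    def column_means(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    mean_pos = column_means([x for x, t in zip(X_train, y_train) if t == 1])
    mean_neg = column_means([x for x, t in zip(X_train, y_train) if t == 0])
    weights = [p - n for p, n in zip(mean_pos, mean_neg)]
    return lambda x: sum(w * v for w, v in zip(weights, x))


def repeated_split_auc(X, y, fit, n_repeats=100, test_fraction=0.2, seed=0):
    """Mean and standard deviation of test AUC over random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(y)))
    n_test = max(1, round(test_fraction * len(y)))
    aucs = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        test, train = indices[:n_test], indices[n_test:]
        score = fit([X[i] for i in train], [y[i] for i in train])
        test_labels = [y[i] for i in test]
        if len(set(test_labels)) < 2:
            continue  # AUC is undefined when the test split has a single class
        aucs.append(auc([score(X[i]) for i in test], test_labels))
    return statistics.mean(aucs), statistics.stdev(aucs)


# Synthetic cohort: 40 positives, 260 negatives, one informative marker.
data_rng = random.Random(42)
y = [1] * 40 + [0] * 260
X = [[data_rng.gauss(1.5 if label else 0.0, 1.0), data_rng.gauss(0.0, 1.0)]
     for label in y]

mean_auc, sd_auc = repeated_split_auc(X, y, fit_mean_difference)
print(f"AUC {mean_auc:.3f} ± {sd_auc:.3f} over 100 random 80/20 splits")
```

With few positive cases in each 20% test split, the AUC varies noticeably from split to split, which is why a spread is reported next to the mean.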

