Optimized Random Forest Algorithm with Parameter Tuning for Predicting Heart Disease

2021 ◽  
pp. 443-451
Author(s):  
Ajil D. S. Vins ◽  
W. R. Sam Emmanuel
Author(s):  
Indu Yekkala ◽  
Sunanda Dixit

The medical industry generates large amounts of data. Because it comes from multiple sources (electronic records, handwritten scripts, etc.), this data is often very complex. Its complexity and sheer volume necessitate techniques that can extract insight quickly and efficiently. These insights can be used not only to diagnose diseases but also to predict and prevent them. One such application is cardiovascular disease. Heart disease, or coronary artery disease (CAD), is one of the major causes of death worldwide. Comprehensive research using single data mining techniques has not resulted in acceptable accuracy. Further research is being carried out on the effectiveness of hybridizing more than one technique to increase accuracy in the diagnosis of heart disease. In this article, the authors worked on the Statlog heart dataset collected from the UCI repository and used the Random Forest algorithm with feature selection based on rough sets to accurately predict the occurrence of heart disease.


2021 ◽  
Vol 5 (1) ◽  
pp. 61-69
Author(s):  
Ievgen Nastenko ◽  
Vitaliy Maksymenko ◽  
Sergiy Potashev ◽  
Volodymyr Pavlov ◽  
Vitalii Babenko ◽  
...  

Background. Recent studies show that cardiovascular diseases, including coronary heart disease, are the leading causes of death and one of the main factors of disability worldwide. Over the past 30 years, the number of detected cases of this type of disease has increased from 271 million to 523 million, and the number of deaths from 12.1 million to 18.6 million. Cardiovascular diseases are the main cause of death among the population of Ukraine, and by this indicator the country remains one of the world leaders. Coronary heart disease is the leading factor in the loss of health in Ukraine, and modern diagnostic methods, including machine learning algorithms, are increasingly being used for timely detection. Objective. Using speckle-tracking echocardiography data and the random forest method, construct classification algorithms for diagnosing disturbances in the kinematics of left ventricular contractions in patients with coronary heart disease, both at rest and during a stress echocardiography test with dobutamine. Methods. Speckle-tracking echocardiography was used to examine 40 patients with coronary heart disease and 16 in whom no cardiac pathology was found. Echocardiography was recorded in B mode in three positions: along the long axis, in the 4-chamber position, and in the 2-chamber position. In total, 6245 frames of the video stream were used: 1871 without cardiac abnormalities and 4374 with pathology present during the examination. 56 patients (2509 frames of video data) were examined without dobutamine, and 38 patients (3736 frames of video data) underwent a stress echocardiography test with dobutamine if no disturbances were found at rest. Dobutamine doses of 10, 20, and 40 mcg were administered under the supervision of an anesthesiologist. Texture analysis data from the images were used as informative features. The random forest algorithm was applied to build an algorithm for detecting coronary heart disease. Results. At the first stage of the study, normal-versus-pathology diagnostic algorithms were constructed for the resting state and for dobutamine doses of 10, 20, and 40 mcg. Before the algorithm was applied, the samples were randomly divided into training (70%) and test (30%) sets. The classifiers were evaluated for accuracy, sensitivity, and specificity. On the test samples, the accuracy of diagnostic conclusions varied from 97% to 99%. At the second stage of the study, to increase the versatility of the models, a classifier was built on all images without dividing them by dobutamine dose. Its accuracy on the test samples ranged from 96.6% to 97.8%. Conclusions. High-precision classification models were obtained using the random forest algorithm. The developed models can be applied to the analysis of echocardiograms obtained in B mode on equipment that is not equipped with speckle-tracking technology.
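
The evaluation protocol described in this abstract (a random 70%/30% split, then accuracy, sensitivity, and specificity on the test part) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' code: the helper names `split_indices` and `binary_metrics`, the fixed seed, and the example labels are all assumptions made for the illustration, and the metric function assumes both classes occur in the labels.

```python
import random


def split_indices(n_samples, test_fraction=0.3, seed=0):
    """Shuffle sample indices and split them into (train, test) lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity for 0/1 labels (1 = pathology)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of pathology cases detected
        "specificity": tn / (tn + fp),  # share of normal cases recognised
    }


# 70/30 split of ten hypothetical samples.
train_idx, test_idx = split_indices(10)

# Hypothetical labels for illustration only (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters here because the frame counts are imbalanced (4374 pathology frames against 1871 normal ones), so accuracy alone could hide weak performance on the smaller class.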


Heart disease is now the leading cause of death among all diseases. Because resources in the medical field are limited, predicting heart disease is a major problem. Classification algorithms such as Decision Tree and Random Forest are used to support early diagnosis and treatment. The main aim of this paper is to predict heart disease from the values in a dataset and to compare the accuracy of these two algorithms. First, a dataset of 13 attributes is collected and classified using the Decision Tree and Random Forest algorithms. Then the accuracy of both algorithms is recorded. We observed that Random Forest produces better results than Decision Tree in predicting heart disease.
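
A comparison of this kind might look roughly like the following scikit-learn sketch. It is not the paper's code: the abstract does not name the dataset or the hyperparameters, so the synthetic 13-feature data from `make_classification`, the 70/30 split, and the settings `n_estimators=100` and `random_state=0` are all assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a 13-attribute heart disease table (assumption).
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the same training split and score it on the same test split.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {accuracies[name]:.3f}")
```

Which model scores higher depends on the data and the split, so a single run like this one is only a starting point; repeated splits or cross-validation would give a more reliable comparison.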


Author(s):  
Halima EL Hamdaoui ◽  
Said Boujraf ◽  
Nour El Houda Chaoui ◽  
Badr Alami ◽  
Mustapha Maaroufi

Heart disease is a major cause of death worldwide, so its diagnosis and prediction remain essential. Clinical decision support systems based on machine learning techniques have become the primary tool to assist clinicians and contribute to automated diagnosis. This paper aims to predict heart disease using the Random Forest algorithm enhanced with the boosting algorithm AdaBoost. The model is trained and tested on the University of California Irvine (UCI) Cleveland and Statlog heart disease datasets using the 14 most relevant attributes. The results show that the Random Forest algorithm combined with AdaBoost achieved higher accuracy than Random Forest alone (96.16% versus 95.98%). We compare the proposed model with reported machine learning classifiers, and the results support its efficiency and validity. The proposed model also achieved high accuracy compared with existing studies in the literature, which confirms that a clinical decision support system based on machine learning algorithms could be used to predict heart disease.


2019 ◽  
Vol 13 ◽  
Author(s):  
Nandhini Abirami R. ◽  
Durai Raj Vincent

Background: Diagnosing diseases is an intricate job in the medical field. Machine learning, when applied to health care, is capable of detecting disease early, which helps to provide early medical intervention. Machine learning techniques have played a significant role in heart disease prediction. Analysis of disease has become vital in the health care sector. The massive data collected by the health care sector are preprocessed and analyzed to discover the underlying information for effective decision making and to provide proper medical intervention. The success of machine learning in the medical industry lies in its capability to analyze the huge amount of data gathered by the health sector and in its effectiveness in decision making. Since the medical field involves many manual processes, it has become necessary to automate these procedures. Remarkable advancements in electronic medical records have made this possible. Objective: The objective of this research is to design a robust machine learning algorithm to predict heart disease. The prediction is performed using an ensemble of machine learning algorithms in order to boost the accuracy achieved by individual algorithms. Method: A Heart Disease Prediction System is developed in which the user can input patient details and the developed model makes a prediction for that patient. The model predicts the output to be either normal or risky. Linear Discriminant Analysis (LDA), Classification and Regression Trees (CART), Support Vector Machines (SVM), K-Nearest Neighbors (KNN), and the Naïve Bayes classifier are used as base learners. These algorithms are combined using random forest as the meta classifier. Results: The predictions of the classifiers are combined using the random forest algorithm. The accuracy rose from 85.53% to 87.64%, an improvement of 2.11 percentage points. Conclusion: Various techniques were adopted to preprocess the data to suit the requirements of the analysis. Feature selection was performed to optimize the performance of the machine learning algorithms. Ensemble prediction gave better accuracy when the random forest algorithm was used as the combiner. Better feature selection techniques can be applied to further improve the accuracy.
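
The ensemble described in the Method section (five base learners combined by a random forest meta classifier) resembles what scikit-learn calls stacking. The sketch below is one plausible reading of that design, not the authors' implementation: the use of `StackingClassifier`, the default hyperparameters, the 5-fold internal cross-validation, and the synthetic 13-feature data are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a heart disease dataset (assumption).
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# The five base learners named in the abstract, with default settings.
base_learners = [
    ("lda", LinearDiscriminantAnalysis()),
    ("cart", DecisionTreeClassifier(random_state=0)),
    ("svm", SVC(random_state=0)),
    ("knn", KNeighborsClassifier()),
    ("nb", GaussianNB()),
]

# Random forest as the meta classifier: it is trained on the base learners'
# cross-validated outputs rather than on the raw features.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    cv=5,
)
stack.fit(X_train, y_train)
stack_accuracy = stack.score(X_test, y_test)
print(f"stacked accuracy: {stack_accuracy:.3f}")
```

Training the meta classifier on cross-validated base predictions keeps it from simply learning to trust whichever base learner overfits the training data most.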

