A hybrid approach to medical decision-making: diagnosis of heart disease with machine-learning model

Author(s):  
Tamilarasi Suresh ◽  
Tsehay Admassu Assegie ◽  
Subhashni Rajkumar ◽  
Napa Komal Kumar

Heart disease is one of the most widespread and deadliest diseases in the world. In this study, we propose a hybrid model for heart disease prediction that combines random forest and support vector machine. Random forest performs iterative feature elimination to select the heart disease features that improve the predictive performance of the support vector machine. Experiments on a test set show that the proposed hybrid model performs better than an individual random forest or an individual support vector machine. Overall, we have developed a more accurate and computationally efficient model for heart disease prediction, with an accuracy of 98.3%. We also conducted experiments to analyze the effect of the regularization parameter (C) and gamma on the performance of the support vector machine. The results show that the support vector machine is very sensitive to both C and gamma.
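The sensitivity to gamma can be illustrated with a minimal sketch. It assumes the support vector machine uses an RBF (Gaussian) kernel, which the abstract does not state; the feature vectors and the `rbf_kernel` helper are hypothetical and written only for illustration.

```python
import math

def rbf_kernel(x, y, gamma):
    """RBF (Gaussian) kernel value: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Two hypothetical, already-scaled patient feature vectors (illustrative values only).
patient_a = [0.2, 0.5, 0.1]
patient_b = [0.6, 0.1, 0.4]

# Larger gamma makes the kernel more local: the same pair of points looks less similar.
similarities = {
    gamma: rbf_kernel(patient_a, patient_b, gamma)
    for gamma in (0.01, 0.1, 1.0, 10.0)
}

for gamma, value in similarities.items():
    print(f"gamma={gamma:<5} similarity={value:.4f}")
```

With a small gamma almost every pair of patients looks similar, which tends to give a smooth decision boundary; with a large gamma only near-identical patients look similar, which tends toward overfitting. C plays a complementary role by trading margin width against training errors.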

Today, heart disease affects people of all ages. Despite technological advances, predicting the presence of heart disease remains a challenging problem, largely because symptoms are often not evident. According to the World Health Organization, 33% of the population died due to heart disease. Diagnosis therefore relies on a complex combination of clinical data. With this in mind, we used the Heart Disease Prediction dataset from the UCI Machine Learning Repository to predict the level of heart disease. The prediction of heart disease classes is carried out in four steps. First, the dataset is preprocessed by feature scaling and handling of missing values. Second, the raw dataset is fitted to several classifiers: logistic regression, KNN, Support Vector Machine, Kernel Support Vector Machine, Naive Bayes, Random Forest and Decision Tree. Third, the raw dataset is subjected to dimensionality reduction using Principal Component Analysis (PCA), which projects the data onto its most important components, and the PCA-reduced dataset is fitted to the same classifiers. Fourth, the raw and PCA-reduced datasets are compared using the performance metrics Precision, Recall, Accuracy and F-score. The implementation is done in Python on the Spyder platform with Anaconda Navigator. Experimental results show that Random Forest is the most effective classifier, with an accuracy of 89% without PCA, 85% with five-component PCA and 86% with seven-component PCA.
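The four metrics used in the comparison above can be sketched in plain Python. This is a minimal illustration under the usual binary definitions; the labels below are made up and the `classification_metrics` helper is hypothetical, not taken from the paper.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F-score for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }

# Illustrative labels only: 1 = heart disease present, 0 = absent.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Computing the same dictionary once for a classifier trained on the raw features and once for the same classifier trained on the PCA-reduced features gives the kind of side-by-side comparison the abstract describes.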


2021 ◽  
Author(s):  
Karunakaran Velswamy ◽  
Rajasekar Velswamy ◽  
Iwin Thanakumar Joseph Swamidason

The healthcare field now produces a huge amount of data, and efficient techniques are required to process it. In this paper, a classification model is developed for heart disease prediction, with attribute selection carried out by a modified bee algorithm. Predicting heart disease with such models helps practitioners make precise decisions about patient health. The heart disease dataset is obtained from the UCI repository. It consists of 76 features, which do not all contribute equally to classification: some attributes contribute a large amount of information, while others contribute only a little. The modified bee algorithm is used to identify the best subset of features, so that only the features contributing the most information are retained in the training phase, which reduces the training time of the classifiers. The reduced feature subset is evaluated with the following classifiers: Support Vector Machine, Naive Bayes, Decision Tree and Random Forest. The experimental results show that the Support Vector Machine classifier provides better classification accuracy, true positive rate, true negative rate, false positive rate and false negative rate than the Naive Bayes and Random Forest classifiers.


2018 ◽  
Vol 127 ◽  
pp. 511-520 ◽  
Author(s):  
Yassine Al Amrani ◽  
Mohamed Lazaar ◽  
Kamal Eddine El Kadiri

A heart disease detection model determines a patient's heart disease status for use by medical experts. Existing techniques predict heart disease with an optimal classifier; although they achieve good results, they have some disadvantages. To address these drawbacks, the proposed technique uses a more effective method for heart disease prediction. First, the input data is preprocessed, and the preprocessed result is forwarded to the feature selection stage. Feature selection over the high-dimensional medical data is performed with the Hybrid Fish Bee optimization algorithm (HFSBEE). The proposed algorithm runs the two underlying algorithms in parallel, so that the local search behavior of the artificial bee colony algorithm and the global search of fish swarm optimization are both used to find the optimal solution. Classification is then performed by passing the transformed medical dataset to a multi-kernel support vector machine (MKSVM). The proposed technique is evaluated in terms of accuracy, sensitivity, specificity, precision, recall and F-measure. The test analysis uses several datasets from the UCI machine learning repository, including Cleveland, Hungarian and Switzerland. The experimental results show that the presented technique achieves an accuracy of 97.68% on the Cleveland dataset, compared with 96.03% for the existing hybrid kernel support vector machine (HKSVM) method and 62.25% for the optimal rough fuzzy classifier. The proposed method is implemented on the MATLAB platform.
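The abstract does not say how the MKSVM combines its kernels. One common construction, shown here purely as an assumption, is a convex combination of base kernels; the choice of a linear and an RBF kernel, the `weight` and `gamma` parameters, and the sample values are all hypothetical.

```python
import math

def linear_kernel(x, y):
    """Dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    """RBF (Gaussian) kernel: exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def combined_kernel(x, y, weight=0.5, gamma=0.5):
    """Convex combination of two kernels (assumed form, not the paper's).

    For 0 <= weight <= 1 the result is still a valid kernel, because a
    non-negative weighted sum of valid kernels is itself a valid kernel.
    """
    return weight * linear_kernel(x, y) + (1.0 - weight) * rbf_kernel(x, y, gamma)

def gram_matrix(samples, kernel):
    """Pairwise kernel values between all samples."""
    return [[kernel(a, b) for b in samples] for a in samples]

# Illustrative, already-scaled feature vectors (made-up values).
samples = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]
gram = gram_matrix(samples, combined_kernel)

for row in gram:
    print(["%.4f" % value for value in row])
```

A Gram matrix like this can be handed to SVM implementations that accept precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.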


Author(s):  
Aamir Khan ◽  
Dr. Sanjay Jain

Data mining (DM) is the process of extracting valuable information from raw data. Prediction analysis (PA) is used to predict future possibilities on the basis of current information. This research work focuses on predicting heart disease. Coronary disorders are forecast in several phases: pre-processing, attribute extraction and classification. A hybrid method is introduced based on random forest (RF) and logistic regression (LR). The Random Forest classifier is used to extract the attributes, and the classification itself is carried out using logistic regression. The performance of the introduced system is analyzed with regard to accuracy, precision and recall. The results indicate that the introduced system achieves an accuracy above approximately 90% when predicting heart disease.


2022 ◽  
Vol 19 ◽  
pp. 1-9
Author(s):  
Nikhil Bora ◽  
Sreedevi Gutta ◽  
Ahmad Hadaegh

Heart disease has become one of the leading causes of death worldwide and one of the most life-threatening diseases. Early prediction of heart disease helps reduce the death rate, yet predicting heart disease has become one of the most difficult challenges in the medical sector in recent years. As per recent statistics, about one person dies from heart disease every minute. Healthcare generates a massive amount of data, and data science is critical for analyzing it. This paper proposes heart disease prediction using different machine-learning algorithms such as logistic regression, naïve Bayes, support vector machine, k-nearest neighbor (KNN), random forest and extreme gradient boost. We used these algorithms to predict the likelihood of a person getting heart disease on the basis of features (such as cholesterol, blood pressure, age and sex) extracted from the datasets. In our research we used two separate datasets. The first heart disease dataset was collected from the well-known UCI machine learning repository and has 303 record instances with 14 attributes (13 features and one target). The second dataset was collected from the Kaggle website and contains 1190 patient record instances with 11 features and one target; it is a combination of 5 popular heart disease datasets. This study compares the accuracy of the various machine learning techniques. For the first dataset, the highest accuracy of 92% was obtained by Support Vector Machine (SVM). For the second dataset, Random Forest gave the highest accuracy of 94.12%. We then combined both datasets, for which the highest accuracy of 93.31% was obtained using Random Forest.

