Classification Model for Heart Disease Prediction With Feature Selection Through Modified Bee Algorithm

Author(s):  
Karunakaran Velswamy ◽  
Rajasekar Velswamy ◽  
Iwin Thanakumar Joseph Swamidason

Abstract The healthcare field now produces a huge amount of data, and efficient techniques are required to process it. In this paper, a classification model is developed for heart disease prediction, with attribute selection carried out through a modified bee algorithm. Predicting heart disease with such models helps practitioners make precise decisions about patient health. The heart disease dataset is obtained from the UCI repository. It consists of 76 features, which do not contribute equally to classification: some attributes contribute a large amount of information and others only a small amount. The modified bee algorithm is used to identify the best subset of features in the dataset, so that the training phase retains only the features that contribute the most information, which also reduces the training time of the classifiers. The reduced feature subset is evaluated with the following classifiers: Support Vector Machine, Naive Bayes, Decision Tree and Random Forest. The experimental results show that the Support Vector Machine classifier provides better classification accuracy, true positive rate, true negative rate, false positive rate and false negative rate than the Naive Bayes and Random Forest classifiers.

In today’s world, people are affected by heart disease irrespective of age. Despite technological advances, predicting the presence of heart disease remains a challenging issue, mainly because of the lack of available symptoms. According to the World Health Organization, 33% of the population died due to heart diseases. The diagnosis of heart disease therefore relies on a complex combination of clinical data. With this in view, we use the Heart Disease Prediction dataset from the UCI Machine Learning Repository to predict the level of heart disease. The prediction of heart disease classes is carried out in four steps. First, the dataset is preprocessed with feature scaling and missing-value handling. Second, the raw dataset is fitted to logistic regression, KNN, Support Vector Machine, Kernel Support Vector Machine, Naive Bayes, Random Forest and Decision Tree classifiers. Third, the raw dataset is subjected to dimensionality reduction using Principal Component Analysis (PCA), which projects it onto its most important components, and the PCA-reduced dataset is fitted to the same classifiers. Fourth, the raw and PCA-reduced datasets are compared using Precision, Recall, Accuracy and F-score. The implementation is done in Python on the Spyder platform with Anaconda Navigator. Experimental results show that Random Forest is the most effective classifier, with an accuracy of 89% without PCA, 85% with five-component PCA and 86% with seven-component PCA.
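The four steps described above can be sketched with scikit-learn. This is a minimal illustration, not the authors' code: it assumes scikit-learn is installed, it uses a synthetic dataset from `make_classification` as a stand-in for the UCI data, and the feature count (13), split ratio, and the helper name `evaluate` are hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the heart disease table (sizes are assumptions).
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)


def evaluate(n_components=None):
    """Fit scaling (+ optional PCA) + Random Forest; return test metrics."""
    steps = [StandardScaler()]
    if n_components is not None:
        # Project the scaled features onto the leading principal components.
        steps.append(PCA(n_components=n_components))
    steps.append(RandomForestClassifier(n_estimators=100, random_state=0))
    model = make_pipeline(*steps)
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, pred),
        "f1": f1_score(y_test, pred),
    }


# Compare the raw features against 5- and 7-component PCA projections.
results = {
    "raw": evaluate(),
    "pca_5": evaluate(5),
    "pca_7": evaluate(7),
}
for name, scores in results.items():
    print(name, scores)
```

Placing the scaler and PCA inside the pipeline means both are fitted on the training split only, so the test split does not leak into the projection. The scores printed for synthetic data say nothing about the figures reported in the abstract.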


Author(s):  
Tamilarasi Suresh ◽  
Tsehay Admassu Assegie ◽  
Subhashni Rajkumar ◽  
Napa Komal Kumar

Heart disease is one of the most widespread and deadliest diseases in the world. In this study, we propose a hybrid model for heart disease prediction that combines random forest and support vector machine. Random forest is used for iterative feature elimination, selecting the heart disease features that improve the predictive outcome of the support vector machine. The proposed model is evaluated on a test set, and the experimental results indicate that the hybrid model performs better than an individual random forest or support vector machine. Overall, we have developed a more accurate and computationally efficient model for heart disease prediction, with an accuracy of 98.3%. A further experiment analyzes the effect of the regularization parameter (C) and gamma on the performance of the support vector machine. The results reveal that the support vector machine is very sensitive to C and gamma.
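One way to read the hybrid described above is recursive feature elimination driven by random forest importances, followed by an RBF-kernel SVM trained on the surviving features. The sketch below is an assumption about the general approach, not the authors' implementation: the data is synthetic, and the number of retained features (6) and the grid of C and gamma values are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (sizes are assumptions).
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=5, random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1, stratify=y
)

# Iteratively drop the least important feature according to the forest.
selector = RFE(
    RandomForestClassifier(n_estimators=100, random_state=1),
    n_features_to_select=6,  # hypothetical target size
    step=1,
)
selector.fit(X_train, y_train)

# Scale the selected features; SVMs are sensitive to feature scale.
scaler = StandardScaler().fit(selector.transform(X_train))
X_train_sel = scaler.transform(selector.transform(X_train))
X_test_sel = scaler.transform(selector.transform(X_test))

# Probe sensitivity to C and gamma over a small illustrative grid.
scores = {}
for C in (0.1, 1.0, 10.0):
    for gamma in (0.01, 0.1, 1.0):
        svm = SVC(kernel="rbf", C=C, gamma=gamma).fit(X_train_sel, y_train)
        scores[(C, gamma)] = svm.score(X_test_sel, y_test)

for (C, gamma), acc in sorted(scores.items()):
    print(f"C={C:<5} gamma={gamma:<5} accuracy={acc:.3f}")
```

The grid here only illustrates how accuracy shifts with C and gamma. For actual model selection, the parameters would normally be chosen by cross-validation on the training data (for example with `GridSearchCV`) rather than by scoring on the test split.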


Author(s):  
Tsehay Admassu Assegie

Machine-learning approaches have become widely applicable in disease diagnosis and prediction because of the accuracy and precision of machine learning models. However, different models achieve different accuracy and precision in disease prediction, and selecting the model that gives the best results is an open research problem. In this study, we propose a machine learning model for liver disease prediction using the Support Vector Machine (SVM) and K-Nearest Neighbors (KNN) learning algorithms, and we evaluate the accuracy and precision of both models using the Indian liver disease data repository. The analysis of the results showed 82.90% accuracy for SVM and 72.64% accuracy for KNN. Based on these test results, SVM performs better than KNN on liver disease prediction.


2021 ◽  
Vol 42 (Supplement_1) ◽  
Author(s):  
M J Espinosa Pascual ◽  
P Vaquero Martinez ◽  
V Vaquero Martinez ◽  
J Lopez Pais ◽  
B Izquierdo Coronel ◽  
...  

Abstract Introduction Out of all patients admitted with Myocardial Infarction, 10 to 15% have Myocardial Infarction with Non-Obstructive Coronary Arteries (MINOCA). Classification algorithms based on deep learning substantially exceed traditional diagnostic algorithms. Numerous machine learning models have therefore been proposed as useful tools for the detection of various pathologies, but to date no study has proposed a diagnostic algorithm for MINOCA. Purpose The aim of this study was to estimate the diagnostic accuracy of several machine learning algorithms (Support-Vector Machine [SVM], Random Forest [RF] and Logistic Regression [LR]) in discriminating between people suffering from MINOCA and those with Myocardial Infarction with Obstructive Coronary Artery Disease (MICAD) at the time of admission and before performing a coronary angiography, whether invasive or not. Methods A Diagnostic Test Evaluation study was carried out by applying the proposed algorithms to a database of 553 consecutive patients admitted to our Hospital with Myocardial Infarction. According to the definitions of the 2016 ESC Position Paper on MINOCA, patients were classified into two groups: MICAD and MINOCA. Out of the total 553 patients, 214 were discarded due to the lack of complete data. The set of machine learning algorithms was trained on 244 patients (training sample: 75%) and tested on 80 patients (test sample: 25%). A total of 64 variables were available for each patient, including demographic, clinical and laboratory features recorded before the angiographic procedure. Finally, the diagnostic precision of each architecture was measured.
Results The most accurate classification model was the Random Forest algorithm (Specificity [Sp] 0.88, Sensitivity [Se] 0.57, Negative Predictive Value [NPV] 0.93, Area Under the Curve [AUC] 0.85 [CI 0.83–0.88]), followed by the standard Logistic Regression (Sp 0.76, Se 0.57, NPV 0.92, AUC 0.74) and the Support-Vector Machine (Sp 0.84, Se 0.38, NPV 0.90, AUC 0.78) (see graph). The variables that contributed the most to discriminating MINOCA from MICAD were the traditional cardiovascular risk factors, biomarkers of myocardial injury, hemoglobin and gender. Results were similar when the 19 patients with Takotsubo syndrome were excluded from the analysis. Conclusion A prediction system for diagnosing MINOCA before performing coronary angiographies was developed using machine learning algorithms. Results show higher accuracy in diagnosing MINOCA than conventional statistical methods. This study supports the potential of machine learning algorithms in clinical cardiology. However, further studies are required in order to validate our results. Funding Acknowledgement Type of funding sources: None. Figure: ROC curves of different algorithms.
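The specificity, sensitivity and negative predictive value quoted above all follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name `diagnostic_metrics` and the counts passed to it are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (each class and each
    predicted label occurs at least once).
    """
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "npv": tn / (tn + fn),          # negative predictive value
        "ppv": tp / (tp + fp),          # positive predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = diagnostic_metrics(tp=40, fp=10, tn=90, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A high NPV alongside a modest sensitivity, as in the reported results, is typical when the positive class is the minority: most negative predictions are correct simply because most patients are negative.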


Author(s):  
L. E. Christovam ◽  
G. G. Pessoa ◽  
M. H. Shimabukuro ◽  
M. L. B. T. Galo

Abstract. Land Use and Land Cover (LULC) information is an important data source for modeling environmental variables, so it is essential to develop high quality LULC maps. The hundreds of continuous spectral bands gathered with hyperspectral sensors provide high spectral detail and consequently confirm hyperspectral remote sensing as an appropriate option for many LULC applications. Despite the increased spectral detail, issues such as high dimensionality, huge data volume and redundant information make hyperspectral image classification a complex task. It is therefore essential to develop classification approaches that deal with these issues. Since classification results depend directly on the dataset used, it is fundamental to compare and validate classification approaches on public datasets. With this in mind, and aiming to provide a baseline, four classification models were evaluated on the relatively new hyperspectral HyRANK dataset. The models were defined with three well-known classification algorithms: Spectral Angle Mapper (SAM), Support Vector Machine (SVM) and Random Forest (RF). One model with SAM and another with RF were defined using the 176 surface reflectance bands. A dimensionality reduction with principal component analysis was then carried out, and one model with SVM and another with RF were defined using 14 principal components as features. The results show that the SVM and RF algorithms outperformed SAM by far in terms of accuracy, and that RF is slightly better than SVM in this respect. The results also show that using principal components as features improved the accuracy of the RF and reduced the time spent fitting the classification model by 28%.


A heart disease detection model provides the patient’s heart disease status for use by medical experts. To predict heart disease, the existing technique uses an optimal classifier. Although it achieves good results, it has some disadvantages. To address those drawbacks, the suggested technique uses an effective method for heart disease prediction. First, the input information is preprocessed, and the preprocessed result is forwarded to the feature selection process. For feature selection over the high dimensional medical data, the Hybrid Fish Bee optimization algorithm (HFSBEE) is used. The proposed algorithm runs the two algorithms in parallel, so that the local behavior of the artificial bee colony algorithm and the global search of fish swarm optimization are both used to find the optimal solution. Classification is performed by passing the transformed medical dataset to a Multi kernel support vector machine (MKSVM). The performance of the proposed technique is evaluated using accuracy, sensitivity, specificity, precision, recall and F-measure. The test analysis uses several datasets from the UCI machine learning repository, i.e. Cleveland, Hungarian and Switzerland etc. The experimental outcome shows that the presented technique achieves an accuracy of 97.68% on the Cleveland dataset, compared with 96.03% for the existing hybrid kernel support vector machine (HKSVM) method and 62.25% for the optimal rough fuzzy classifier. The proposed method is implemented on the MATLAB platform.

