scholarly journals Heart Disease Prediction and Performance Assessment through Attribute Element Diminution using Machine Learning

In today’s modern world, the human beings are affected with heart disease irrespective of the age. With the advancement of technological growth, predicting the availability of Heart diseases still remains a challenging issue. The difficulty of predicting the heart disease prevails due to the lack of availability of the symptoms. According to World Health Organization, 33% of population died due to heart diseases. For this, the diagnosis of heart diseases is made by complex combination of clinical data. With this overview, we have used Heart Disease Prediction dataset extracted from UCI Machine Learning Repository for predicting the level of heart disease. The prediction of heart disease classes are achieved in four ways. Firstly, the data set is preprocessed with Feature Scaling and Missing Values. Secondly, the raw data set is fitted to classifiers like logistic regression, KNN classifier, Support Vector Machine, Kernel Support Vector Machine, Naive Bayes, Random Forest and Decision Tree classifiers. Third, the raw data set is subjected to dimensionality reduction using Principal Component Analysis to project the dataset with important components. The dimensionality PCA reduced data set is fitted to the above-mentioned classifiers. Fourth, the performance comparison of raw data set and PCA reduced data set is done by analyzing the performance metrics like Precision, Recall, Accuracy and F-score. The implementation is done using python language under Spyder platform with Anaconda Navigator. Experimental results shows that Random forest is found to be effective with the accuracy of 89% without applying PCA, 85% with five component PCA and 86% with seven component PCA.

2017 ◽  
Vol 10 (3) ◽  
pp. 683-690 ◽  
Author(s):  
Kamalpreet Kaur ◽  
O.P. Guptata

Maturity checking has become mandatory for the food industries as well as for the farmers so as to ensure that the fruits and vegetables are not diseased and are ripe. However, manual inspection leads to human error, unripe fruits and vegetables may decrease the production [3]. Thus, this study proposes a Tomato Classification system for determining maturity stages of tomato through Machine Learning which involves training of different algorithms like Decision Tree, Logistic Regression, Gradient Boosting, Random Forest, Support Vector Machine, K-NN and XG Boost. This system consists of image collection, feature extraction and training the classifiers on 80% of the total data. Rest 20% of the total data is used for the testing purpose. It is concluded from the results that the performance of the classifier depends on the size and kind of features extracted from the data set. The results are obtained in the form of Learning Curve, Confusion Matrix and Accuracy Score. It is observed that out of seven classifiers, Random Forest is successful with 92.49% accuracy due to its high capability of handling large set of data. Support Vector Machine has shown the least accuracy due to its inability to train large data set.


Author(s):  
Tamilarasi Suresh ◽  
Tsehay Admassu Assegie ◽  
Subhashni Rajkumar ◽  
Napa Komal Kumar

Heart disease is one of the most widely spreading and deadliest diseases across the world. In this study, we have proposed hybrid model for heart disease prediction by employing random forest and support vector machine. With random forest, iterative feature elimination is carried out to select heart disease features that improves predictive outcome of support vector machine for heart disease prediction. Experiment is conducted on the proposed model using test set and the experimental result evidently appears to prove that the performance of the proposed hybrid model is better as compared to an individual random forest and support vector machine. Overall, we have developed more accurate and computationally efficient model for heart disease prediction with accuracy of 98.3%. Moreover, experiment is conducted to analyze the effect of regularization parameter (C) and gamma on the performance of support vector machine. The experimental result evidently reveals that support vector machine is very sensitive to C and gamma.


2022 ◽  
Vol 19 ◽  
pp. 1-9
Author(s):  
Nikhil Bora ◽  
Sreedevi Gutta ◽  
Ahmad Hadaegh

Heart Disease has become one of the most leading cause of the death on the planet and it has become most life-threatening disease. The early prediction of the heart disease will help in reducing death rate. Predicting Heart Disease has become one of the most difficult challenges in the medical sector in recent years. As per recent statistics, about one person dies from heart disease every minute. In the realm of healthcare, a massive amount of data was discovered for which the data-science is critical for analyzing this massive amount of data. This paper proposes heart disease prediction using different machine-learning algorithms like logistic regression, naïve bayes, support vector machine, k nearest neighbor (KNN), random forest, extreme gradient boost, etc. These machine learning algorithm techniques we used to predict likelihood of person getting heart disease on the basis of features (such as cholesterol, blood pressure, age, sex, etc. which were extracted from the datasets. In our research we used two separate datasets. The first heart disease dataset we used was collected from very famous UCI machine learning repository which has 303 record instances with 14 different attributes (13 features and one target) and the second dataset that we used was collected from Kaggle website which contained 1190 patient’s record instances with 11 features and one target. This dataset is a combination of 5 popular datasets for heart disease. This study compares the accuracy of various machine learning techniques. In our research, for the first dataset we got the highest accuracy of 92% by Support Vector Machine (SVM). And for the second dataset, Random Forest gave us the highest accuracy of 94.12%. Then, we combined both the datasets which we used in our research for which we got the highest accuracy of 93.31% using Random Forest.


In the growing era of technological world, the people are suffered with various diseases. The common disease faced by the population irrespective of the age is the heart disease. Though the world is blooming in technological aspects, the prediction and the identification of the heart disease still remains a challenging issue. Due to the deficiency of the availability of patient symptoms, the prediction of heart disease is a disputed charge. With this overview, we have used Heart Disease Prediction dataset extorted from UCI Machine Learning Repository for the analysis and comparison of various parameters in the classification algorithms. The parameter analysis of various classification algorithms of heart disease classes are done in five ways. Firstly, the analysis of dataset is done by exploiting the correlation matrix, feature importance analysis, Target distribution of the dataset and Disease probability based on the density distribution of age and sex. Secondly, the dataset is fitted to K-Nearest Neighbor classifier to analyze the performance for the various combinations of neighbors with and without PCA. Thirdly, the dataset is fitted to Support Vector classifier to analyze the performance for the various combinations of kernels with and without PCA. Fourth, the dataset is fitted to Decision Tree classifier to analyze the performance for the various combinations of features with and without PCA. Fifth, the dataset is fitted to Random Forest classifier to analyze the performance for the various levels of estimators with and without PCA. The implementation is done using python language under Spyder platform with Anaconda Navigator. Experimental results shows that for KNN classifier, the performance for 12 neighbours is found to be effective with 0.52 before applying PCA and 0.53 after applying PCA. For Support Vector classifier, the rbf kernel is found to be effective with the score of 0.519 with and without PCA. For Decision Tree classifier, before applying PCA, the score is 0.47 for 7 features and after applying PCA, the score is 0.49 for 4 features. For, Random Forest Classifier, before applying PCA, the score is 0.53 for 500 estimators and after applying PCA, the score is 0.52 for 500 estimators.


Healthcare has become one of the most important concerns in the world. The cases of heart disease are increasing on a rapid scale among the people especially among the young generation. We can save the lives of the people if we could detect the heart disease on/before time, by getting them treated. In this matter artificial intelligence can be of a great help. Here we have collected a data set and then we have built a prediction model to detect heart disease based on the various algorithms that are available for machine learning.we have used Logistic regression, K-NN, SVM, Decision Tree, Random Forest with the accuracy values of K-Neighbors Classifier (0.956194%), Support Vector Machine (0.9561945%), Decision Tree (0.91050%), Random Forest Classifier (0.95404%) and Logistic Regression (0.95592%). The best value given by the Machine Learning model is by Logistic regression followed by K-NN.


2021 ◽  
Author(s):  
Karunakaran Velswamy ◽  
Rajasekar Velswamy ◽  
Iwin Thanakumar Joseph Swamidason

Abstract Now-a-days a healthcare field produces a huge amount of data, for processing those data some efficient techniques are required. In this paper, a classification model is developed for heart disease prediction and the attribute selection is carried out through a modified bee algorithm. The prediction of heart disease through models will help the practitioners to make a precise decision about patient health. Heart disease dataset is obtained from the UCI repository. Dataset consists of 76 features and all those seventy-six features have not contributed equal information during the time classification. In the entire attributes, some of the attributes have contributed a large amount of information at the time of classification and some of the attributes have contributed only a small amount of information during the classification task. In this paper, a modified bee algorithm is used to identify the best subset of features from the entire features in the dataset i.e., in the training phase of classification only retain those features that are contributing more information during classification and it will reduce the training time of classifiers. The experiment is analyzed with a obtained reduced subset of features by using the following classifiers such as Support Vector Machine, Navie bayes, Decision tree and Random forest. The experimental result shows that the Support Vector Machine classifier will provide a good classification accuracy, true positive rate, true negative rate, false positive rate and false negative rate compared to Navie bayes and Random forest tree classifier.


Author(s):  
Sarangam Kodati ◽  
Jeeva Selvaraj

Data mining is the most famous knowledge extraction approach for knowledge discovery from data (KDD). Machine learning is used to enable a program to analyze data, recognize correlations, and make usage on insights to solve issues and/or enrich data and because of prediction. The chapter highlights the need for more research within the usage of robust data mining methods in imitation of help healthcare specialists between the diagnosis regarding heart diseases and other debilitating disease conditions. Heart disease is the primary reason of death of people in the world. Nearly 47% of death is caused by heart disease. The authors use algorithms including random forest, naïve Bayes, support vector machine to analyze heart disease. Accuracy on the prediction stage is high when using a greater number of attributes. The goal is to function predictive evaluation using data mining, using data mining to analyze heart disease, and show which methods are effective and efficient.


2021 ◽  
Vol 36 (1) ◽  
pp. 616-622
Author(s):  
P. Harish ◽  
Dr.R. Sabitha

Aim: The objective of the work is to evaluate the accuracy and precision in predicting the heart disease using Support Vector Machine (SVM) and Random Forest (RF) classification algorithms. Materials and Methods: Random Forest Classifier is applied on a Health dataset that consists of 304 records. A framework for heart disease prediction in the medical sector comparing Random Forest and SVM classifiers has been proposed and developed. The sample size was measured as 21 per group. The accuracy and the precision of the classifiers was evaluated and recorded. Results: The SVM classifier produces 53.04% in predicting the heart disease on the data set used whereas the Random forest classifier predicts the same at the rate of 83.2%. The significant value is 0.0. Hence RF is better than SVM. Conclusion: The performance of Random forest is better compared with SVM in terms of both precision and accuracy.


Author(s):  
Ramesh Ponnala ◽  
K. Sai Sowjanya

Prediction of Cardiovascular ailment is an important task inside the vicinity of clinical facts evaluation. Machine learning knowledge of has been proven to be effective in helping in making selections and predicting from the huge amount of facts produced by using the healthcare enterprise. on this paper, we advocate a unique technique that pursuits via finding good sized functions by means of applying ML strategies ensuing in improving the accuracy inside the prediction of heart ailment. The severity of the heart disease is classified primarily based on diverse methods like KNN, choice timber and so on. The prediction version is added with special combos of capabilities and several known classification techniques. We produce a stronger performance level with an accuracy level of a 100% through the prediction version for heart ailment with the Hybrid Random forest area with a linear model (HRFLM).


Sign in / Sign up

Export Citation Format

Share Document