Developing a Hyperparameter Tuning Based Machine Learning Approach of Heart Disease Prediction

2020 ◽  
Vol 7 (2) ◽  
pp. 631-647
Author(s):  
Emrana Kabir Hashi ◽  
Md. Shahid Uz Zaman

Machine learning techniques are widely used in the healthcare sector to predict fatal diseases. The objective of this research was to develop a proposed system for predicting heart disease and to compare its performance with that of the traditional system, implementing the logistic regression, K-nearest neighbor, support vector machine, decision tree, and random forest classification models. The proposed system tunes the hyperparameters of the five classification algorithms using the grid search approach. The performance of the heart disease prediction system is the major research issue, and hyperparameter tuning can be used to enhance the performance of the prediction models. The traditional and proposed systems were evaluated and compared in terms of accuracy, precision, recall, and F1 score. Whereas the traditional system achieved accuracies between 81.97% and 90.16%, the proposed hyperparameter tuning model achieved accuracies between 85.25% and 91.80%. These evaluations demonstrated that the proposed approach is capable of achieving more accurate results than the traditional approach in predicting heart disease while maintaining feasible performance.
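The grid search idea in the abstract above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the toy k-nearest-neighbor classifier, the eight-point dataset, the grid over `k` and `metric`, and the use of leave-one-out accuracy as the score are all assumptions made for the example.

```python
from itertools import product

# Toy data, invented for illustration: (feature vector, label).
DATA = [
    ((1.0, 1.0), 0), ((1.2, 0.8), 0), ((0.9, 1.3), 0), ((1.1, 1.1), 0),
    ((3.0, 3.1), 1), ((3.2, 2.9), 1), ((2.8, 3.3), 1), ((3.1, 3.0), 1),
]

def distance(a, b, metric):
    """Manhattan or Euclidean distance between two points."""
    if metric == "manhattan":
        return sum(abs(x - y) for x, y in zip(a, b))
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def knn_predict(train, point, k, metric):
    """Majority label among the k training points nearest to `point`."""
    nearest = sorted(train, key=lambda row: distance(row[0], point, metric))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

def loo_accuracy(data, k, metric):
    """Leave-one-out accuracy: hold out each point once and predict it."""
    hits = 0
    for i, (point, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        hits += knn_predict(train, point, k, metric) == label
    return hits / len(data)

def grid_search(data, grid):
    """Score every combination in the grid; return the best one and all scores."""
    names = sorted(grid)
    scores = {}
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        scores[values] = loo_accuracy(data, **params)
    best_values = max(scores, key=scores.get)
    return dict(zip(names, best_values)), scores

# Hypothetical grid: every (k, metric) pair is tried exhaustively.
best_params, all_scores = grid_search(
    DATA, {"k": [1, 3, 5, 7], "metric": ["euclidean", "manhattan"]}
)
print(best_params)
```

On this toy data, `k = 7` uses every remaining training point, so the held-out point's own class is always outvoted; that is the kind of poor setting an exhaustive search is meant to rule out.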

This work derives methodologies to detect heart problems at an early stage and to inform patients so that they can improve their health. To address this problem, we use machine learning techniques to predict the incidence of heart disease at an early stage. For prediction we use parameters such as age, sex, height, weight, case history, smoking, and alcohol consumption, along with tests such as blood pressure, cholesterol, diabetes, ECG, and ECHO. Many machine learning algorithms can be used to solve this problem, including K-Nearest Neighbour, support vector classifier, decision tree classifier, logistic regression, and Random Forest classifier. Using these parameters and algorithms, we predict whether or not the patient has heart disease and recommend how the patient can improve his or her health.


2021 ◽  
Vol 1 (4) ◽  
pp. 268-280
Author(s):  
Bamanga Mahmud Ahmad ◽  
Ahmadu Asabe Sandra ◽  
Musa Yusuf Malgwi ◽  
Dahiru I. Sajoh

Machine learning techniques are commonly used in clinical decision support systems for the identification and prediction of different diseases. Heart disease is the leading cause of death for both men and women around the world, and the heart is one of the essential organs of the human body, so heart disease is one of the most critical concerns in the medical domain. Several researchers have developed intelligent medical devices to support these systems and to enhance the ability to diagnose and predict heart disease. However, few studies look at the capabilities of ensemble methods in developing a heart disease detection and prediction model. In this study, the researchers assessed how to use an ensemble model, which offers more stable performance than a single base learning algorithm and leads to better results than other heart disease prediction models. Patient heart disease records were extracted from the University of California, Irvine (UCI) Machine Learning Repository. To achieve the aim of this study, the researchers developed a meta-algorithm. According to the experimental results, the ensemble model is a superior solution in terms of predictive accuracy and reliability of diagnostic output. An ensemble heart disease prediction model is also presented in this work as a valuable, cost-effective, and timely predictive option with a user-friendly graphical user interface that is scalable and expandable. From the findings, the researchers suggest that Bagging, which achieved a high prediction probability score, is the best ensemble classifier to adopt as the extended algorithm for heart disease prediction.
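As a rough illustration of the Bagging idea named above (bootstrap resampling followed by a majority vote), the sketch below trains one-threshold decision stumps on resampled copies of a toy dataset. The data, the stump learner, the number of estimators, and the function names are assumptions for the example; the abstract does not describe the study's actual base learners or settings.

```python
import random

# Toy 1-D data, invented for illustration: (feature value, label).
DATA = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
        (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]

def fit_stump(sample):
    """Pick the threshold t (taken from the sample) for which the rule
    'predict 1 if x > t' classifies the most sample points correctly."""
    best_t, best_hits = None, -1
    for t, _ in sample:
        hits = sum((x > t) == bool(y) for x, y in sample)
        if hits > best_hits:
            best_t, best_hits = t, hits
    return best_t

def fit_bagging(data, n_estimators, seed=0):
    """Train one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_estimators):
        sample = [rng.choice(data) for _ in data]
        stumps.append(fit_stump(sample))
    return stumps

def predict_bagging(stumps, x):
    """Majority vote over all stumps."""
    votes = sum(x > t for t in stumps)
    return int(votes * 2 > len(stumps))

stumps = fit_bagging(DATA, n_estimators=11)
print([predict_bagging(stumps, x) for x in (0.0, 2.5, 10.0)])
```

An odd number of estimators avoids tied votes; averaging over many resampled learners is what gives the ensemble its more stable behaviour compared with a single stump.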


2020 ◽  
Vol 11 (2) ◽  
pp. 20-40
Author(s):  
Somya Goyal ◽  
Pradeep Kumar Bhatia

Software quality prediction is one of the most challenging tasks in the development and maintenance of software. Machine learning (ML) is widely being incorporated for the prediction of the quality of a final product in the early development stages of the software development life cycle (SDLC). An ML prediction model uses software metrics and fault data from previous projects to detect high-risk modules for future projects, so that testing efforts can be targeted at those specific 'risky' modules. Hence, ML-based predictors help detect development anomalies early and inexpensively and help ensure the timely delivery of a successful, failure-free, high-quality software product within budget. This article compares 30 software quality prediction models (5 techniques × 6 datasets) built on five ML techniques: artificial neural networks (ANNs), support vector machines (SVMs), decision trees (DTs), k-nearest neighbor (KNN), and Naïve Bayes classifiers (NBC), using six datasets: CM1, KC1, KC2, PC1, JM1, and a combined one. These models exploit the predictive power of static code metrics, namely McCabe complexity metrics, for quality prediction. All thirty predictors are compared using the receiver operating characteristic (ROC) curve, the area under the curve (AUC), and accuracy as performance evaluation criteria. The results show that the ANN technique is promising for accurate software quality prediction irrespective of the dataset used.
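For readers unfamiliar with the AUC criterion mentioned above, it can be computed directly as the fraction of (positive, negative) pairs that a model ranks correctly, with ties counted as half. The sketch below uses that pairwise definition; the labels and scores are invented and do not come from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking definition:
    the share of (positive, negative) pairs in which the positive
    example receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical example: 1 = faulty module, 0 = clean module.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 3 of the 4 pairs are ranked correctly: 0.75
```

Unlike accuracy, this measure does not depend on a decision threshold, which is one reason it is commonly reported alongside accuracy when classes are imbalanced.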


Author(s):  
Rony Chowdhury Ripan ◽  
Iqbal H. Sarker ◽  
Md. Hasan Furhad ◽  
Md Musfique Anwar ◽  
Mohammed Moshiul Hoque

This paper presents an effective heart disease prediction model that detects anomalies, also known as outliers, in healthcare data using the unsupervised K-means clustering algorithm. Most existing approaches for detecting anomalies are based on constructing profiles of normal instances. However, such techniques require an adequate number of normal profiles to justify those models. Our proposed model first determines an optimal value of K using the Silhouette method. Next, it locates anomalies that lie beyond a certain threshold distance from their clusters. Finally, five popular classification techniques, K-Nearest Neighbor (KNN), Random Forest (RF), Support Vector Machines (SVM), Naive Bayes (NB), and Logistic Regression (LR), are applied to build the resulting prediction model. The effectiveness of the proposed methodology is demonstrated using a benchmark heart disease dataset.
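The distance-threshold step described above can be sketched as follows. This is a simplified illustration under stated assumptions: K is fixed at 2 rather than chosen with the Silhouette method, the initial centroids are picked by hand, and the points and the threshold of 1.5 are invented for the example.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's algorithm starting from the given centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids

def find_outliers(points, centroids, threshold):
    """Points whose distance to the nearest centroid exceeds the threshold."""
    return [p for p in points
            if min(math.dist(p, c) for c in centroids) > threshold]

# Toy data: two tight groups plus one point far from both.
POINTS = [
    (1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1), (5.1, 5.2),
    (1.0, 4.0),
]

centroids = kmeans(POINTS, [POINTS[0], POINTS[4]])
outliers = find_outliers(POINTS, centroids, threshold=1.5)
print(outliers)
```

In a pipeline like the one the abstract outlines, the flagged points would be removed before the classifiers are trained; how the threshold is chosen is not stated in the abstract and is left open here.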


2022 ◽  
Vol 19 ◽  
pp. 1-9
Author(s):  
Nikhil Bora ◽  
Sreedevi Gutta ◽  
Ahmad Hadaegh

Heart disease has become one of the leading causes of death worldwide and one of the most life-threatening diseases. Early prediction of heart disease helps reduce the death rate, yet predicting heart disease has become one of the most difficult challenges in the medical sector in recent years. As per recent statistics, about one person dies from heart disease every minute. Healthcare generates massive amounts of data, and data science is critical for analyzing them. This paper proposes heart disease prediction using different machine learning algorithms such as logistic regression, naïve Bayes, support vector machine, k-nearest neighbor (KNN), random forest, and extreme gradient boosting. We used these algorithms to predict the likelihood of a person developing heart disease on the basis of features (such as cholesterol, blood pressure, age, and sex) extracted from the datasets. In our research we used two separate datasets. The first heart disease dataset was collected from the well-known UCI Machine Learning Repository and has 303 record instances with 14 attributes (13 features and one target). The second dataset was collected from the Kaggle website and contains 1190 patient record instances with 11 features and one target; it is a combination of 5 popular heart disease datasets. This study compares the accuracy of various machine learning techniques. For the first dataset, the Support Vector Machine (SVM) gave the highest accuracy, 92%. For the second dataset, Random Forest gave the highest accuracy, 94.12%. We then combined the two datasets and obtained the highest accuracy, 93.31%, using Random Forest.


In medical science, heart disease is considered a fatal problem that claims lives every second. In heart disease, the heart typically stops supplying blood to other parts of the body, so the body can no longer function properly. Timely and accurate prediction of heart disease is therefore an important concern in the medical science domain. Diagnosing heart patients on the basis of previous medical history alone is not considered reliable in many respects. Machine learning techniques, however, are able to classify heart disease data efficiently and effectively and provide reliable solutions. In the past, various machine learning tools and techniques have been adopted for heart disease prediction. In this study, hybrid ensemble classification techniques, namely bagging, boosting, the Random Subspace Method (RSM), and Random Under Sampling (RUS) boost, are proposed, and their performance is compared with that of simple base classification techniques: decision tree, logistic regression, Naive Bayes, Support Vector Machine, k-Nearest Neighbor (KNN), Bayes Net (BN), and Multi-Layer Perceptron (MLP). A heart disease dataset from Kaggle containing 305 samples and the Matlab R2017a machine learning tool are used for performance evaluation. The experimental results show that the hybrid ensemble classification methods outperform the simple base classification methods in terms of accuracy.


The heart is more important to the human body than any other circulatory organ. Its function is to pump blood to the brain and the other organs. It is therefore very important to have a healthy heart, but research has shown that the risk of heart failure increases every day from the age of 30. Many heart specialists can diagnose heart disease using their experience and skills. However, some experts lack the skill or knowledge to predict cardiovascular disease in its early stages, and a small mistake can cost a patient's life. It is therefore necessary to use specific methods and algorithmic tools to estimate the occurrence of cardiac disorders in the early stages. Different machine learning and data analysis algorithms are beneficial in predicting various diseases from patient data managed by medical centers or hospitals. The data obtained may also help to assess the presence of the disease in the future. Heart disease and other cardiac issues can be analyzed with a variety of machine learning techniques, for instance Artificial Neural Network, Decision Tree, Random Forest, K-nearest neighbor, Naïve Bayes, and Support Vector Machine. This study establishes a theoretical understanding of these algorithms and provides a general overview of existing work.


2021 ◽  
Vol 10 (1) ◽  
pp. 46
Author(s):  
Maria Yousef ◽  
Prof. Khaled Batiha

Heart disease has become one of the major health problems affecting people around the world, and deaths due to heart disease are increasing day by day. Heart disease prediction systems therefore play an important role in the prevention of heart problems by helping doctors make the right diagnostic decisions. Existing prediction systems suffer from the high dimensionality of the selected features: many redundant or irrelevant features increase prediction time and decrease prediction accuracy. This paper therefore aims to address the dimensionality problem by proposing a new mixed model for heart disease prediction (NB-SKDR) based on the Naïve Bayes algorithm (NB) and several machine learning classifiers: Support Vector Machine, K-Nearest Neighbors, Decision Tree, and Random Forest. The prediction model consists of three main phases: preprocessing, feature selection, and classification. The main objective of the proposed model is to improve the performance of the prediction system and to find the best subset of features. The proposed approach uses the Naïve Bayes technique, based on Bayes' theorem, to select the best subset of features for the subsequent classification phase and to handle the high dimensionality problem by discarding unnecessary features and keeping only the important ones, in an attempt to improve the efficiency and accuracy of the classifiers. By determining the dependency between attributes, this method reduces the number of features from 13 to 6: age, gender, blood pressure, fasting blood sugar, cholesterol, and exercise-induced angina.
Dependent attributes are those in which one attribute depends on another in deciding the value of the class attribute. The dependency between attributes is measured by the conditional probability, which can be easily computed using Bayes' theorem. In the classification phase, the proposed system uses four classification algorithms, Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN), to predict whether a patient has heart disease. The model is trained and evaluated using the Cleveland Heart Disease database, which contains 13 features and 303 samples. Different algorithms use different rules to produce different representations of knowledge, so the algorithms used to build our model are selected on the basis of their performance. In this work, we applied and compared these classification algorithms (DT, SVM, RF, and KNN) to identify the one best suited to achieving high accuracy in the prediction of heart disease. After the Naive Bayes method is combined with each of these classifiers, the performance of the combined algorithms is evaluated using specificity, sensitivity, and accuracy. The experimental results show that, of the four classification models, the combination of the Naive Bayes feature selection approach and the SVM classifier with an RBF kernel predicts heart disease with the highest accuracy, 98%. Finally, the proposed approach is compared with two other systems that use different approaches in the feature selection step: the first is based on the Genetic Algorithm (GA) technique, and the second uses Principal Component Analysis (PCA). The comparison showed that the Naive Bayes selection approach of the proposed system is better than the GA and PCA approaches in terms of prediction accuracy.
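The three evaluation metrics named above follow directly from the counts of a binary confusion matrix. The sketch below shows the standard definitions; the label vectors are invented for illustration and are unrelated to the reported results.

```python
def evaluate(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for binary labels (1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # sick, predicted sick
    tn = sum(t == 0 and p == 0 for t, p in pairs)  # healthy, predicted healthy
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # healthy, predicted sick
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # sick, predicted healthy

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    return {
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "accuracy": ratio(tp + tn, len(pairs)),
    }

# Hypothetical predictions for ten patients.
Y_TRUE = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
Y_PRED = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

result = evaluate(Y_TRUE, Y_PRED)
print(result)
```

Here one patient with the disease is missed and one healthy patient is flagged, giving a sensitivity of 3/4, a specificity of 5/6, and an accuracy of 8/10.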


Recent advances in technology allow tasks to be automated using machine learning techniques. These techniques can also be used to detect or predict heart disease at an early phase. The healthcare industry produces a huge amount of unstructured data that cannot be understood by a machine. With the development of modern technology, the healthcare industry is also managing data in a structured manner that machine learning technology can process. In this environment, using machine learning algorithms for heart disease prediction offers a chance to detect heart disease at an early phase and to alert the patient to seek better treatment. This paper implements seven supervised learning algorithms for heart disease prediction: KNN, Decision Tree, Naive Bayes, Logistic Regression, Random Forest, Support Vector Machine, and Neural Networks. It reports performance metrics, namely accuracy, precision, recall, F-score, and ROC values, to show how accurately the system predicts. Among the seven algorithms, Neural Networks gave the best accuracy, 92.30%, and the experimental results show how accurate the model is for heart disease prediction.


Author(s):  
Anantvir Singh Romana

Accurate diagnostic detection of disease in a patient is critical; it may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently being used in various classification problems because of their accurate prediction performance. Different techniques may provide different accuracies, and it is therefore imperative to use the most suitable method, the one that provides the best results. This research seeks to provide a comparative analysis of Support Vector Machine, Naïve Bayes, J48 Decision Tree, and neural network classifiers on breast cancer and diabetes datasets.

