Ensembling Coalesce of Logistic Regression Classifier for Heart Disease Prediction using Machine Learning

Heart disease affects a large share of the world's population. Despite advances in knowledge and applications, analyzing and identifying heart disease remains a challenging problem. Because patient symptoms are often not available or not recognized, predicting heart disease is difficult. The World Health Organization has reported that 33% of the population died of heart disease. Against this background, we use the Heart Disease Prediction dataset from the UCI Machine Learning Repository to analyze and predict heart disease by integrating ensembling methods. The prediction of heart disease classes is carried out in four steps. First, the important features are extracted with several ensembling methods: Extra Trees regressor, AdaBoost regressor, Gradient Boosting regressor, Random Forest regressor and AdaBoost classifier. Second, the most important features identified by each ensembling method are filtered from the dataset and fitted to a logistic regression classifier to analyze the performance. Third, the same important features from each ensembling method are subjected to feature scaling and then fitted with logistic regression to analyze the performance. Fourth, the performance is analyzed with the metrics Mean Squared Error (MSE), Mean Absolute Error (MAE), R2 Score, Explained Variance Score (EVS) and Mean Squared Log Error (MSLE). The implementation is done in Python in the Spyder environment with Anaconda Navigator. Experimental results show that, before feature scaling, the feature importance extracted from the AdaBoost classifier is the most effective, with an MSE of 0.04, MAE of 0.07, R2 Score of 92%, EVS of 0.86 and MSLE of 0.16, compared to the other ensembling methods. After feature scaling, the feature importance extracted from the AdaBoost classifier is again the most effective, with an MSE of 0.09, MAE of 0.13, R2 Score of 91%, EVS of 0.93 and MSLE of 0.18, compared to the other ensembling methods.
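The five metrics named in this abstract have standard definitions, which the following minimal sketch spells out in plain Python. It is not the authors' code: the function name `regression_metrics` and the toy labels are assumptions made for illustration, and the formulas follow the usual textbook definitions (EVS as one minus the variance of the residuals over the variance of the targets, MSLE on `log(1 + y)`).

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, R2, EVS and MSLE for two equal-length sequences.

    Assumes y_true is not constant (otherwise R2 and EVS are undefined)
    and that all values are >= 0 (required by MSLE).
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    r2 = 1.0 - ss_res / ss_tot

    mean_err = sum(errors) / n
    var_err = sum((e - mean_err) ** 2 for e in errors) / n
    evs = 1.0 - var_err / (ss_tot / n)

    msle = sum((math.log1p(t) - math.log1p(p)) ** 2
               for t, p in zip(y_true, y_pred)) / n

    return {"mse": mse, "mae": mae, "r2": r2, "evs": evs, "msle": msle}

# Toy example (hypothetical labels, not data from the paper).
metrics = regression_metrics([0, 1, 1, 0], [0, 1, 0, 0])
```

R2 and EVS differ only in that EVS ignores a constant bias in the predictions, which is why the two can disagree on the same model.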

Author(s):  
Aadar Pandita

Heart disease has been one of the leading causes of death for quite some time now. About 31% of all deaths in the world every year result from cardiovascular diseases [1]. A majority of patients remain unaware of their symptoms until quite late, while others find it difficult to minimise the effects of the risk factors that cause heart disease. Machine learning algorithms have been quite effective at producing results with a high level of correctness, thereby preventing the onset of heart disease in many patients and reducing its impact in those already affected. They have helped medical researchers and doctors all over the world recognise patterns in patients, resulting in early detection of heart disease.


Predicting client behavior and feedback remains a challenging task for manufacturing companies. Companies struggle to increase their profit and annual turnover because they cannot predict customer likes and dislikes accurately. This has led to the use of machine learning algorithms for predicting customer demand. This paper attempts to identify the important features of the wine dataset from the UCI Machine Learning Repository for the prediction of customer segment. The important features are extracted with several ensembling methods: AdaBoost regressor, AdaBoost classifier, Random Forest regressor, Extra Trees regressor and Gradient Boosting regressor. The features selected by each ensembling method are then fitted with logistic regression to analyze the performance. The same features from each ensembling method are also subjected to feature scaling and then fitted with logistic regression to analyze the performance. The performance is analyzed with the metrics Mean Squared Error (MSE), Mean Absolute Error (MAE), R2 Score, Explained Variance Score (EVS) and Mean Squared Log Error (MSLE). Experimental results show that, after feature scaling, the feature importance extracted from the Extra Trees regressor is the most effective, with an MSE of 0.04, MAE of 0.03, R2 Score of 94%, EVS of 0.9 and MSLE of 0.01, compared to the other ensembling methods.
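The two preprocessing steps described here, keeping the highest-importance features and then scaling them, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the importance scores are made-up numbers standing in for what an ensemble model would report, the helper names `select_top_k` and `standardize` are hypothetical, and z-score standardization is assumed because the abstract does not say which scaling method was used.

```python
import math

def select_top_k(rows, importances, k):
    """Keep the k columns with the largest importance scores.

    Returns the kept column indices (in original column order) and the
    filtered rows.
    """
    ranked = sorted(range(len(importances)),
                    key=lambda i: importances[i], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[i] for i in keep] for row in rows]

def standardize(rows):
    """Z-score each column: subtract the mean, divide by the standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n)
            for c, m in zip(cols, means)]
    # A constant column has zero spread; map it to 0.0 instead of dividing by zero.
    return [[(v - m) / s if s > 0 else 0.0
             for v, m, s in zip(row, means, stds)]
            for row in rows]

# Hypothetical data: three samples, three features, made-up importances.
rows = [[5.0, 1.0, 100.0], [5.0, 2.0, 200.0], [5.0, 3.0, 300.0]]
importances = [0.1, 0.6, 0.3]

keep, filtered = select_top_k(rows, importances, k=2)
scaled = standardize(filtered)
```

After scaling, both kept columns have mean 0 and unit variance, so a feature measured in hundreds no longer dominates one measured in single digits when the logistic regression is fitted.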


2021 ◽  
Vol 1 (4) ◽  
pp. 268-280
Author(s):  
Bamanga Mahmud Ahmad ◽  
Ahmadu Asabe Sandra ◽  
Musa Yusuf Malgwi ◽  
Dahiru I. Sajoh

Machine learning techniques are commonly used in clinical decision support systems for the identification and prediction of different diseases. Heart disease is the leading cause of death for both men and women around the world, and the heart is one of the essential parts of the human body. It is therefore one of the most critical concerns in the medical domain, and several researchers have developed intelligent medical devices to support these systems and to further enhance the ability to diagnose and predict heart disease. However, few studies look at the capabilities of ensemble methods in developing a heart disease detection and prediction model. In this study, the researchers assess how to use an ensemble model, which offers more stable performance than a single base learning algorithm and leads to better results than other heart disease prediction models. Patient heart disease records were extracted from the University of California, Irvine (UCI) Machine Learning Repository. To achieve the aim of the study, the researchers developed a meta-algorithm. According to the experimental results, the ensemble model is a superior solution in terms of predictive accuracy and reliability of the diagnostic output. The work also presents an ensemble heart disease prediction model as a valuable, cost-effective and timely predictive option, with a user-friendly graphical user interface that is scalable and expandable. From the findings, the researchers suggest that Bagging is the best ensemble classifier to adopt as the extended algorithm, as it has the highest prediction probability score in the implementation of heart disease prediction.
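Bagging (bootstrap aggregating) trains each base learner on a bootstrap resample of the training data and combines their predictions by majority vote. The sketch below illustrates the idea only; it is not the meta-algorithm developed in the study. The base learner is assumed to be a one-feature threshold rule (a decision stump), and the names `fit_stump`, `bagging_fit` and `bagging_predict`, as well as the toy data, are hypothetical.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Fit a threshold t so that 'predict 1 when x > t' makes the fewest training errors."""
    best_t, best_err = None, None
    for t in sorted(set(xs)):
        err = sum((1 if x > t else 0) != y for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def bagging_fit(xs, ys, n_estimators=25, seed=0):
    """Train one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps

def bagging_predict(stumps, x):
    """Majority vote over the stumps; an odd n_estimators avoids ties."""
    votes = Counter(1 if x > t else 0 for t in stumps)
    return votes.most_common(1)[0][0]

# Toy one-feature data (hypothetical): low values are class 0, high values class 1.
xs = [1, 2, 3, 4, 10, 11, 12, 13]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
stumps = bagging_fit(xs, ys)
```

Because each stump sees a slightly different resample, their thresholds differ, and averaging the votes reduces the variance of any single learner. That is the stability argument the abstract makes for ensembles.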


2021 ◽  
Author(s):  
Santhosh Gupta Dogiparthi ◽  
Jayanthi K ◽  
Ajith Ananthakrishna Pillai

Abstract Objectives: The latest World Health Organization statistics anticipate that cardiovascular diseases, including coronary heart disease, heart attack and vascular disease, will be the biggest pandemic in the world, causing the death of one-third of the world population. With emerging AI trends, applying an optimal machine learning model for early detection and accurate prediction of heart disease is indispensable for bringing down mortality rates and treating cardiac patients with the best clinical decision support. This is the motivation for this paper. The paper presents a comprehensive survey of heart disease prediction models derived from and validated on popular heart disease datasets such as the Cleveland dataset and the Z-Alizadeh Sani dataset. Methods: The survey was performed using articles retrieved from the Google Scholar, Scopus, Web of Science, ResearchGate and PubMed search engines and published between 2005 and 2020. The main search keywords were Heart Disease, Prediction, Coronary disease, Healthcare, Heart datasets and Machine Learning. Results: This review explores the shortcomings of various approaches used for the prediction of heart disease. It outlines the pros and cons of different research methodologies along with the validation parameters of each reviewed publication. Conclusion: Machine intelligence can serve as a genuine alternative diagnostic method for prediction, which will in turn keep patients well aware of the state of their illness. Despite researchers' efforts, uncertainty still exists about the standardization of prediction models, which demands further exploration of optimal prediction models.


Author(s):  
Shiva Shanta Mani B. ◽  
Manikandan V. M.

Heart disease is one of the most common and serious health issues across all age groups. Food habits, mental stress and smoking are a few of the causes of heart disease. Diagnosing heart problems at an early stage is very important for proper treatment, because treating heart disease at a later stage is expensive and risky. In this chapter, the authors discuss machine learning approaches to predict heart disease from a set of health parameters collected from a person. The heart disease dataset from the UCI Machine Learning Repository is used for the study. The chapter discusses the heart disease prediction capability of four well-known machine learning approaches: naive Bayes classifier, KNN classifier, decision tree classifier and random forest classifier.


Heart disease affects people of every age. Despite technological advances, predicting the presence of heart disease remains a challenging problem. The difficulty persists because symptoms are often not available. According to the World Health Organization, 33% of the population died due to heart disease. The diagnosis of heart disease relies on a complex combination of clinical data. With this in view, we use the Heart Disease Prediction dataset from the UCI Machine Learning Repository to predict the level of heart disease. The prediction of heart disease classes is carried out in four steps. First, the dataset is preprocessed with feature scaling and missing-value handling. Second, the raw dataset is fitted to several classifiers: logistic regression, KNN, Support Vector Machine, Kernel Support Vector Machine, Naive Bayes, Random Forest and Decision Tree. Third, the raw dataset is subjected to dimensionality reduction using Principal Component Analysis (PCA) to project it onto its most important components, and the PCA-reduced dataset is fitted to the same classifiers. Fourth, the performance on the raw dataset and on the PCA-reduced dataset is compared using Precision, Recall, Accuracy and F-score. The implementation is done in Python in the Spyder environment with Anaconda Navigator. Experimental results show that Random Forest is the most effective, with an accuracy of 89% without PCA, 85% with five-component PCA and 86% with seven-component PCA.
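The four comparison metrics named in this abstract all derive from the counts of a binary confusion matrix. A minimal sketch, assuming binary labels with 1 as the positive (disease) class and taking F-score to mean the F1 score; the function name `classification_metrics` and the toy labels are hypothetical:

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy example (hypothetical labels, not results from the paper).
scores = classification_metrics([1, 1, 1, 0, 0, 0, 0, 1],
                                [1, 1, 0, 0, 0, 1, 0, 1])
```

In a screening setting, recall is usually the metric to watch, since a false negative means a missed case of disease, whereas accuracy alone can look good on an imbalanced dataset.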


Heart disease (HD) has recently become the main cause of the increasing death rate all over the world. Data classification is a crucial task in the medical field that assists physicians in predicting diseases, and machine learning (ML) algorithms have recently been employed to classify medical data. The complexity and quantity of the data need to be examined and managed to achieve efficient and accurate HD diagnosis. This paper presents a gradient boosting tree (GBT) based classifier, or gradient boosting classifier (GBC), model to predict HD efficiently. An extensive set of experiments was carried out using the Statlog and Cleveland heart disease datasets. The experimental values confirmed the superiority of the GBT classifier on several performance measures.


Author(s):  
Chitluri Sai Harish B ◽  
G gnana krishna vamsi ◽  
G jaya phani akhil ◽  
J n v hari sravan ◽  
V mounika chowdary

Heart disease is one of the most challenging problems faced by health care sectors all over the world, and it is very common nowadays. With the growing number of deaths caused by heart disease, there is a need for a system that predicts heart ailments precisely. The work in this paper focuses on finding the best machine learning algorithm for the identification of heart disease. Our study compares the precision of three well-known classification algorithms, Decision Tree, Naïve Bayes and Random Forest, for the prediction of heart disease using a dataset provided by Kaggle. We used various characteristics that relate well to heart disease to find the better algorithm for prediction. The results of this study indicate that Random Forest is the most efficient algorithm for the prediction of heart disease, with an accuracy score of 97.17%.


Author(s):  
Aadar Pandita

Heart disease has been the primary cause of death all over the world. The majority of deaths related to cardiovascular problems are caused by heart attacks and strokes. The World Health Organization (WHO) indicates that approximately 17.9 million people die of such diseases every year. It is therefore essential to find methods that reduce these numbers. To minimize the detrimental effects of heart disease, we must try to predict its presence at an early stage. Machine learning algorithms can help predict such outcomes with a high degree of accuracy, which can in turn help doctors and patients detect the onset of such diseases and reduce their impact or prevent them from occurring. Our objective is to create a system that can accurately determine the presence of heart disease in a time- and cost-efficient manner.


