Heart Disease Prediction using Machine Learning

This work derives methodologies to detect heart problems at an early stage and to notify patients so that they can improve their health. To address this problem, we use machine learning techniques to predict the incidence of heart disease early. For prediction we use parameters such as age, sex, height, weight, medical history, smoking, and alcohol consumption, together with tests such as blood pressure, cholesterol, diabetes, ECG, and ECHO. Many machine learning algorithms can be applied to this problem, including K-Nearest Neighbour, support vector classifier, decision tree classifier, logistic regression, and random forest classifier. Using these parameters and algorithms, we predict whether or not the patient has heart disease and recommend ways for the patient to improve his or her health.

In an era of rapid technological growth, people still suffer from various diseases. Heart disease is common across the population irrespective of age. Although technology is advancing quickly, predicting and identifying heart disease remains a challenging problem. Because patient symptom data are scarce, predicting heart disease is a difficult task. With this in view, we used the Heart Disease Prediction dataset extracted from the UCI Machine Learning Repository to analyze and compare various parameters of the classification algorithms. The parameter analysis of the classification algorithms for the heart disease classes is done in five ways. First, the dataset is analyzed using the correlation matrix, feature importance analysis, the target distribution of the dataset, and disease probability based on the density distribution of age and sex. Second, the dataset is fitted to a K-Nearest Neighbor classifier to analyze the performance for various numbers of neighbors, with and without PCA. Third, the dataset is fitted to a Support Vector classifier to analyze the performance for various kernels, with and without PCA. Fourth, the dataset is fitted to a Decision Tree classifier to analyze the performance for various numbers of features, with and without PCA. Fifth, the dataset is fitted to a Random Forest classifier to analyze the performance for various numbers of estimators, with and without PCA. The implementation is done in Python on the Spyder platform with Anaconda Navigator. Experimental results show that for the KNN classifier, 12 neighbours is most effective, with a score of 0.52 before applying PCA and 0.53 after applying PCA. For the Support Vector classifier, the rbf kernel is most effective, with a score of 0.519 both with and without PCA. For the Decision Tree classifier, the score is 0.47 for 7 features before applying PCA and 0.49 for 4 features after applying PCA. For the Random Forest classifier, the score is 0.53 for 500 estimators before applying PCA and 0.52 for 500 estimators after applying PCA.
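The neighbour-count comparison described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `knn_predict` and `accuracy_for_k` and the toy data are hypothetical, the PCA step is omitted, and plain Euclidean distance with majority voting is assumed.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Predict a label for `query` by majority vote among its k nearest training points."""
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    # most_common(1) yields the label with the highest vote count.
    return votes.most_common(1)[0][0]


def accuracy_for_k(train_X, train_y, test_X, test_y, k):
    """Fraction of test points classified correctly for a given neighbour count."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y))
    return hits / len(test_X)


# Hypothetical toy data (assumption): (age, resting blood pressure) -> 0 = no disease, 1 = disease.
train_X = [(35, 118), (41, 122), (45, 125), (58, 150), (62, 160), (66, 155)]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [(39, 120), (60, 152)]
test_y = [0, 1]

# Score each candidate neighbour count on the held-out points.
scores = {k: accuracy_for_k(train_X, train_y, test_X, test_y, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
```

On real data the features would normally be standardised first, since Euclidean distance is sensitive to the scale of each attribute.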


2019 ◽  
Vol 8 (4) ◽  
pp. 10316-10320

Nowadays, heart disease has become a major disease among people irrespective of age. Even children are dying of heart disease. If the disease can be predicted early, the chances of survival may be much higher. Everyone has different values of pulse rate and blood pressure. We are living in an era of data. With the rise of technology, the amount of data generated increases daily, and terabytes of data are being produced and stored. For example, hospitals produce a huge amount of patient data such as chest pain, heart rate, blood pressure, and pulse rate. If we can obtain this data and apply machine learning techniques, we can reduce the probability of people dying. In this paper we survey different classification and grouping strategies, for example KNN, Decision tree classifier, Gaussian Naïve Bayes, Support vector machine, Linear regression, Logistic regression, Random forest classifier, Random forest regression, and linear descriptive analysis. We take the 14 attributes present in the dataset, which is obtained from the UCI repository and contains a large amount of medical information, as input to develop an accurate model for predicting heart disease. In the proposed research, the performance of the diagnosis model is obtained using classification strategies. This paper proposes an accurate model to predict whether or not a person has coronary disease. This is implemented by comparing the accuracies of different machine learning strategies such as KNN, Decision tree classifier, Gaussian Naïve Bayes, SVM, Logistic regression, Random forest classifier, Linear regression, Random forest regression, and linear descriptive analysis.


Author(s):  
Cristián Castillo-Olea ◽  
Begonya Garcia-Zapirain Soto ◽  
Clemente Zuñiga

The article presents a study based on timeline data analysis of the level of sarcopenia in older patients in Baja California, Mexico. Information was examined at the beginning of the study (first event), three months later (second event), and six months later (third event). Sarcopenia is defined as the loss of muscle mass, quality, and strength. The study was conducted with 166 patients. A total of 65% were women and 35% were men. The mean age of the enrolled patients was 77.24 years. The research included 99 variables that consider medical history, pharmacology, psychological tests, comorbidity (Charlson), functional capacity (Barthel and Lawton), undernourishment (mini nutritional assessment (MNA) validated test), as well as biochemical and socio-demographic data. Our aim was to evaluate the prevalence of the level of sarcopenia in a population of chronically ill patients assessed at the Tijuana General Hospital. We used machine learning techniques to assess and identify the determining variables to focus on the patients’ evolution. The following classifiers were used: Support Vector Machines, Linear Support Vector Machines, Radial Basis Function, Gaussian process, Decision Tree, Random Forest, multilayer perceptron, AdaBoost, Gaussian Naive Bayes, and Quadratic Discriminant Analysis. In order of importance, we found that the following variables determine the level of sarcopenia: Age, Systolic arterial hypertension, mini nutritional assessment (MNA), Number of chronic diseases, and Sodium. They are therefore considered relevant in the decision-making process of choosing treatment or prevention. Analysis of the relationship between the presence of the variables and the classifiers used to measure sarcopenia revealed that the Decision Tree classifier, with the Age, Systolic arterial hypertension, MNA, Number of chronic diseases, and Sodium variables, showed a precision of 0.864, accuracy of 0.831, and an F1 score of 0.900 in the first and second events. Precision of 0.867, accuracy of 0.825, and an F1 score of 0.867 were obtained in event three with the same variables. We can therefore conclude that the Decision Tree classifier yields the best results for the assessment of the determining variables and suggests that the study population’s sarcopenia did not change from moderate to severe.
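For readers unfamiliar with the metrics reported above, the following sketch shows how precision, recall, accuracy, and F1 are conventionally computed from binary confusion-matrix counts. It assumes the standard definitions; the abstract does not state how the multi-level sarcopenia labels were reduced to these scores, and the counts in the example are hypothetical, not the study's data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)                  # share of predicted positives that are correct
    recall = tp / (tp + fn)                     # share of actual positives that are found
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # share of all predictions that are correct
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall
    return {"precision": precision, "recall": recall, "accuracy": accuracy, "f1": f1}


def recall_from_precision_f1(precision, f1):
    """Solve F1 = 2PR / (P + R) for R, given P and F1."""
    return f1 * precision / (2 * precision - f1)


# Hypothetical counts (assumption), chosen only to exercise the formulas.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)

# If the reported first-event figures follow the standard definitions,
# precision 0.864 and F1 0.900 would imply a recall of roughly 0.939.
implied_recall = recall_from_precision_f1(0.864, 0.900)
```

Because F1 is the harmonic mean of precision and recall, an F1 score above the precision implies that recall is higher than precision.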


Recent advances in technology allow many tasks to be automated using machine learning techniques. These techniques can also be used to detect or predict heart disease at an early phase. The health care industry produces a huge amount of data in an unstructured form that cannot be understood by a machine. With the development of modern technology, health care industries are also managing data in a structured form that machine learning methods can use. In this environment, applying machine learning algorithms to heart disease prediction offers a chance to detect heart disease at an early phase and to alert the patient to seek better treatment. This paper implements seven supervised learning algorithms for heart disease prediction: KNN, Decision Tree, Naive Bayes, Logistic Regression, Random Forest, Support Vector Machine, and Neural Networks. It reports performance metrics such as Accuracy, Precision, Recall, F-score, and ROC values to show how accurately the system predicts. Among the seven algorithms, Neural Networks gave the best accuracy, 92.30%, and the paper provides experimental results showing how accurate the model is for heart disease prediction.
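The ROC value mentioned above is usually summarised as the area under the ROC curve (AUC). A minimal sketch follows, assuming the common pairwise definition: AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the example labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # A positive outranking a negative counts 1, a tie counts 0.5, otherwise 0.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical model outputs (assumption): 1 = heart disease, 0 = no heart disease.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)  # 8 of the 9 positive/negative pairs are ordered correctly
```

This pairwise form is quadratic in the number of samples, which is fine for illustration; library implementations typically sort the scores instead.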


Author(s):  
Anantvir Singh Romana

Accurate diagnostic detection of disease in a patient is critical and may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently used in various classification problems because of their accurate prediction performance. Different techniques may provide different accuracies, so it is imperative to use the most suitable method, the one that provides the best results. This research seeks to provide a comparative analysis of Support Vector Machine, Naïve Bayes, J48 Decision Tree, and neural network classifiers on breast cancer and diabetes datasets.


Author(s):  
Ramesh Ponnala ◽  
K. Sai Sowjanya

Prediction of cardiovascular disease is an important task in the area of clinical data analysis. Machine learning has been shown to be effective in supporting decisions and making predictions from the large amount of data produced by the healthcare industry. In this paper, we propose a novel method that aims to find significant features by applying ML techniques, thereby improving the accuracy of heart disease prediction. The severity of heart disease is classified using various methods such as KNN, decision trees, and so on. The prediction model is introduced with different combinations of features and several known classification techniques. We report an enhanced performance level, with an accuracy of 100%, for the heart disease prediction model using the hybrid random forest with a linear model (HRFLM).


2021 ◽  
Author(s):  
Praveeen Anandhanathan ◽  
Priyanka Gopalan

Coronavirus disease (COVID-19) is spreading across the world. Since it first appeared in Wuhan, China, in December 2019, it has become a serious issue across the globe. There are no accurate resources to predict and detect the disease. Knowledge of past patients’ records could therefore guide clinicians in fighting the pandemic, and machine learning techniques can be applied to predict health status from symptoms. We analyse only the symptoms that occur in every patient. These predictions can help clinicians treat patients more easily. Techniques such as SVM (Support Vector Machine), Fuzzy k-Means Clustering, the Decision Tree algorithm, the Random Forest method, ANN (Artificial Neural Network), KNN (k-Nearest Neighbour), Naïve Bayes, and the Linear Regression model are already used to predict many diseases. As we have not faced this disease before, we cannot say which technique will give the maximum accuracy. We therefore aim to provide an efficient result by comparing all of these algorithms in RStudio.


Online discussion forums and blogs are vibrant platforms where cancer patients express their views in the form of stories. These stories sometimes become a source of inspiration for patients who are anxiously searching for similar cases. This paper proposes a method that uses natural language processing and machine learning to analyze unstructured texts collected from patients’ reviews and stories. The proposed methodology aims to identify behavior, emotions, side-effects, decisions, and demographics associated with cancer patients. The pre-processing phase of our work involves extraction of web text, followed by text cleaning in which some special characters and symbols are omitted, and finally tagging the texts using NLTK’s (Natural Language Toolkit) POS (Parts of Speech) tagger. The post-processing phase trains seven machine learning classifiers (refer to Table 6). The Decision Tree classifier shows the highest precision (0.83) among the classifiers, while the Area Under the Receiver Operating Characteristic curve (AUC) is highest for the Support Vector Machine (SVM) classifier (0.98).
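The text-cleaning step of the pre-processing phase can be sketched as follows. The abstract does not say which characters the authors removed, so the rules here (strip leftover HTML tags, drop symbols other than basic punctuation, collapse whitespace) and the function names `clean_text` and `tokenize` are assumptions made for illustration.

```python
import re


def clean_text(raw):
    """Remove web-extraction residue and special symbols from a patient story."""
    # Drop anything that looks like an HTML tag left over from web extraction.
    no_tags = re.sub(r"<[^>]+>", " ", raw)
    # Keep letters, digits, whitespace, and basic punctuation; replace everything else with a space.
    kept = re.sub(r"[^A-Za-z0-9\s.,!?'-]", " ", no_tags)
    # Collapse runs of whitespace into single spaces.
    return re.sub(r"\s+", " ", kept).strip()


def tokenize(text):
    """Split cleaned text into word and punctuation tokens, ready for a POS tagger."""
    return re.findall(r"[A-Za-z0-9']+|[.,!?]", text)


# Hypothetical forum snippet (assumption), not taken from the paper's corpus.
cleaned = clean_text("<p>After chemo*** I felt #tired, but hopeful!</p>")
tokens = tokenize(cleaned)
```

The resulting token list is the kind of input that NLTK's `nltk.pos_tag` expects; that call is left out here because it requires downloading the tagger model separately.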


2021 ◽  
pp. 1-11
Author(s):  
Jesús Miguel García-Gorrostieta ◽  
Aurelio López-López ◽  
Samuel González-López ◽  
Adrián Pastor López-Monroy

Academic thesis writing is a complex task that requires the author to be skilled in argumentation. The goal of the academic author is to communicate clear ideas and to convince the reader of the presented claims. However, few students are good arguers, and this is a skill that takes time to master. In this paper, we present an exploration of lexical features used to model automatic detection of argumentative paragraphs using machine learning techniques. We present a novel proposal, which combines the information in the complete paragraph with the detection of argumentative segments in order to achieve improved results for the detection of argumentative paragraphs. We propose two approaches: a more descriptive one, which uses a decision tree classifier with indicators and lexical features, and a more efficient one, which uses an SVM classifier with lexical features and a Document Occurrence Representation (DOR). Both approaches consider the detection of argumentative segments to ensure that a paragraph detected as argumentative does indeed contain segments with argumentation. We achieved encouraging results for both approaches.

