scholarly journals Cardiac Disease Prediction using Supervised Machine Learning Techniques.

2022 ◽  
Vol 2161 (1) ◽  
pp. 012013
Author(s):  
Chiradeep Gupta ◽  
Athina Saha ◽  
N V Subba Reddy ◽  
U Dinesh Acharya

Abstract Diagnosis of cardiac disease requires being more accurate, precise, and reliable. The number of death cases due to cardiac attacks is increasing exponentially day by day. Thus, practical approaches for earlier diagnosis of cardiac or heart disease are done to achieve prompt management of the disease. Various supervised machine learning techniques like K-Nearest Neighbour, Decision Tree, Logistic Regression, Naïve Bayes, and Support Vector Machine (SVM) model are used for predicting cardiac disease using a dataset that was collected from the repository of the University of California, Irvine (UCI). The results depict that Logistic Regression was better than all other supervised classifiers in terms of the performance metrics. The model is also less risky since the number of false negatives is low as compared to other models as per the confusion matrix of all the models. In addition, ensemble techniques can be approached for the accuracy improvement of the classifier. Jupyter notebook is the best tool, for the implementation of Python Programming having many types of libraries, header files, for accurate and precise work.

2019 ◽  
Vol 23 (1) ◽  
pp. 12-21 ◽  
Author(s):  
Shikha N. Khera ◽  
Divya

Information technology (IT) industry in India has been facing a systemic issue of high attrition in the past few years, resulting in monetary and knowledge-based loses to the companies. The aim of this research is to develop a model to predict employee attrition and provide the organizations opportunities to address any issue and improve retention. Predictive model was developed based on supervised machine learning algorithm, support vector machine (SVM). Archival employee data (consisting of 22 input features) were collected from Human Resource databases of three IT companies in India, including their employment status (response variable) at the time of collection. Accuracy results from the confusion matrix for the SVM model showed that the model has an accuracy of 85 per cent. Also, results show that the model performs better in predicting who will leave the firm as compared to predicting who will not leave the company.


Author(s):  
V Umarani ◽  
A Julian ◽  
J Deepa

Sentiment analysis has gained a lot of attention from researchers in the last year because it has been widely applied to a variety of application domains such as business, government, education, sports, tourism, biomedicine, and telecommunication services. Sentiment analysis is an automated computational method for studying or evaluating sentiments, feelings, and emotions expressed as comments, feedbacks, or critiques. The sentiment analysis process can be automated using machine learning techniques, which analyses text patterns faster. The supervised machine learning technique is the most used mechanism for sentiment analysis. The proposed work discusses the flow of sentiment analysis process and investigates the common supervised machine learning techniques such as multinomial naive bayes, Bernoulli naive bayes, logistic regression, support vector machine, random forest, K-nearest neighbor, decision tree, and deep learning techniques such as Long Short-Term Memory and Convolution Neural Network. The work examines such learning methods using standard data set and the experimental results of sentiment analysis demonstrate the performance of various classifiers taken in terms of the precision, recall, F1-score, RoC-Curve, accuracy, running time and k fold cross validation and helps in appreciating the novelty of the several deep learning techniques and also giving the user an overview of choosing the right technique for their application.


2021 ◽  
Vol 297 ◽  
pp. 01073
Author(s):  
Sabyasachi Pramanik ◽  
K. Martin Sagayam ◽  
Om Prakash Jena

Cancer has been described as a diverse illness with several distinct subtypes that may occur simultaneously. As a result, early detection and forecast of cancer types have graced essentially in cancer fact-finding methods since they may help to improve the clinical treatment of cancer survivors. The significance of categorizing cancer suffers into higher or lower-threat categories has prompted numerous fact-finding associates from the bioscience and genomics field to investigate the utilization of machine learning (ML) algorithms in cancer diagnosis and treatment. Because of this, these methods have been used with the goal of simulating the development and treatment of malignant diseases in humans. Furthermore, the capacity of machine learning techniques to identify important characteristics from complicated datasets demonstrates the significance of these technologies. These technologies include Bayesian networks and artificial neural networks, along with a number of other approaches. Decision Trees and Support Vector Machines which have already been extensively used in cancer research for the creation of predictive models, also lead to accurate decision making. The application of machine learning techniques may undoubtedly enhance our knowledge of cancer development; nevertheless, a sufficient degree of validation is required before these approaches can be considered for use in daily clinical practice. An overview of current machine learning approaches utilized in the simulation of cancer development is presented in this paper. All of the supervised machine learning approaches described here, along with a variety of input characteristics and data samples, are used to build the prediction models. In light of the increasing trend towards the use of machine learning methods in biomedical research, we offer the most current papers that have used these approaches to predict risk of cancer or patient outcomes in order to better understand cancer.


Sensors ◽  
2019 ◽  
Vol 19 (10) ◽  
pp. 2266 ◽  
Author(s):  
Nikolaos Sideris ◽  
Georgios Bardis ◽  
Athanasios Voulodimos ◽  
Georgios Miaoulis ◽  
Djamchid Ghazanfarpour

The constantly increasing amount and availability of urban data derived from varying sources leads to an assortment of challenges that include, among others, the consolidation, visualization, and maximal exploitation prospects of the aforementioned data. A preeminent problem affecting urban planning is the appropriate choice of location to host a particular activity (either commercial or common welfare service) or the correct use of an existing building or empty space. In this paper, we propose an approach to address these challenges availed with machine learning techniques. The proposed system combines, fuses, and merges various types of data from different sources, encodes them using a novel semantic model that can capture and utilize both low-level geometric information and higher level semantic information and subsequently feeds them to the random forests classifier, as well as other supervised machine learning models for comparisons. Our experimental evaluation on multiple real-world data sets comparing the performance of several classifiers (including Feedforward Neural Networks, Support Vector Machines, Bag of Decision Trees, k-Nearest Neighbors and Naïve Bayes), indicated the superiority of Random Forests in terms of the examined performance metrics (Accuracy, Specificity, Precision, Recall, F-measure and G-mean).


2018 ◽  
Author(s):  
Sandip S Panesar ◽  
Rhett N D’Souza ◽  
Fang-Cheng Yeh ◽  
Juan C Fernandez-Miranda

AbstractBackgroundMachine learning (ML) is the application of specialized algorithms to datasets for trend delineation, categorization or prediction. ML techniques have been traditionally applied to large, highly-dimensional databases. Gliomas are a heterogeneous group of primary brain tumors, traditionally graded using histopathological features. Recently the World Health Organization proposed a novel grading system for gliomas incorporating molecular characteristics. We aimed to study whether ML could achieve accurate prognostication of 2-year mortality in a small, highly-dimensional database of glioma patients.MethodsWe applied three machine learning techniques: artificial neural networks (ANN), decision trees (DT), support vector machine (SVM), and classical logistic regression (LR) to a dataset consisting of 76 glioma patients of all grades. We compared the effect of applying the algorithms to the raw database, versus a database where only statistically significant features were included into the algorithmic inputs (feature selection).ResultsRaw input consisted of 21 variables, and achieved performance of (accuracy/AUC): 70.7%/0.70 for ANN, 68%/0.72 for SVM, 66.7%/0.64 for LR and 65%/0.70 for DT. Feature selected input consisted of 14 variables and achieved performance of 73.4%/0.75 for ANN, 73.3%/0.74 for SVM, 69.3%/0.73 for LR and 65.2%/0.63 for DT.ConclusionsWe demonstrate that these techniques can also be applied to small, yet highly-dimensional datasets. Our ML techniques achieved reasonable performance compared to similar studies in the literature. Though local databases may be small versus larger cancer repositories, we demonstrate that ML techniques can still be applied to their analysis, though traditional statistical methods are of similar benefit.


The advancement in cyber-attack technologies have ushered in various new attacks which are difficult to detect using traditional intrusion detection systems (IDS).Existing IDS are trained to detect known patterns because of which newer attacks bypass the current IDS and go undetected. In this paper, a two level framework is proposed which can be used to detect unknown new attacks using machine learning techniques. In the first level the known types of classes for attacks are determined using supervised machine learning algorithms such as Support Vector Machine (SVM) and Neural networks (NN). The second level uses unsupervised machine learning algorithms such as K-means. The experimentation is carried out with four models with NSL- KDD dataset in Openstack cloud environment. The Model with Support Vector Machine for supervised machine learning, Gradual Feature Reduction (GFR) for feature selection and K-means for unsupervised algorithm provided the optimum efficiency of 94.56 %.


2020 ◽  
Author(s):  
Abdulhameed Ado Osi ◽  
Hussaini Garba Dikko ◽  
Mannir Abdu ◽  
Auwalu Ibrahim ◽  
Lawan Adamu Isma'il ◽  
...  

COVID-19 is an infectious disease discovered after the outbreak began in Wuhan, China, in December 2019. COVID-19 is still becoming an increasing global threat to public health. The virus has been escalated to many countries across the globe. This paper analyzed and compared the performance of three different supervised machine learning techniques; Linear Discriminant Analysis (LDA), Random Forest (RF), and Support Vector Machine (SVM) on COVID-19 dataset. The best level of accuracy between these three algorithms was determined by comparison of some metrics for assessing predictive performance such as accuracy, sensitivity, specificity, F-score, Kappa index, and ROC. From the analysis results, RF was found to be the best algorithm with 100% prediction accuracy in comparison with LDA and SVM with 95.2% and 90.9% respectively. Our analysis shows that out of these three classification models RF predicts COVID-19 patient's survival outcome with the highest accuracy. Chi-square test reveals that all the seven features except sex were significantly correlated with the COVID-19 patient's outcome (P-value < 0.005). Therefore, RF was recommended for COVID-19 patient outcome prediction that will help in early identification of possible sensitive cases for quick provision of quality health care, support and supervision.


Symmetry ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 403
Author(s):  
Muhammad Waleed ◽  
Tai-Won Um ◽  
Tariq Kamal ◽  
Syed Muhammad Usman

In this paper, we apply the multi-class supervised machine learning techniques for classifying the agriculture farm machinery. The classification of farm machinery is important when performing the automatic authentication of field activity in a remote setup. In the absence of a sound machine recognition system, there is every possibility of a fraudulent activity taking place. To address this need, we classify the machinery using five machine learning techniques—K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF) and Gradient Boosting (GB). For training of the model, we use the vibration and tilt of machinery. The vibration and tilt of machinery are recorded using the accelerometer and gyroscope sensors, respectively. The machinery included the leveler, rotavator and cultivator. The preliminary analysis on the collected data revealed that the farm machinery (when in operation) showed big variations in vibration and tilt, but observed similar means. Additionally, the accuracies of vibration-based and tilt-based classifications of farm machinery show good accuracy when used alone (with vibration showing slightly better numbers than the tilt). However, the accuracies improve further when both (the tilt and vibration) are used together. Furthermore, all five machine learning algorithms used for classification have an accuracy of more than 82%, but random forest was the best performing. The gradient boosting and random forest show slight over-fitting (about 9%), but both algorithms produce high testing accuracy. In terms of execution time, the decision tree takes the least time to train, while the gradient boosting takes the most time.


2021 ◽  
Author(s):  
Alan Lopes de Sousa Freitas ◽  
Ana Silvia Degasperi Ieker ◽  
Josiane Melchiori Pinheiro ◽  
Wilson Rinaldi ◽  
Heloise Manica Paris Teixeira

Cardiometabolic diseases, developed throughout the worker’s life,such as hypertension, diabetes, dyslipidemia and obesity are amongthe main causes of death and are associated with modifiable andcontrollable risk factors. The general objective of this study wasto apply supervised Machine Learning techniques and to comparetheir performance to predict the risk of developing cardiometabolicdisease from servers working at the School Hospital of south inBrazil. We sought to map the characteristics of individuals who aremore likely to develop cardiometabolic diseases. The machine learningmodels evaluated were Naive Bayes, Decision Tree, RandomForest, KNN, Logistic Regression and SVM. The results obtained inthe experiments showed that some supervised machine learningmodels produce a good classification, depending on the attributesand hyperparameters used.


Sign in / Sign up

Export Citation Format

Share Document