Machine Learning Applied to the Prediction of Cardiometabolic Diseases Using Metabolic and Behavioral Health Risk Indicators

Author(s):  
Alan Lopes de Sousa Freitas ◽  
Ana Silvia Degasperi Ieker ◽  
Josiane Melchiori Pinheiro ◽  
Wilson Rinaldi ◽  
Heloise Manica Paris Teixeira

Cardiometabolic diseases that develop over a worker’s life, such as hypertension, diabetes, dyslipidemia and obesity, are among the main causes of death and are associated with modifiable and controllable risk factors. The general objective of this study was to apply supervised machine learning techniques and compare their performance in predicting the risk of developing cardiometabolic disease among public employees working at a teaching hospital in southern Brazil. We sought to map the characteristics of individuals who are more likely to develop cardiometabolic diseases. The machine learning models evaluated were Naive Bayes, Decision Tree, Random Forest, KNN, Logistic Regression and SVM. The experimental results showed that some supervised machine learning models produce a good classification, depending on the attributes and hyperparameters used.
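The model comparison described in this abstract can be sketched as a k-fold cross-validation loop that scores every candidate on the same splits. This is a minimal standard-library illustration, not the authors' pipeline: the one-dimensional toy data, the two stand-in models (a 1-nearest-neighbour rule and a majority-class baseline) and all function names are assumptions made for the example.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; sample i is assigned to fold i % k."""
    for fold in range(k):
        test_idx = [i for i in range(n_samples) if i % k == fold]
        train_idx = [i for i in range(n_samples) if i % k != fold]
        yield train_idx, test_idx


def predict_1nn(train_x, train_y, x):
    """Return the label of the closest training point (1-nearest neighbour)."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def predict_majority(train_x, train_y, x):
    """Baseline: always return the most frequent training label."""
    return Counter(train_y).most_common(1)[0][0]


def cross_val_accuracy(predict, xs, ys, k=5):
    """Mean accuracy of `predict` over k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        hits = sum(predict(train_x, train_y, xs[i]) == ys[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


# Hypothetical toy data: one risk indicator per person, label 1 = "at risk".
xs = [1.0, 1.2, 1.9, 2.3, 2.8, 6.1, 6.4, 7.0, 7.7, 8.2]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

results = {
    "1-NN": cross_val_accuracy(predict_1nn, xs, ys),
    "majority baseline": cross_val_accuracy(predict_majority, xs, ys),
}
for name, acc in results.items():
    print(f"{name}: mean accuracy = {acc:.2f}")
```

Scoring every model on identical folds keeps the comparison fair; a real study would substitute the six classifiers named in the abstract and tune their hyperparameters inside each training fold.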

Author(s):  
M. Carr ◽  
V. Ravi ◽  
G. Sridharan Reddy ◽  
D. Veranna

This paper profiles mobile banking users using machine learning techniques, viz. Decision Tree, Logistic Regression, Multilayer Perceptron, and SVM, to test a research model with fourteen independent variables and one dependent variable (adoption). A survey was conducted and the results were analysed using these techniques. The profile of the mobile banking adopter was identified using Decision Trees. A comparison of the techniques found that Decision Trees outperformed Logistic Regression, Multilayer Perceptron, and SVM. Of all the techniques, Decision Tree is recommended for profiling studies because, apart from producing highly accurate results, it also yields ‘if–then’ classification rules. The classification rules provided here can be used to target potential customers and encourage them to adopt mobile banking by offering appropriate incentives.
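To illustrate how a decision tree yields ‘if–then’ rules, the sketch below walks a small hand-built tree and prints one rule per leaf. The tree, its feature names (`uses_internet_banking`, `age`) and its thresholds are hypothetical and are not taken from the paper's survey. For a tree fitted with scikit-learn, `sklearn.tree.export_text` produces a comparable textual dump.

```python
# Hypothetical hand-built tree; feature names and thresholds are illustrative
# only and do not come from the paper.
tree = {
    "feature": "uses_internet_banking",
    "threshold": 0.5,
    "left": {"label": "non-adopter"},
    "right": {
        "feature": "age",
        "threshold": 40,
        "left": {"label": "adopter"},
        "right": {"label": "non-adopter"},
    },
}


def extract_rules(node, conditions=()):
    """Walk the tree and return one 'IF ... THEN ...' string per leaf."""
    if "label" in node:
        premise = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {premise} THEN {node['label']}"]
    at_most = f"{node['feature']} <= {node['threshold']}"
    above = f"{node['feature']} > {node['threshold']}"
    return (
        extract_rules(node["left"], conditions + (at_most,))
        + extract_rules(node["right"], conditions + (above,))
    )


rules = extract_rules(tree)
for rule in rules:
    print(rule)
```

Each root-to-leaf path becomes one rule, which is why tree models are convenient for profiling: the conditions can be read directly as a description of a customer segment.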


Symmetry ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 403
Author(s):  
Muhammad Waleed ◽  
Tai-Won Um ◽  
Tariq Kamal ◽  
Syed Muhammad Usman

In this paper, we apply multi-class supervised machine learning techniques to classify agricultural farm machinery. Classifying farm machinery is important for automatically authenticating field activity in a remote setup. In the absence of a sound machine recognition system, fraudulent activity is a real possibility. To address this need, we classify the machinery using five machine learning techniques: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF) and Gradient Boosting (GB). To train the models, we use the vibration and tilt of the machinery, recorded using accelerometer and gyroscope sensors, respectively. The machinery included the leveler, rotavator and cultivator. Preliminary analysis of the collected data revealed that the farm machinery (when in operation) showed large variations in vibration and tilt but similar means. Additionally, vibration-based and tilt-based classifications each show good accuracy when used alone (with vibration performing slightly better than tilt). However, accuracy improves further when tilt and vibration are used together. Furthermore, all five machine learning algorithms achieve an accuracy of more than 82%, with random forest performing best. Gradient boosting and random forest show slight over-fitting (about 9%), but both algorithms produce high testing accuracy. In terms of execution time, the decision tree takes the least time to train, while gradient boosting takes the most.
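The observation that signals can differ widely in variation while sharing a similar mean can be shown with two summary features per sensor window. The readings below are hypothetical values in arbitrary units, and the choice of mean and population standard deviation as features is an assumption for illustration, not the paper's documented feature set.

```python
from statistics import mean, pstdev

# Hypothetical vibration magnitudes (arbitrary units), one window per machine.
windows = {
    "leveler": [10, 10, 11, 9, 10],
    "rotavator": [2, 18, 10, 5, 15],
}

# Summarise each window by its mean and its spread (population std. deviation).
features = {
    name: {"mean": mean(values), "spread": pstdev(values)}
    for name, values in windows.items()
}
for name, f in features.items():
    print(f"{name}: mean={f['mean']:.2f}, spread={f['spread']:.2f}")
```

Both windows have the same mean, so a classifier given only the mean could not separate them; the spread feature carries the distinguishing information.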


2021 ◽  
Vol 36 (1) ◽  
pp. 609-615
Author(s):  
Mandhapati Rajesh ◽  
Dr. K. Malathi

Aim: To predict heart disease from the medical parameters of cardiac patients with good accuracy, using machine learning methods such as an innovative Decision Tree (DT) algorithm. Materials and Methods: Supervised machine learning techniques, the innovative Decision Tree (N = 20) and K Nearest Neighbour (KNN) (N = 20), are run on five different datasets each time to record five samples. Results: The Decision Tree predicts heart disease from various medical conditions; the accuracy achieved is 98% for DT and 72.2% for KNN. The two algorithms, Decision Tree and KNN, are statistically insignificant (=.737) with the independent sample T-Test value (p<0.005) with a confidence level of 95%. Conclusion: Prediction and classification of heart disease appear to be significantly better with DT than with KNN.
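For readers unfamiliar with the independent-samples t-test mentioned in this abstract, the sketch below computes the pooled-variance (Student) t statistic for two groups of accuracy scores. The scores are hypothetical and do not reproduce the paper's figures, and the equal-variance form is an assumption. Turning the statistic into a p-value requires the t distribution, for example through SciPy's `scipy.stats.ttest_ind`.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Student's t statistic and degrees of freedom for two independent
    samples, assuming equal variances in both groups."""
    n_a, n_b = len(a), len(b)
    dof = n_a + n_b - 2
    # Pooled sample variance, weighted by each group's degrees of freedom.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / dof
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / std_err, dof


# Hypothetical accuracies (%) from five runs of each of two classifiers.
model_a = [90, 92, 91, 93, 89]
model_b = [80, 82, 81, 83, 79]

t_stat, dof = pooled_t_statistic(model_a, model_b)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof}")
```

A large absolute t value relative to the degrees of freedom indicates that the difference in mean accuracy is unlikely to be due to run-to-run variation alone.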


2022 ◽  
Vol 2161 (1) ◽  
pp. 012013
Author(s):  
Chiradeep Gupta ◽  
Athina Saha ◽  
N V Subba Reddy ◽  
U Dinesh Acharya

Diagnosis of cardiac disease needs to be accurate, precise, and reliable. The number of deaths due to cardiac attacks is increasing rapidly. Thus, practical approaches for earlier diagnosis of cardiac or heart disease are pursued to achieve prompt management of the disease. Various supervised machine learning techniques, namely K-Nearest Neighbour, Decision Tree, Logistic Regression, Naïve Bayes, and Support Vector Machine (SVM), are used to predict cardiac disease using a dataset collected from the repository of the University of California, Irvine (UCI). The results show that Logistic Regression was better than all the other supervised classifiers in terms of the performance metrics. The model is also less risky, since its confusion matrix shows fewer false negatives than those of the other models. In addition, ensemble techniques can be explored to improve the accuracy of the classifier. Jupyter Notebook is considered the best tool for the Python implementation, given the many libraries available for accurate and precise work.
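The point about false negatives can be illustrated by counting the four confusion-matrix cells for two classifiers. The labels and predictions below are hypothetical, and the helper name `confusion_counts` is an assumption for this sketch rather than anything from the paper.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1  # a diseased patient predicted as healthy
        else:
            tn += 1
    return tp, fp, fn, tn


# Hypothetical labels: 1 = disease present, 0 = disease absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = {
    "model A": [1, 1, 1, 0, 0, 0, 0, 0, 1, 1],
    "model B": [1, 1, 0, 0, 0, 0, 0, 0, 0, 1],
}

false_negatives = {}
for name, y_pred in predictions.items():
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    false_negatives[name] = fn
    print(f"{name}: TP={tp} FP={fp} FN={fn} TN={tn}")
```

In this toy case both models classify 7 of 10 cases correctly, yet model A misses fewer diseased patients, which is the sense in which a model with fewer false negatives is "less risky" for diagnosis.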


2021 ◽  
pp. 1-11
Author(s):  
Daniel A. Harris ◽  
Kyla L. Pyndiura ◽  
Shelby L. Sturrock ◽  
Rebecca A.G. Christensen

Money laundering is a pervasive legal and economic problem that hides criminal activity. Identifying money laundering is a priority for both banks and governments, and machine learning algorithms have therefore emerged as a possible strategy to detect suspicious financial activity within financial institutions. We used traditional regression and supervised machine learning techniques to identify bank customers at an increased risk of committing money laundering. Specifically, we assessed whether model performance differed across varying operationalizations of the outcome (e.g., multinomial vs. binary classification) and determined whether the inclusion of investigator-derived novel features (e.g., averages across existing features) could improve model performance. We received two proprietary datasets from Scotiabank, a large bank headquartered in Canada. The datasets included customer account information (N = 4,469) and customers’ monthly transaction histories (N = 2,827) from April 15, 2019 to April 15, 2020. We implemented traditional logistic regression, logistic regression with LASSO regularization (LASSO), K-nearest neighbours (KNN), and extreme gradient boosted models (XGBoost). Results indicated that traditional logistic regression with a binary outcome, conducted with investigator-derived novel features, performed the best with an F1 score of 0.79 and accuracy of 0.72. Models with a binary outcome had higher accuracy than the multinomial models, but the F1 scores yielded mixed results. For KNN and XGBoost, we observed little change or worsening performance after the introduction of the investigator-derived novel features. However, the investigator-derived novel features improved model performance for LASSO and traditional logistic regression. Our findings demonstrate that investigators should consider different operationalizations of the outcome, where possible, and include novel features derived from existing features to potentially improve the detection of customers at risk of committing money laundering.
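Accuracy and F1 can rank models differently, which is one way the "mixed results" reported in this abstract can arise. The sketch below computes both metrics from confusion counts. The counts are hypothetical and unrelated to the Scotiabank data, and the function names are assumptions for the example.

```python
def accuracy(tp, fp, fn, tn):
    """Share of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical confusion counts for two classifiers on 100 customers.
candidates = {
    "classifier A": {"tp": 10, "fp": 5, "fn": 15, "tn": 70},
    "classifier B": {"tp": 20, "fp": 20, "fn": 5, "tn": 55},
}

scores = {}
for name, c in candidates.items():
    scores[name] = {
        "accuracy": accuracy(c["tp"], c["fp"], c["fn"], c["tn"]),
        "f1": f1_score(c["tp"], c["fp"], c["fn"]),
    }
    print(f"{name}: accuracy={scores[name]['accuracy']:.2f}, "
          f"F1={scores[name]['f1']:.2f}")
```

Classifier A has the higher accuracy because it handles the large negative class well, while classifier B has the higher F1 because it finds more of the rare positive cases. With an imbalanced outcome, the two metrics answer different questions.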


Air pollution refers to the release of pollutants into the air that are harmful to human health and to the planet as a whole. It can be described as one of the most dangerous threats humanity has ever faced. It damages animals, crops, forests, and more. To address this problem in the transport sector, air quality has to be predicted from pollutant measurements using machine learning techniques. Consequently, air quality assessment and prediction has become a significant research area. The aim is to investigate machine learning based techniques for air quality prediction. The air quality dataset is preprocessed through univariate, bivariate and multivariate analysis, missing value treatment, data validation, and data cleaning/preparation. Then, air quality is predicted using supervised machine learning techniques such as Logistic Regression, Random Forest, K-Nearest Neighbors, Decision Tree and Support Vector Machines. The performance of the machine learning algorithms is compared with respect to Precision, Recall and F1 Score. The Decision Tree algorithm is found to work well for predicting air quality. This application can help the meteorological department in predicting air quality. In future, this work can be optimized by applying Artificial Intelligence techniques.
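One common form of missing value treatment is to replace each gap with the mean of its column. The abstract does not say which treatment was used, so mean imputation is an assumption here, as are the column names (`pm25`, `no2`) and the readings.

```python
from statistics import mean

# Hypothetical pollutant readings; None marks a missing measurement.
rows = [
    {"pm25": 35.0, "no2": 40.0},
    {"pm25": None, "no2": 44.0},
    {"pm25": 45.0, "no2": None},
    {"pm25": 40.0, "no2": 48.0},
]


def impute_with_mean(rows):
    """Return new rows where each missing value is replaced by its column mean."""
    columns = list(rows[0].keys())
    col_means = {
        col: mean(r[col] for r in rows if r[col] is not None) for col in columns
    }
    return [
        {col: (r[col] if r[col] is not None else col_means[col]) for col in columns}
        for r in rows
    ]


clean_rows = impute_with_mean(rows)
for row in clean_rows:
    print(row)
```

The column means are computed from observed values only, and the original rows are left unmodified so the raw data stays available for validation.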


2020 ◽  
Vol 28 (2) ◽  
pp. 253-265 ◽  
Author(s):  
Gabriela Bitencourt-Ferreira ◽  
Amauri Duarte da Silva ◽  
Walter Filgueira de Azevedo

Background: The elucidation of the structure of cyclin-dependent kinase 2 (CDK2) made it possible to develop targeted scoring functions for virtual screening aimed at identifying new inhibitors for this enzyme. CDK2 is a protein target for the development of drugs intended to modulate cell-cycle progression and control. Such drugs have potential anticancer activities. Objective: Our goal here is to review recent applications of machine learning methods to predict ligand-binding affinity for protein targets. To assess the predictive performance of classical scoring functions and targeted scoring functions, we focused our analysis on CDK2 structures. Methods: We have experimental structural data for hundreds of binary complexes of CDK2 with different ligands, many of them with inhibition constant information. We investigate here computational methods to calculate the binding affinity of CDK2 through classical scoring functions and machine-learning models. Results: Analysis of the predictive performance of classical scoring functions available in docking programs such as Molegro Virtual Docker, AutoDock4, and AutoDock Vina indicated that these methods failed to predict binding affinity with significant correlation with experimental data. Targeted scoring functions developed through supervised machine learning techniques showed a significant correlation with experimental data. Conclusion: Here, we described the application of supervised machine learning techniques to generate a scoring function to predict binding affinity. Machine learning models showed superior predictive performance when compared with classical scoring functions. Analysis of the computational models obtained through machine learning could capture essential structural features responsible for binding affinity against CDK2.
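Predictive performance of a scoring function is typically judged by how well its scores correlate with experimental affinities. The sketch below computes the Pearson correlation coefficient for two sets of scores. All values are hypothetical and are not CDK2 data, and Pearson's r is an assumed choice of metric, since the abstract does not name the correlation measure used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical values: experimental affinities and two sets of predicted scores.
experimental = [5.0, 6.0, 7.0, 8.0, 9.0]
classical_scores = [7.0, 5.0, 8.0, 6.0, 7.5]
targeted_scores = [5.2, 6.1, 6.8, 8.3, 8.9]

r_classical = pearson_r(experimental, classical_scores)
r_targeted = pearson_r(experimental, targeted_scores)
print(f"classical scoring function: r = {r_classical:.2f}")
print(f"targeted scoring function:  r = {r_targeted:.2f}")
```

In this toy example the targeted scores track the experimental values closely while the classical scores do not, mirroring the kind of contrast the abstract reports.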


Author(s):  
Augusto Cerqua ◽  
Roberta Di Stefano ◽  
Marco Letta ◽  
Sara Miccoli

Estimates of the real death toll of the COVID-19 pandemic have proven to be problematic in many countries, Italy being no exception. Mortality estimates at the local level are even more uncertain as they require stringent conditions, such as granularity and accuracy of the data at hand, which are rarely met. The “official” approach adopted by public institutions to estimate the “excess mortality” during the pandemic draws on a comparison between observed all-cause mortality data for 2020 and averages of mortality figures in the past years for the same period. In this paper, we apply the recently developed machine learning control method to build a more realistic counterfactual scenario of mortality in the absence of COVID-19. We demonstrate that supervised machine learning techniques outperform the official method by substantially improving the prediction accuracy of the local mortality in “ordinary” years, especially in small- and medium-sized municipalities. We then apply the best-performing algorithms to derive estimates of local excess mortality for the period between February and September 2020. Such estimates allow us to provide insights about the demographic evolution of the first wave of the pandemic throughout the country. To help improve diagnostic and monitoring efforts, our dataset is freely available to the research community.
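The averaging approach described in this abstract reduces to subtracting a historical baseline from observed deaths, period by period. The sketch below shows that arithmetic with hypothetical counts for a single municipality; the function names and figures are assumptions for illustration. The machine learning control method would replace the averaged baseline with model predictions of the counterfactual, which is not shown here.

```python
from statistics import mean

# Hypothetical death counts for one municipality, same three months each year.
past_years = {
    2017: [30, 28, 26],
    2018: [32, 27, 25],
    2019: [28, 29, 27],
}
observed_2020 = [31, 45, 52]


def historical_baseline(history):
    """Average deaths per period across past years (the averaging approach)."""
    return [mean(period) for period in zip(*history.values())]


def excess_mortality(observed, counterfactual):
    """Observed minus expected deaths, period by period."""
    return [obs - exp for obs, exp in zip(observed, counterfactual)]


baseline = historical_baseline(past_years)
excess = excess_mortality(observed_2020, baseline)
print("baseline:", baseline)
print("excess deaths:", excess)
```

Because `excess_mortality` accepts any counterfactual series, the same function can be reused to compare the averaged baseline against predictions from a trained model.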

