Training of Machine Learning Models for Recurrence Prediction in Patients with Respiratory Pathologies

2021 ◽  
Vol 7 (1) ◽  
pp. 20
Author(s):  
Ainhoa Molinero Rodríguez ◽  
Carla Guerra Tort ◽  
Victoria Suárez Ulloa ◽  
José M. López Gestal ◽  
Javier Pereira ◽  
...  

Information extracted from electronic health records (EHRs) is used for predictive tasks and clinical pattern recognition. Machine learning techniques also allow the extraction of knowledge from EHRs. This study is a continuation of previous work in which EHRs were exploited to make predictions about patients with respiratory diseases. In this study, we attempt to predict recurrence in patients with respiratory diseases using four different machine learning algorithms.

2020 ◽  
Vol 17 (8) ◽  
pp. 3776-3781
Author(s):  
M. Adimoolam ◽  
Raghav Sharma ◽  
A. John ◽  
M. Suresh Kumar ◽  
K. Ashok Kumar

In the past few decades, human beings have experienced a tremendous intensification of interaction, in particular on micro-blogging websites and various social media as online resources. Many kinds of data are in use, and classifying data in order to group and store it is challenging in this real-world scenario. Various machine learning and Natural Language Processing (NLP) techniques have been applied to analyse sentiment. This work concentrates on using several machine learning algorithms to perform sentiment analysis and on comparing various machine learning models for sentiment classification. It analyses various sentiments using multiple classification methods. From the evaluation of this experiment, it can be concluded that NLP and machine learning techniques are efficient for sentiment analysis.


2021 ◽  
Vol 99 (Supplement_3) ◽  
pp. 44-44
Author(s):  
Dan Tulpan

Abstract This is a hands-on workshop offered as a pre-conference training opportunity for researchers interested in applying machine learning techniques to animal science datasets with the purpose of classifying, clustering, performing linear and non-linear regressions or selecting a subset of features relevant to further studies. The objective of this workshop is to provide the audience with a way to formulate a problem such that it will be solvable by machine learning techniques and apply an exploratory analysis of various machine learning algorithms on different datasets. The workshop is structured in a hands-on format and includes a brief overview of basic notions about machine learning, a description of relevant models and evaluation metrics followed by a practical session. The practical session requires each attendee to bring their own laptop and have already installed the Waikato Environment for Knowledge Analysis (Weka) workbench for machine learning available from https://www.cs.waikato.ac.nz/ml/weka/ and all freely available machine learning models. The Weka installation of freely available machine learning models can be achieved by using the Weka Package Manager available from the Tools menu in the main application. Detailed information will be provided before the beginning of the workshop at the following URL: http://animalbiosciences.uoguelph.ca/~dtulpan/conferences/asas2021_mlworkshop/


2020 ◽  
Vol 28 (2) ◽  
pp. 253-265 ◽  
Author(s):  
Gabriela Bitencourt-Ferreira ◽  
Amauri Duarte da Silva ◽  
Walter Filgueira de Azevedo

Background: The elucidation of the structure of cyclin-dependent kinase 2 (CDK2) made it possible to develop targeted scoring functions for virtual screening aimed at identifying new inhibitors for this enzyme. CDK2 is a protein target for the development of drugs intended to modulate cell-cycle progression and control. Such drugs have potential anticancer activities. Objective: Our goal here is to review recent applications of machine learning methods to predict ligand-binding affinity for protein targets. To assess the predictive performance of classical scoring functions and targeted scoring functions, we focused our analysis on CDK2 structures. Methods: We have experimental structural data for hundreds of binary complexes of CDK2 with different ligands, many of them with inhibition constant information. We investigate here computational methods to calculate the binding affinity of CDK2 through classical scoring functions and machine-learning models. Results: Analysis of the predictive performance of classical scoring functions available in docking programs such as Molegro Virtual Docker, AutoDock4, and AutoDock Vina indicated that these methods failed to predict binding affinity with significant correlation with experimental data. Targeted scoring functions developed through supervised machine learning techniques showed a significant correlation with experimental data. Conclusion: Here, we described the application of supervised machine learning techniques to generate a scoring function to predict binding affinity. Machine learning models showed superior predictive performance when compared with classical scoring functions. Analysis of the computational models obtained through machine learning could capture essential structural features responsible for binding affinity against CDK2.


2021 ◽  
Vol 11 (3) ◽  
pp. 1323
Author(s):  
Medard Edmund Mswahili ◽  
Min-Jeong Lee ◽  
Gati Lother Martin ◽  
Junghyun Kim ◽  
Paul Kim ◽  
...  

Cocrystals are of much interest in industrial applications as well as academic research, and screening of suitable coformers for active pharmaceutical ingredients is the most crucial and challenging step in cocrystal development. Recently, machine learning techniques have attracted researchers in many fields, including pharmaceutical research such as quantitative structure-activity/property relationships. In this paper, we develop machine learning models to predict cocrystal formation. We extract descriptor values from the simplified molecular-input line-entry system (SMILES) representation of compounds and compare the machine learning models by experiments with our collected data of 1476 instances. As a result, we found that the artificial neural network shows great potential, as it has the best accuracy, sensitivity, and F1 score. We also found that the model achieved comparable performance with about half of the descriptors chosen by feature selection algorithms. We believe that this will contribute to faster and more accurate cocrystal development.
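As an illustration of the three metrics named in this abstract, the following is a minimal sketch of how accuracy, sensitivity, and F1 score can be computed from binary predictions. The function name `binary_metrics` and the toy labels are hypothetical and are not taken from the paper; the label 1 is assumed to mean "cocrystal forms".

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity (recall) and F1 for binary labels.

    Assumption: label 1 is the positive class (e.g. "cocrystal forms").
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"accuracy": accuracy, "sensitivity": sensitivity, "f1": f1}


# Hypothetical toy labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Sensitivity only looks at the positive class, so it is usually reported alongside accuracy and F1 when the two classes are not equally frequent.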


Author(s):  
Daniel Elton ◽  
Zois Boukouvalas ◽  
Mark S. Butrico ◽  
Mark D. Fuge ◽  
Peter W. Chung

We present a proof of concept that machine learning techniques can be used to predict the properties of CNOHF energetic molecules from their molecular structures. We focus on a small but diverse dataset consisting of 109 molecular structures spread across ten compound classes. Up until now, candidate molecules for energetic materials have been screened using predictions from expensive quantum simulations and thermochemical codes. We present a comprehensive comparison of machine learning models and several molecular featurization methods - sum over bonds, custom descriptors, Coulomb matrices, bag of bonds, and fingerprints. The best featurization was sum over bonds (bond counting), and the best model was kernel ridge regression. Despite having a small data set, we obtain acceptable errors and Pearson correlations for the prediction of detonation pressure, detonation velocity, explosive energy, heat of formation, density, and other properties out of sample. By including another dataset with 309 additional molecules in our training we show how the error can be pushed lower, although the convergence with number of molecules is slow. Our work paves the way for future applications of machine learning in this domain, including automated lead generation and interpreting machine learning models to obtain novel chemical insights.
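The "sum over bonds" (bond counting) featurization mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: each molecule is given as a hand-written list of `(atom, atom, bond_order)` tuples, and the helper names `bond_type` and `sum_over_bonds` are hypothetical rather than the authors' code. The resulting count vectors would then be passed to a regressor such as kernel ridge regression, which is not shown here.

```python
from collections import Counter

# Assumed notation for bond orders: single, double, triple.
BOND_SYMBOL = {1: "-", 2: "=", 3: "#"}


def bond_type(atom_a, atom_b, order):
    """Canonical label for a bond, independent of atom ordering."""
    first, second = sorted((atom_a, atom_b))
    return f"{first}{BOND_SYMBOL[order]}{second}"


def sum_over_bonds(molecules):
    """Count each bond type per molecule.

    Returns (vocabulary, features), where features[i][j] is the number of
    bonds of type vocabulary[j] in molecule i.
    """
    counts = [Counter(bond_type(*bond) for bond in mol) for mol in molecules]
    vocabulary = sorted(set().union(*counts))
    features = [[c[b] for b in vocabulary] for c in counts]
    return vocabulary, features


# Hypothetical toy inputs: simple Lewis-structure bond lists.
nitromethane = [("C", "H", 1)] * 3 + [("C", "N", 1), ("N", "O", 2), ("N", "O", 1)]
methylamine = [("C", "H", 1)] * 3 + [("C", "N", 1), ("N", "H", 1), ("N", "H", 1)]

vocabulary, features = sum_over_bonds([nitromethane, methylamine])
```

Because the vocabulary is built from the molecules supplied, the feature length depends on the dataset; a fixed vocabulary would be needed to featurize new molecules consistently.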


Author(s):  
Pratik Vyas ◽  
Diptangshu Pandit

The use of machine learning techniques in predictive health care is on the rise, with minimal data used to train machine learning models that derive high-accuracy predictions. In this paper, we propose such a system, which utilizes Heart Rate Variability (HRV) as features for training machine learning models. This paper further benchmarks the usefulness of HRV as features calculated from basic heart-rate data using a window shifting method. The benchmarking has been conducted using different machine learning classifiers such as artificial neural network, decision tree, k-nearest neighbour and naive Bayes classifiers. Empirical results using the MIT-BIH Arrhythmia database show that the proposed system can be used for highly efficient prediction of abnormalities in heartbeat data series.
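A window shifting approach to HRV features might look like the sketch below. The abstract does not say which HRV measures are used, so the choice of mean RR interval, SDNN, and RMSSD (two common time-domain HRV measures) is an assumption, as are the function names, the window size, the step, and the toy RR intervals.

```python
import math


def hrv_features(rr_ms):
    """Time-domain HRV features for one window of RR intervals (milliseconds)."""
    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n
    # SDNN: sample standard deviation of the RR intervals.
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))
    # RMSSD: root mean square of successive differences.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {"mean_rr": mean_rr, "sdnn": sdnn, "rmssd": rmssd}


def sliding_hrv(rr_ms, window_size, step):
    """Shift a fixed-size window over the series and featurize each position."""
    if window_size < 2 or step < 1:
        raise ValueError("window_size must be >= 2 and step >= 1")
    last_start = len(rr_ms) - window_size
    return [
        hrv_features(rr_ms[start:start + window_size])
        for start in range(0, last_start + 1, step)
    ]


# Hypothetical RR intervals in ms; the 1200 ms value mimics an irregular beat.
rr = [800, 810, 790, 805, 1200, 795, 800, 810]
windows = sliding_hrv(rr, window_size=4, step=2)
```

Each element of `windows` is one feature row that could be fed to a classifier; windows containing the irregular beat show a much larger SDNN than the regular ones.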


2020 ◽  
Vol 2 (2) ◽  
pp. 106-119
Author(s):  
Subasish Das ◽  
Minh Le ◽  
Boya Dai

Abstract Crash occurrence is a complex phenomenon, and crashes associated with pedestrians and bicyclists are even more complex. Furthermore, pedestrian- and bicyclist-involved crashes are typically not reported in detail in state or national crash databases. To address this issue, developers created the Pedestrian and Bicycle Crash Analysis Tool (PBCAT). However, it is labour-intensive to manually identify the types of pedestrian and bicycle crash from crash-narrative reports and to classify different crash attributes from the textual content of police reports. Therefore, there is a need for a supporting tool that can assist practitioners in using PBCAT more efficiently and accurately. The objective of this study is to develop a framework for applying machine-learning models to classify crash types from unstructured textual content. In this study, the research team collected pedestrian crash-typing data from two locations in Texas. The XGBoost model was found to be the best classifier. The high prediction power of the XGBoost classifiers indicates that this machine-learning technique was able to classify pedestrian crash types with the highest accuracy rate (up to 77% for training data and 72% for test data). The findings demonstrate that advanced machine-learning models can extract underlying patterns and trends of crash mechanisms. This provides the basis for applying machine-learning techniques in addressing the crash typing issues associated with non-motorist crashes.


2020 ◽  
Vol 98 (Supplement_4) ◽  
pp. 44-45
Author(s):  
Dan Tulpan

Abstract This is a hands-on workshop offered as a pre-conference training opportunity for researchers interested in applying machine learning techniques to animal science datasets with the purpose of classifying, clustering, performing linear and non-linear regressions or selecting a subset of features relevant to further studies. The objective of this workshop is to provide the audience with a way to formulate a problem such that it will be solvable by machine learning techniques and apply an exploratory analysis of various machine learning algorithms on different datasets. The workshop is structured in a hands-on format and includes a brief overview of basic notions about machine learning, a description of relevant models and evaluation metrics followed by a practical session. The practical session requires each attendee to bring their own laptop and have already installed the Waikato Environment for Knowledge Analysis (Weka) workbench for machine learning available from https://www.cs.waikato.ac.nz/ml/weka/ and all freely available machine learning models. The Weka installation of freely available machine learning models can be achieved by using the Weka Package Manager available from the Tools menu in the main application. Detailed information will be provided 2 weeks before the beginning of the workshop (week of July 5, 2020) at the following URL: http://animalbiosciences.uoguelph.ca/~dtulpan/conferences/asas2020_mlworkshop/


Author(s):  
Antonio Bella ◽  
Cèsar Ferri ◽  
José Hernández-Orallo ◽  
María José Ramírez-Quintana

The evaluation of machine learning models is a crucial step before their application because it is essential to assess how well a model will behave for every single case. In many real applications, not only is it important to know the “total” or the “average” error of the model, it is also important to know how this error is distributed and how well confidence or probability estimations are made. Many current machine learning techniques are good in overall results but have a bad distribution assessment of the error. For these cases, calibration techniques have been developed as postprocessing techniques in order to improve the probability estimation or the error distribution of an existing model. This chapter presents the most common calibration techniques and calibration measures. Both classification and regression are covered, and a taxonomy of calibration techniques is established. Special attention is given to probabilistic classifier calibration.
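To make the idea of a calibration measure concrete, here is a minimal sketch of two commonly used ones for probabilistic binary classifiers: the Brier score and a binned calibration error. Whether the chapter covers these exact measures is not stated in the abstract, so treat the selection, the function names, the equal-width binning, and the toy data as assumptions.

```python
def brier_score(probs, labels):
    """Mean squared difference between predicted probability and outcome (0/1)."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def binned_calibration_error(probs, labels, n_bins=5):
    """Weighted average gap between mean predicted probability and observed
    positive rate, computed over equal-width probability bins."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, y))

    error = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        positive_rate = sum(y for _, y in members) / len(members)
        error += (len(members) / len(probs)) * abs(mean_prob - positive_rate)
    return error


# Hypothetical predictions: the 0.9 bin is overconfident (only half are positive).
probs = [0.1, 0.1, 0.9, 0.9]
labels = [0, 0, 1, 0]
brier = brier_score(probs, labels)
calibration_gap = binned_calibration_error(probs, labels)
```

The Brier score mixes calibration and discrimination into one number, while the binned measure isolates how far stated probabilities are from observed frequencies, which is the quantity that postprocessing calibration techniques aim to reduce.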


2021 ◽  
Author(s):  
Salman Sadeg Deumah ◽  
Wahib Ali Yahya ◽  
Abbas Mohamed Al-khudafi ◽  
Khaled Saeed Ba-Jaalah ◽  
Waleed Tawfeeq Al-Absi

Abstract Gas viscosity is an important physical property that controls and influences the flow of gas through porous media and pipe networks. An accurate gas viscosity model is essential for use with reservoir and process simulators. The objective of this study is to assess the predictability of the gas viscosity of Yemeni gas fields using machine learning techniques. The performance of several machine learning techniques in predicting gas viscosity is investigated in this work. The techniques include K-nearest neighbors (KNN), Random Forest (RF), Multiple Linear Regression (MLR), and Decision Tree (DT). About 440 data points collected from different Yemeni gas fields were used to develop the machine learning models. The input data used in training include pressure, temperature, gas density, specific gravity, gas formation volume factor, gas deviation factor, gas molecular weight, pseudo-reduced temperature and pressure, pseudo-critical temperature and pressure, and non-hydrocarbon gas components (N2, CO2, and H2S). Part of the data (75%) was used to train the models, while the remaining part (25%) was used to predict the viscosity of gas samples. The machine learning models were built using the Python programming language. Their performance and accuracy were tested and compared on four different functional input datasets. The study found that the DT model predicted the gas viscosity with higher accuracy than the other models when using the input parameters of datasets (A) and (B). This was evidenced by a lower root mean square error (0.000832), a lower mean absolute percent relative error (0.042%), and a higher coefficient of determination (R² = 0.9465).
The approach proposed in the present study provides an accurate and inexpensive model for estimating gas viscosity as a function of all input parameters of dataset (A). Overall, the relative effects of the different input parameters indicate that gas viscosity is most strongly related to gas density and specific gravity, which have the highest percentage (51%).
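The evaluation protocol described in this abstract (a 75%/25% split followed by root mean square error, mean absolute percent relative error, and coefficient of determination) can be sketched as below. This is not the authors' code: the function names, the random seed, and the toy measured/predicted values are hypothetical, and the regressors themselves (KNN, RF, MLR, DT) are left out.

```python
import math
import random


def split_75_25(rows, seed=0):
    """Shuffle a copy of rows and return (train, test) with a 75/25 split."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(0.75 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mean_abs_pct_rel_error(y_true, y_pred):
    """Mean absolute percent relative error, in percent."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# 440 placeholder row indices stand in for the data points of the study.
train_rows, test_rows = split_75_25(range(440))

# Hypothetical measured vs. predicted values, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.0, 4.2]
scores = {
    "rmse": rmse(measured, predicted),
    "mapre": mean_abs_pct_rel_error(measured, predicted),
    "r2": r_squared(measured, predicted),
}
```

With 440 rows, this split yields 330 training rows and 110 test rows; the three scores are computed on held-out predictions only, so that they reflect performance on samples the model has not seen.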

