REDIAL-2020: A Suite of Machine Learning Models to Estimate Anti-SARS-CoV-2 Activities

<p>Strategies for drug discovery and repositioning are urgently needed for COVID-19. We developed "REDIAL-2020", a suite of machine learning models for estimating small molecule activity from molecular structure, for a range of SARS-CoV-2 related assays. Each classifier is based on three distinct types of descriptors (fingerprint, physicochemical, and pharmacophore) for parallel model development. These models were trained using high throughput screening data from the NCATS COVID19 portal (https://opendata.ncats.nih.gov/covid19/index.html), with multiple categorical machine learning algorithms. The “best models” are combined in an ensemble consensus predictor that outperforms single models where external validation is available. This suite of machine learning models is available through the DrugCentral web portal (<a href="https://drugdiscovery.utep.edu/redial">http://drugcentral.org/Redial</a>). Accepted inputs are a drug name, a PubChem CID, or a SMILES string; the output is an estimate of anti-SARS-CoV-2 activities. The web application reports estimated activity across three areas (<i>viral entry</i>, <i>viral replication</i>, and <i>live virus infectivity</i>) spanning six independent models, followed by a similarity search that displays the most similar molecules to the query among experimentally determined data. The ML models have 60% to 74% external predictivity, based on three separate datasets. Complementing the NCATS COVID19 portal, REDIAL-2020 can serve as a rapid online tool for identifying active molecules for COVID-19 treatment. The source code and specific models are available through GitHub (<a href="https://github.com/sirimullalab/ncats_covid">https://github.com/sirimullalab/redial-2020</a>), or via Docker Hub (https://hub.docker.com/r/sirimullalab/redial-2020) for users preferring a containerized version.</p>
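The abstract says that the best models are combined in an ensemble consensus predictor but does not state the combination rule. The sketch below assumes simple majority voting over one binary classifier per descriptor type; the function name `consensus_vote` and the example votes are hypothetical and are not taken from the REDIAL-2020 code base.

```python
from collections import Counter


def consensus_vote(predictions):
    """Return the majority label among per-descriptor predictions.

    `predictions` maps a descriptor type to a binary label
    (1 = active, 0 = inactive). Majority voting is an assumption here,
    not the rule documented for REDIAL-2020.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions.values())
    # With an odd number of binary voters a tie cannot occur;
    # otherwise the label seen first wins.
    label, _ = counts.most_common(1)[0]
    return label


# Hypothetical outputs of three descriptor-specific classifiers for one molecule.
votes = {"fingerprint": 1, "physicochemical": 0, "pharmacophore": 1}
result = consensus_vote(votes)
print(result)
```

Using an odd number of voters, as with the three descriptor types, avoids ties for a binary active/inactive call.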


REDIAL-2020: A Suite of Machine Learning Models to Estimate Anti-SARS-CoV-2 Activities

10.26434/chemrxiv.12915779.v1 ◽

2020 ◽

Author(s):

Govinda KC ◽

Giovanni Bocci ◽

Srijan Verma ◽

Mahmudulla Hassan ◽

Jayme Holmes ◽

...

Keyword(s):

Machine Learning ◽

High Throughput Screening ◽

Web Application ◽

External Validation ◽

Model Development ◽

Machine Learning Algorithms ◽

Virus Infectivity ◽

Learning Models ◽

Live Virus ◽

Machine Learning Models



REDIAL-2020: A suite of machine learning models to estimate Anti-SARS-CoV-2 activities

10.21203/rs.3.rs-76894/v1 ◽

2020 ◽

Author(s):

Govinda KC ◽

Giovanni Bocci ◽

Srijan Verma ◽

Md Hassan ◽

Jayme Holmes ◽

...

Keyword(s):

Machine Learning ◽

High Throughput Screening ◽

Web Application ◽

External Validation ◽

Model Development ◽

Machine Learning Algorithms ◽

Virus Infectivity ◽

Learning Models ◽

Live Virus ◽

Machine Learning Models

Strategies for drug discovery and repositioning are urgently needed for COVID-19. We developed "REDIAL-2020", a suite of machine learning models for estimating small molecule activity from molecular structure, for a range of SARS-CoV-2 related assays. Each classifier is based on three distinct types of descriptors (fingerprint, physicochemical, and pharmacophore) for parallel model development. These models were trained using high throughput screening data from the NCATS COVID19 portal (https://opendata.ncats.nih.gov/covid19/index.html), with multiple categorical machine learning algorithms. The “best models” are combined in an ensemble consensus predictor that outperforms single models where external validation is available. This suite of machine learning models is available through the DrugCentral web portal (http://drugcentral.org/Redial). Accepted inputs are a drug name, a PubChem CID, or a SMILES string; the output is an estimate of anti-SARS-CoV-2 activities. The web application reports estimated activity across three areas (viral entry, viral replication, and live virus infectivity) spanning six independent models, followed by a similarity search that displays the most similar molecules to the query among experimentally determined data. The ML models have 60% to 74% external predictivity, based on three separate datasets. Complementing the NCATS COVID19 portal, REDIAL-2020 can serve as a rapid online tool for identifying active molecules for COVID-19 treatment. The source code and specific models are available through GitHub (https://github.com/sirimullalab/redial-2020), or via Docker Hub (https://hub.docker.com/r/sirimullalab/redial-2020) for users preferring a containerized version.


Prediction of aircraft estimated time of arrival using machine learning methods

The Aeronautical Journal ◽

10.1017/aer.2021.13 ◽

2021 ◽

pp. 1-15

Author(s):

O. Basturk ◽

C. Cetek

Keyword(s):

Machine Learning ◽

Web Application ◽

Absolute Error ◽

Machine Learning Algorithms ◽

Weather Data ◽

Time Of Arrival ◽

Learning Models ◽

Trajectory Data ◽

Different Sources ◽

Machine Learning Models

In this study, machine learning algorithms are used to predict aircraft Estimated Time of Arrival (ETA). Accurate prediction of ETA is important for management of delay and air traffic flow, runway assignment, gate assignment, collaborative decision making (CDM), coordination of ground personnel and equipment, and optimisation of the arrival sequence. Machine learning is able to learn from experience and make predictions with weak assumptions or no assumptions at all. In the proposed approach, general flight information, trajectory data and weather data were obtained from different sources in various formats. Raw data were converted to tidy data and inserted into a relational database. To obtain the features for training the machine learning models, the data were explored, cleaned and transformed into convenient features. New features were also derived from the available data. Random forests and deep neural networks were used to train the machine learning models. Both models can predict the ETA with a mean absolute error (MAE) of less than 6 min after departure, and less than 3 min after terminal manoeuvring area (TMA) entrance. Additionally, a web application was developed to dynamically predict the ETA using the proposed models.
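The abstract reports model quality as mean absolute error (MAE). As a minimal sketch of how that metric is computed, the example below averages the absolute differences between actual and predicted arrival times; the sample values are made up for illustration and are not data from the study.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical flight durations in minutes (illustrative values only).
actual_minutes = [42.0, 55.0, 63.0]
predicted_minutes = [40.0, 58.0, 62.0]

# Absolute errors are 2, 3 and 1 minutes, so the mean is 2 minutes.
mae = mean_absolute_error(actual_minutes, predicted_minutes)
print(f"MAE: {mae:.1f} min")
```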


A machine learning platform to estimate anti-SARS-CoV-2 activities

10.26434/chemrxiv.12915779.v3 ◽

2021 ◽

Author(s):

Govinda KC ◽

Giovanni Bocci ◽

Srijan Verma ◽

Mahmudulla Hassan ◽

Jayme Holmes ◽

...

Keyword(s):

Machine Learning ◽

Learning Strategies ◽

Viral Entry ◽

High Throughput Screening ◽

Web Application ◽

Computational Models ◽

Virus Infectivity ◽

Live Virus ◽

Learning Platform

<p>Strategies for drug discovery and repositioning are urgently needed for COVID-19. Here we present "REDIAL-2020", a suite of computational models for estimating small molecule activities in a range of SARS-CoV-2 related assays. Models were trained using publicly available high throughput screening data, employing different descriptor types and various machine learning strategies. We describe the development and usage of eleven models spanning the areas of viral entry, viral replication, live virus infectivity, in vitro infectivity and human cell toxicity. REDIAL-2020 is available as a web application through the DrugCentral web portal (http://drugcentral.org/Redial). In addition, the web app provides similarity search results that display the molecules most similar to the query, as well as associated experimental data. REDIAL-2020 can serve as a rapid online tool for identifying active molecules for COVID-19 treatment.</p>


Characterizing and Evaluating the Zoonotic Potential of Novel Viruses Discovered in Vampire Bats

Viruses ◽

10.3390/v13020252 ◽

2021 ◽

Vol 13 (2) ◽

pp. 252

Author(s):

Laura M. Bergner ◽

Nardus Mollentze ◽

Richard J. Orton ◽

Carlos Tello ◽

Alice Broos ◽

...

Keyword(s):

Machine Learning ◽

Phylogenetic Analyses ◽

Human Infection ◽

Machine Learning Algorithms ◽

Zoonotic Potential ◽

Metagenomic Sequencing ◽

Learning Models ◽

Sequencing Data ◽

Vampire Bats ◽

Machine Learning Models

The contemporary surge in metagenomic sequencing has transformed knowledge of viral diversity in wildlife. However, evaluating which newly discovered viruses pose sufficient risk of infecting humans to merit detailed laboratory characterization and surveillance remains largely speculative. Machine learning algorithms have been developed to address this imbalance by ranking the relative likelihood of human infection based on viral genome sequences, but are not yet routinely applied to viruses at the time of their discovery. Here, we characterized viral genomes detected through metagenomic sequencing of feces and saliva from common vampire bats (Desmodus rotundus) and used these data as a case study in evaluating zoonotic potential using molecular sequencing data. Of 58 detected viral families, including 17 which infect mammals, the only known zoonosis detected was rabies virus; however, additional genomes were detected from the families Hepeviridae, Coronaviridae, Reoviridae, Astroviridae and Picornaviridae, all of which contain human-infecting species. In phylogenetic analyses, novel vampire bat viruses most frequently grouped with other bat viruses that are not currently known to infect humans. In agreement, machine learning models built from only phylogenetic information ranked all novel viruses similarly, yielding little insight into zoonotic potential. In contrast, genome composition-based machine learning models estimated different levels of zoonotic potential, even for closely related viruses, categorizing one out of four detected hepeviruses and two out of three picornaviruses as having high priority for further research. We highlight the value of evaluating zoonotic potential beyond ad hoc consideration of phylogeny and provide surveillance recommendations for novel viruses in a wildlife host which has frequent contact with humans and domestic animals.


Short-Term Prediction of COVID-19 Cases Using Machine Learning Models

Applied Sciences ◽

10.3390/app11094266 ◽

2021 ◽

Vol 11 (9) ◽

pp. 4266

Author(s):

Md. Shahriare Satu ◽

Koushik Chandra Howlader ◽

Mufti Mahmud ◽

M. Shamim Kaiser ◽

Sheikh Mohammad Shariful Islam ◽

...

Keyword(s):

Machine Learning ◽

Web Application ◽

Learning Models ◽

Short Term ◽

Infected People ◽

Infected Case ◽

First Case ◽

Novel Coronavirus ◽

Short Term Prediction ◽

Machine Learning Models

The first case in Bangladesh of the novel coronavirus disease (COVID-19) was reported on 8 March 2020, with the number of confirmed cases rapidly rising to over 175,000 by July 2020. In the absence of effective treatment, an essential tool of health policy is the modeling and forecasting of the progress of the pandemic. We, therefore, developed a cloud-based machine learning short-term forecasting model for Bangladesh, in which several regression-based machine learning models were applied to infected case data to estimate the number of COVID-19-infected people over the following seven days. This approach can accurately forecast the daily number of infected cases by training on sample data from the prior 25 days recorded on our web application. The outcomes of these efforts could aid the development and assessment of prevention strategies and identify factors that most affect the spread of COVID-19 infection in Bangladesh.
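The abstract describes training regression models on the prior 25 days of case data to forecast the following seven days, without naming the specific models. The sketch below illustrates that windowing scheme with an ordinary least-squares line as a stand-in model; the functions `fit_line` and `forecast` and the synthetic case counts are assumptions for illustration, not the authors' implementation.

```python
def fit_line(ys):
    """Least-squares slope and intercept for ys observed at x = 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast(history, window=25, horizon=7):
    """Fit a line to the last `window` days and extrapolate `horizon` days ahead."""
    if len(history) < window:
        raise ValueError("not enough history for the requested window")
    recent = history[-window:]
    slope, intercept = fit_line(recent)
    # The window covers x = 0 .. window - 1, so the next day is x = window.
    return [intercept + slope * (window + step) for step in range(horizon)]


# Synthetic cumulative case counts growing by 10 per day (illustrative only).
history = [100 + 10 * day for day in range(30)]
predictions = forecast(history)
print([round(p) for p in predictions])
```

A straight line is the simplest regression to show here; any model with a fit-then-predict interface could take its place in the same sliding window.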


Model Comparison for ESP Run-Life Prediction: Classic Statistics Vs. Machine Learning

10.2118/206028-ms ◽

2021 ◽

Author(s):

Alejandro Celemín ◽

Diego A. Estupiñan ◽

Ricardo Nieto

Keyword(s):

Machine Learning ◽

Model Comparison ◽

Proportional Hazards ◽

Proportional Hazards Model ◽

Machine Learning Algorithms ◽

Slight Reduction ◽

Learning Models ◽

Validation Data ◽

Operational Conditions ◽

Machine Learning Models

Electrical submersible pump (ESP) reliability and run-life analysis have been extensively studied since the technology was developed. Current machine learning algorithms make it possible to correlate operational conditions with ESP run-life in order to generate predictions for active and new wells. Four machine learning models are compared to a linear proportional hazards model, used as a baseline. Accuracy metrics appropriate for survival analysis problems are calculated on run-life predictions vs. actual values over training and validation data subsets. Results demonstrate that, for small datasets, the baseline model is able to produce more consistent predictions than current machine learning models, with a slight reduction in its accuracy. This study demonstrates that the quality of the data and its pre-processing supports the current shift from a model-centric to a data-centric approach to machine and deep learning problems.


Machine Learning Models for Finger Bend Evaluation using Implemented Low cost Flex Sensor

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.35742 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 3605-3611

Author(s):

Pratyush Kaware

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Low Cost ◽

Learning Algorithms ◽

Cost Effective ◽

Machine Learning Algorithms ◽

Support Vector ◽

Learning Models ◽

Machine Learning Models

In this paper, a cost-effective sensor is implemented to read finger bend signals by attaching the sensor to a finger, so as to classify the signals by the degree of bend as well as the joint about which the finger is bent. Various machine learning algorithms were tested to find the most accurate and consistent classifier. We found that the Support Vector Machine was the algorithm best suited to classifying our data; using it, we were able to predict the live state of a finger, i.e., the degree of bend and the joints involved. The live voltage values from the sensor were transmitted using a NodeMCU micro-controller, converted to digital values and uploaded to a database for analysis.
