scholarly journals Machine Learning Models Identify Inhibitors of SARS-CoV-2

2020 ◽  
Author(s):  
Victor O. Gawriljuk ◽  
Phyo Phyo Kyaw Zin ◽  
Daniel H. Foil ◽  
Jean Bernatchez ◽  
Sungjun Beck ◽  
...  

AbstractWith the ongoing SARS-CoV-2 pandemic there is an urgent need for the discovery of a treatment for the coronavirus disease (COVID-19). Drug repurposing is one of the most rapid strategies for addressing this need and numerous compounds have been selected for in vitro testing by several groups already. These have led to a growing database of molecules with in vitro activity against the virus. Machine learning models can assist drug discovery through prediction of the best compounds based on previously published data. Herein we have implemented several machine learning methods to develop predictive models from recent SARS-CoV-2 in vitro inhibition data and used them to prioritize additional FDA approved compounds for in vitro testing selected from our in-house compound library. From the compounds predicted with a Bayesian machine learning model, CPI1062 and CPI1155 showed antiviral activity in HeLa-ACE2 cell-based assays and represent potential repurposing opportunities for COVID-19. This approach can be greatly expanded to exhaustively virtually screen available molecules with predicted activity against this virus as well as a prioritization tool for SARS-CoV-2 antiviral drug discovery programs. The very latest model for SARS-CoV-2 is available at www.assaycentral.org.

ACS Omega ◽  
2019 ◽  
Vol 4 (1) ◽  
pp. 2353-2361 ◽  
Author(s):  
Manu Anantpadma ◽  
Thomas Lane ◽  
Kimberley M. Zorn ◽  
Mary A. Lingerfelt ◽  
Alex M. Clark ◽  
...  

Author(s):  
Eelke B. Lenselink ◽  
Pieter F. W. Stouten

AbstractAccurate prediction of lipophilicity—logP—based on molecular structures is a well-established field. Predictions of logP are often used to drive forward drug discovery projects. Driven by the SAMPL7 challenge, in this manuscript we describe the steps that were taken to construct a novel machine learning model that can predict and generalize well. This model is based on the recently described Directed-Message Passing Neural Networks (D-MPNNs). Further enhancements included: both the inclusion of additional datasets from ChEMBL (RMSE improvement of 0.03), and the addition of helper tasks (RMSE improvement of 0.04). To the best of our knowledge, the concept of adding predictions from other models (Simulations Plus logP and [email protected], respectively) as helper tasks is novel and could be applied in a broader context. The final model that we constructed and used to participate in the challenge ranked 2/17 ranked submissions with an RMSE of 0.66, and an MAE of 0.48 (submission: Chemprop). On other datasets the model also works well, especially retrospectively applied to the SAMPL6 challenge where it would have ranked number one out of all submissions (RMSE of 0.35). Despite the fact that our model works well, we conclude with suggestions that are expected to improve the model even further.


PLoS ONE ◽  
2015 ◽  
Vol 10 (10) ◽  
pp. e0141076 ◽  
Author(s):  
Sean Ekins ◽  
Peter B. Madrid ◽  
Malabika Sarker ◽  
Shao-Gang Li ◽  
Nisha Mittal ◽  
...  

2018 ◽  
Vol 211 ◽  
pp. 17009
Author(s):  
Natalia Espinoza Sepulveda ◽  
Jyoti Sinha

The development of technologies for the maintenance industry has taken an important role to meet the demanding challenges. One of the important challenges is to predict the defects, if any, in machines as early as possible to manage the machines downtime. The vibration-based condition monitoring (VCM) is well-known for this purpose but requires the human experience and expertise. The machine learning models using the intelligent systems and pattern recognition seem to be the future avenue for machine fault detection without the human expertise. Several such studies are published in the literature. This paper is also on the machine learning model for the different machine faults classification and detection. Here the time domain and frequency domain features derived from the measured machine vibration data are used separated in the development of the machine learning models using the artificial neutral network method. The effectiveness of both the time and frequency domain features based models are compared when they are applied to an experimental rig. The paper presents the proposed machine learning models and their performance in terms of the observations and results.


In pharmaceutical research, traditional drug discovery process is time consuming and expensive, where several compounds are experimentally tested for their biological activities. Series of lab experiments are conducted to analyze newly synthesized drug’s pharmaceutical activities and its biological effects on human. With every new drug discovery, the required clinical properties can be determined using machine learning models and this greatly reduces the experimental cost. This paper explores parametric and non-parametric machine learning models to classify administration properties of drugs and its toxicity. The multinomial classification of drugs was based on their physicochemical and ADMET properties. Balanced data samples were drawn from chEMBL and was pre-processed. Features were reduced using Recursive Feature Elimination and the attributes were ranked based on their importance to reduce highly correlated attributes. The performance of parametric and non-parametric machine learning models was analyzed on cheminformatic data that includes physiochemical, biological and pharmaceutical properties of the drug molecules. Selecting the potent drug candidate along with its administration properties greatly reduces wet lab experimental time and cost. Multiclass classification can be determined efficiently using non-parametric machine learning model. Optimal feature engineering, tuning hyperparameters and adopting hybrid algorithms would result in more accurate predictions in future for cheminformatics data.


Data is the most crucial component of a successful ML system. Once a machine learning model is developed, it gets obsolete over time due to presence of new input data being generated every second. In order to keep our predictions accurate we need to find a way to keep our models up to date. Our research work involves finding a mechanism which can retrain the model with new data automatically. This research also involves exploring the possibilities of automating machine learning processes. We started this project by training and testing our model using conventional machine learning methods. The outcome was then compared with the outcome of those experiments conducted using the AutoML methods like TPOT. This helped us in finding an efficient technique to retrain our models. These techniques can be used in areas where people do not deal with the actual working of a ML model but only require the outputs of ML processes


2016 ◽  
Vol 7 (2) ◽  
pp. 43-71 ◽  
Author(s):  
Sangeeta Lal ◽  
Neetu Sardana ◽  
Ashish Sureka

Logging is an important yet tough decision for OSS developers. Machine-learning models are useful in improving several steps of OSS development, including logging. Several recent studies propose machine-learning models to predict logged code construct. The prediction performances of these models are limited due to the class-imbalance problem since the number of logged code constructs is small as compared to non-logged code constructs. No previous study analyzes the class-imbalance problem for logged code construct prediction. The authors first analyze the performances of J48, RF, and SVM classifiers for catch-blocks and if-blocks logged code constructs prediction on imbalanced datasets. Second, the authors propose LogIm, an ensemble and threshold-based machine-learning model. Third, the authors evaluate the performance of LogIm on three open-source projects. On average, LogIm model improves the performance of baseline classifiers, J48, RF, and SVM, by 7.38%, 9.24%, and 4.6% for catch-blocks, and 12.11%, 14.95%, and 19.13% for if-blocks logging prediction.


Sign in / Sign up

Export Citation Format

Share Document