Machine Learning Models Identify Inhibitors of SARS-CoV-2

Abstract: With the ongoing SARS-CoV-2 pandemic, there is an urgent need for the discovery of a treatment for the coronavirus disease (COVID-19). Drug repurposing is one of the most rapid strategies for addressing this need, and numerous compounds have already been selected for in vitro testing by several groups. These efforts have led to a growing database of molecules with in vitro activity against the virus. Machine learning models can assist drug discovery through prediction of the best compounds based on previously published data. Herein we have implemented several machine learning methods to develop predictive models from recent SARS-CoV-2 in vitro inhibition data and used them to prioritize additional FDA-approved compounds, selected from our in-house compound library, for in vitro testing. From the compounds predicted with a Bayesian machine learning model, CPI1062 and CPI1155 showed antiviral activity in HeLa-ACE2 cell-based assays and represent potential repurposing opportunities for COVID-19. This approach can be greatly expanded to exhaustively screen available molecules in silico for predicted activity against this virus, and it can serve as a prioritization tool for SARS-CoV-2 antiviral drug discovery programs. The very latest model for SARS-CoV-2 is available at www.assaycentral.org.

Ebola Virus Bayesian Machine Learning Models Enable New in Vitro Leads

ACS Omega ◽

10.1021/acsomega.8b02948 ◽

2019 ◽

Vol 4 (1) ◽

pp. 2353-2361 ◽

Cited By ~ 17

Author(s):

Manu Anantpadma ◽

Thomas Lane ◽

Kimberley M. Zorn ◽

Mary A. Lingerfelt ◽

Alex M. Clark ◽

...

Keyword(s):

Machine Learning ◽

Ebola Virus ◽

Learning Models ◽

Bayesian Machine Learning ◽

Machine Learning Models

Multitask machine learning models for predicting lipophilicity (logP) in the SAMPL7 challenge

Journal of Computer-Aided Molecular Design ◽

10.1007/s10822-021-00405-6 ◽

2021 ◽

Author(s):

Eelke B. Lenselink ◽

Pieter F. W. Stouten

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Drug Discovery ◽

Message Passing ◽

Learning Model ◽

Molecular Structures ◽

Learning Models ◽

Final Model ◽

Machine Learning Model ◽

Machine Learning Models

Abstract: Accurate prediction of lipophilicity (logP) based on molecular structures is a well-established field. Predictions of logP are often used to drive forward drug discovery projects. Driven by the SAMPL7 challenge, in this manuscript we describe the steps that were taken to construct a novel machine learning model that can predict and generalize well. This model is based on the recently described Directed-Message Passing Neural Networks (D-MPNNs). Further enhancements included both the inclusion of additional datasets from ChEMBL (RMSE improvement of 0.03) and the addition of helper tasks (RMSE improvement of 0.04). To the best of our knowledge, the concept of adding predictions from other models (Simulations Plus logP and logD@pH7.4, respectively) as helper tasks is novel and could be applied in a broader context. The final model that we constructed and used to participate in the challenge ranked 2nd out of 17 ranked submissions, with an RMSE of 0.66 and an MAE of 0.48 (submission: Chemprop). The model also works well on other datasets, especially when retrospectively applied to the SAMPL6 challenge, where it would have ranked number one out of all submissions (RMSE of 0.35). Although our model works well, we conclude with suggestions that are expected to improve it even further.
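The RMSE and MAE figures quoted in this abstract are standard regression error metrics. As a minimal sketch of how they are computed, the following uses made-up logP values; the function names and the numbers are illustrative assumptions, not data or code from the SAMPL7 submission.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


# Hypothetical logP values for illustration only (not SAMPL7 data).
experimental = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.0, 4.5]

print(f"RMSE = {rmse(experimental, predicted):.3f}")
print(f"MAE  = {mae(experimental, predicted):.3f}")
```

Because RMSE squares each error before averaging, a single large miss raises it more than it raises MAE, which is why the two numbers are usually reported together.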

Combining Metabolite-Based Pharmacophores with Bayesian Machine Learning Models for Mycobacterium tuberculosis Drug Discovery

PLoS ONE ◽

10.1371/journal.pone.0141076 ◽

2015 ◽

Vol 10 (10) ◽

pp. e0141076 ◽

Cited By ~ 6

Author(s):

Sean Ekins ◽

Peter B. Madrid ◽

Malabika Sarker ◽

Shao-Gang Li ◽

Nisha Mittal ◽

...

Keyword(s):

Machine Learning ◽

Mycobacterium Tuberculosis ◽

Drug Discovery ◽

Learning Models ◽

Bayesian Machine Learning ◽

Machine Learning Models

Comparison of machine learning models based on time domain and frequency domain features for faults diagnosis in rotating machines

MATEC Web of Conferences ◽

10.1051/matecconf/201821117009 ◽

2018 ◽

Vol 211 ◽

pp. 17009

Author(s):

Natalia Espinoza Sepulveda ◽

Jyoti Sinha

Keyword(s):

Machine Learning ◽

Frequency Domain ◽

Time Domain ◽

Intelligent Systems ◽

Learning Models ◽

Machine Vibration ◽

Vibration Data ◽

Machine Learning Model ◽

The Time Domain ◽

Machine Learning Models

The development of technologies for the maintenance industry plays an important role in meeting demanding challenges. One important challenge is to detect any defects in machines as early as possible in order to manage machine downtime. Vibration-based condition monitoring (VCM) is well known for this purpose but requires human experience and expertise. Machine learning models using intelligent systems and pattern recognition seem to be the future avenue for machine fault detection without human expertise. Several such studies have been published in the literature. This paper also concerns a machine learning model for the classification and detection of different machine faults. Here, the time domain and frequency domain features derived from the measured machine vibration data are used separately in the development of the machine learning models using the artificial neural network method. The effectiveness of the time domain and frequency domain feature based models is compared when they are applied to an experimental rig. The paper presents the proposed machine learning models and their performance in terms of the observations and results.
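To make the distinction between time domain and frequency domain features concrete, here is a minimal sketch that computes a few commonly used features from a vibration signal. The specific features (RMS, peak, crest factor, kurtosis, dominant frequency), the function names, and the synthetic 50 Hz test signal are assumptions chosen for illustration; the abstract does not say which features the paper uses.

```python
import cmath
import math


def time_domain_features(signal):
    """Summary statistics computed directly on the samples.

    Assumes a non-constant signal (non-zero variance and RMS).
    """
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak = max(abs(x) for x in signal)
    variance = sum((x - mean) ** 2 for x in signal) / n
    kurtosis = sum((x - mean) ** 4 for x in signal) / (n * variance ** 2)
    return {"rms": rms, "peak": peak,
            "crest_factor": peak / rms, "kurtosis": kurtosis}


def magnitude_spectrum(signal, sample_rate):
    """Naive O(n^2) DFT; returns (frequency_hz, magnitude) up to Nyquist."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        spectrum.append((k * sample_rate / n, abs(coeff)))
    return spectrum


def dominant_frequency(signal, sample_rate):
    """Frequency of the largest non-DC spectral component."""
    spectrum = magnitude_spectrum(signal, sample_rate)[1:]  # skip the DC bin
    return max(spectrum, key=lambda pair: pair[1])[0]


# Synthetic stand-in for measured vibration data: a 50 Hz sine sampled at 1 kHz.
SAMPLE_RATE = 1000
vibration = [math.sin(2 * math.pi * 50 * i / SAMPLE_RATE) for i in range(200)]

print(time_domain_features(vibration))
print(dominant_frequency(vibration, SAMPLE_RATE))
```

In practice a fast Fourier transform from a numerical library would replace the naive DFT; the loop is used here only to keep the sketch dependency-free.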

Multivariate Classification of Drugs using Parametric and Nonparametric Machine Learning Models

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.c8740.019320 ◽

2020 ◽

Vol 9 (3) ◽

pp. 2021-2027

Keyword(s):

Machine Learning ◽

Drug Discovery ◽

Biological Activities ◽

Biological Effects ◽

Recursive Feature Elimination ◽

Drug Candidate ◽

Learning Models ◽

Machine Learning Models ◽

Non Parametric

In pharmaceutical research, the traditional drug discovery process is time consuming and expensive, as several compounds are experimentally tested for their biological activities. A series of lab experiments is conducted to analyze a newly synthesized drug’s pharmaceutical activities and its biological effects on humans. With every new drug discovery, the required clinical properties can be determined using machine learning models, which greatly reduces the experimental cost. This paper explores parametric and non-parametric machine learning models to classify the administration properties of drugs and their toxicity. The multinomial classification of drugs was based on their physicochemical and ADMET properties. Balanced data samples were drawn from ChEMBL and pre-processed. Features were reduced using Recursive Feature Elimination, and the attributes were ranked by importance to reduce highly correlated attributes. The performance of parametric and non-parametric machine learning models was analyzed on cheminformatic data that includes physicochemical, biological and pharmaceutical properties of the drug molecules. Selecting the potent drug candidate along with its administration properties greatly reduces wet lab experimental time and cost. Multiclass classification can be performed efficiently using a non-parametric machine learning model. Optimal feature engineering, hyperparameter tuning and the adoption of hybrid algorithms would result in more accurate predictions for cheminformatics data in the future.

Automated Retraining of Machine Learning Models

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.l3322.1081219 ◽

2019 ◽

Vol 8 (12) ◽

pp. 445-452

Keyword(s):

Machine Learning ◽

Input Data ◽

Research Work ◽

Learning Models ◽

Machine Learning Methods ◽

Machine Learning Model ◽

Crucial Component ◽

Conventional Machine ◽

Over Time ◽

Machine Learning Models

Data is the most crucial component of a successful ML system. Once a machine learning model is developed, it becomes obsolete over time because new input data is generated every second. To keep our predictions accurate, we need a way to keep our models up to date. Our research work involves finding a mechanism that can retrain the model with new data automatically. This research also involves exploring the possibilities of automating machine learning processes. We started this project by training and testing our model using conventional machine learning methods. The outcome was then compared with the outcome of experiments conducted using AutoML methods such as TPOT. This helped us find an efficient technique to retrain our models. These techniques can be used in areas where people do not deal with the actual workings of an ML model but only require the outputs of ML processes.
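One simple way to decide when a model has gone stale is to track its accuracy on recently labelled data and trigger a retrain when that accuracy drops. The sketch below illustrates that idea only; the `RetrainTrigger` class, its window size, and its accuracy threshold are hypothetical and are not the mechanism described in the paper.

```python
from collections import deque


class RetrainTrigger:
    """Signals a retrain when rolling accuracy on recent labelled data drops."""

    def __init__(self, window=100, min_accuracy=0.8):
        # Keep only the most recent `window` prediction outcomes.
        self.outcomes = deque(maxlen=window)
        self.min_accuracy = min_accuracy

    def record(self, predicted, actual):
        """Store whether the latest prediction matched the true label."""
        self.outcomes.append(predicted == actual)

    def rolling_accuracy(self):
        """Fraction of correct predictions in the window, or None if empty."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def should_retrain(self):
        """True once the window is full and accuracy is below the threshold."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.rolling_accuracy() < self.min_accuracy


# Illustrative stream of (predicted, actual) labels: the model degrades over time.
trigger = RetrainTrigger(window=4, min_accuracy=0.75)
stream = [(1, 1), (0, 0), (1, 1), (0, 0), (1, 0), (0, 1)]
for predicted, actual in stream:
    trigger.record(predicted, actual)
    if trigger.should_retrain():
        print("accuracy dropped to", trigger.rolling_accuracy(), "- retrain now")
```

A real pipeline would replace the `print` call with a job that refits the model on the newly collected data, for example by rerunning an AutoML search.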

TOPICAL ISSUES OF APPLICATION OF MACHINE LEARNING METHODS IN ECONOMY

Innovative Aspects of the Development of Science and Technology. Collection of Articles of the VIII International Scientific and Practical Conference: collection of articles, [electronic network publication] / Ed. by N.V. Emelyanov. Moscow: “KDU”, “Dobrosvet”, 2021. 149 p. ◽

10.31453/kdu.ru.978-5-7913-1176-4-2021-28-33 ◽

2021 ◽

Author(s):

Natalia Pavlovna Persteneva ◽

Darya Dmitrievn Skryleva ◽

Keyword(s):

Machine Learning ◽

Unsupervised Learning ◽

Supervised Learning ◽

Learning Model ◽

Learning Models ◽

Learning Methods ◽

Machine Learning Methods ◽

Machine Learning Model ◽

Popular Classes ◽

Machine Learning Models

The article discusses machine learning methods using the example of two popular classes: supervised learning and unsupervised learning. Variants of the main types of machine learning models for each method are presented. A generalized algorithm for building any machine learning model is formulated.

Improving Logging Prediction on Imbalanced Datasets

International Journal of Open Source Software and Processes ◽

10.4018/ijossp.2016040103 ◽

2016 ◽

Vol 7 (2) ◽

pp. 43-71 ◽

Cited By ~ 3

Author(s):

Sangeeta Lal ◽

Neetu Sardana ◽

Ashish Sureka

Keyword(s):

Machine Learning ◽

Open Source ◽

Class Imbalance ◽

Learning Model ◽

Learning Models ◽

Class Imbalance Problem ◽

Imbalanced Datasets ◽

Imbalance Problem ◽

Machine Learning Model ◽

Machine Learning Models

Logging is an important yet tough decision for OSS developers. Machine-learning models are useful in improving several steps of OSS development, including logging. Several recent studies propose machine-learning models to predict logged code constructs. The prediction performance of these models is limited by the class-imbalance problem, since the number of logged code constructs is small compared to the number of non-logged code constructs. No previous study analyzes the class-imbalance problem for logged code construct prediction. The authors first analyze the performance of J48, RF, and SVM classifiers for predicting logged code constructs in catch-blocks and if-blocks on imbalanced datasets. Second, the authors propose LogIm, an ensemble and threshold-based machine-learning model. Third, the authors evaluate the performance of LogIm on three open-source projects. On average, the LogIm model improves the performance of the baseline classifiers J48, RF, and SVM by 7.38%, 9.24%, and 4.6% for catch-blocks, and by 12.11%, 14.95%, and 19.13% for if-blocks logging prediction.
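The general idea behind an ensemble, threshold-based classifier for imbalanced data can be sketched as follows: average the positive-class scores of several classifiers, then choose the decision threshold that maximizes F1 on validation data instead of using the default 0.5. This is an illustrative assumption about the technique in general, not the LogIm algorithm itself; the function names, scores, and labels below are all hypothetical.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1); 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def ensemble_scores(score_lists):
    """Average per-sample positive-class scores across classifiers."""
    return [sum(scores) / len(scores) for scores in zip(*score_lists)]


def apply_threshold(scores, threshold):
    """Label a sample positive when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


def best_threshold(y_true, scores, candidates):
    """Candidate threshold with the highest F1 on the given labelled data."""
    return max(candidates,
               key=lambda th: f1_score(y_true, apply_threshold(scores, th)))


# Hypothetical validation set: 6 non-logged (0) and 2 logged (1) constructs.
labels = [0, 0, 0, 0, 0, 0, 1, 1]
# Hypothetical positive-class scores from two base classifiers.
scores_a = [0.1, 0.2, 0.1, 0.3, 0.2, 0.5, 0.4, 0.7]
scores_b = [0.1, 0.2, 0.1, 0.4, 0.2, 0.4, 0.5, 0.5]

averaged = ensemble_scores([scores_a, scores_b])
threshold = best_threshold(labels, averaged, [0.3, 0.4, 0.5])
print("chosen threshold:", threshold)
print("F1 at 0.5:", f1_score(labels, apply_threshold(averaged, 0.5)))
print("F1 at chosen:", f1_score(labels, apply_threshold(averaged, threshold)))
```

Lowering the threshold trades some precision for recall on the minority class, which is often the desired trade-off when positives are rare.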
