Selecting Machine-Learning Scoring Functions for Structure-Based Virtual Screening

2020
Author(s):
Pedro Ballester

Interest in docking technologies has grown in parallel with the ever-increasing number and diversity of 3D models for macromolecular therapeutic targets. Structure-Based Virtual Screening (SBVS) aims at leveraging these experimental structures to discover the necessary starting points for the drug discovery process. It is now established that Machine Learning (ML) can strongly enhance the predictive accuracy of scoring functions for SBVS by exploiting large datasets from targets, molecules and their associations. However, with greater choice, the question of which ML-based scoring function is the most suitable for prospective use on a given target has gained importance. Here we analyse two approaches to selecting an existing scoring function for the target, along with a third approach consisting of generating a scoring function tailored to the target. These analyses required discussing the limitations of popular SBVS benchmarks, the alternatives for benchmarking scoring functions for SBVS, and how to generate or use such scoring functions with freely available software.


2018
Vol 11 (3)
pp. 1513-1519
Author(s):
R. Ani
Roshini Manohar
Gayathri Anil
O.S. Deepa

In earlier years, identifying and developing a drug took many years: on average, a drug takes 12 years to travel from the research lab to the patient. With the introduction of machine learning in drug discovery, the whole process has become simpler. The use of computational tools in the early stages of drug development has expanded in recent decades. One computational procedure carried out in the drug discovery process is Virtual Screening (VS), which is used to identify compounds that can bind to a drug target. A preliminary step before analyzing the binding of a ligand to its protein target is predicting the drug-likeness of compounds. The main objective of this study is to predict the drug-likeness of compounds from molecular descriptor information using tree-based ensembles. Several classification algorithms are analyzed and their accuracy in predicting drug-likeness is calculated. The study shows that Rotation Forest outperforms the other classification algorithms in predicting the drug-likeness of chemical compounds. The measured accuracies of Rotation Forest, Random Forest, Support Vector Machines, KNN, Decision Tree and Naïve Bayes are 98%, 97%, 94.8%, 92.8%, 91.4% and 89.5%, respectively.


Author(s):  
Gurusamy Mariappan
Anju Kumari

Virtual screening (VS) plays an important role in the modern drug discovery process. Pharmaceutical companies invest huge amounts of money and time in drug discovery and screening. However, several molecules fail at the final stage of clinical trials, which results in a large financial loss. To overcome this, virtual screening tools with high predictive power were developed. Virtual screening is not restricted to small molecules but also applies to macromolecules such as proteins, enzymes, and receptors. This gives an insight into structure-based and ligand-based drug design. VS gives reliable information to direct the process of drug discovery (e.g., when the 3D structure of the receptor is known, structure-based drug design is recommended). The pharmacophore-based model is advisable when information about the receptor or any macromolecule is unavailable. In this context, ADME parameters such as log P and bioavailability, as well as QSAR, can be used as filters. This chapter presents both models with various representative examples that help scientists use computational screening tools in modern drug discovery processes.


Cells
2021
Vol 10 (11)
pp. 3139
Author(s):
Fazileh Esmaeili
Tahmineh Lohrasebi
Manijeh Mohammadi-Dehcheshmeh
Esmaeil Ebrahimie

Predicting cancer cells’ response to a plant-derived agent is critical for the drug discovery process. Recent advances in transcriptomics have provided an opportunity to identify regulatory signatures that predict drug activity. In this study, a combination of meta-analysis and machine learning models was used to determine regulatory signatures, focusing on transcription factors (TFs) that are differentially expressed in cancer cells in response to herbal components. To increase the size of the dataset, six datasets were combined in a meta-analysis from studies that had evaluated gene expression in cancer cell lines before and after herbal extract treatments. Then, categorical feature analysis based on machine learning methods was applied to examine transcription factors in order to find the best signature/pattern capable of discriminating between control and treated groups. It was found that this integrative approach could recognize combinations of TFs as predictive biomarkers. The random forest (RF) model produced the best combination rules, including AIP/TFE3/VGLL4/ID1 and AIP/ZNF7/DXO, with the highest modulating capacity. As the RF algorithm combines the output of many trees to build its final model, its predictive rules are more accurate and reproducible than those of other tree-based models. The discovered regulatory signature suggests an effective procedure for assessing the efficacy of investigational herbal compounds on particular cells in the drug discovery process.


2022
Vol 15 (1)
pp. 63
Author(s):
Natarajan Arul Murugan
Artur Podobas
Davide Gadioli
Emanuele Vitali
Gianluca Palermo
...  

Drug discovery is the most expensive, time-demanding, and challenging project in biopharmaceutical companies; it aims at the identification and optimization of lead compounds from large chemical libraries. The lead compounds should have high-affinity binding and specificity for a target associated with a disease and, in addition, favorable pharmacodynamic and pharmacokinetic properties (grouped as ADMET properties). Overall, drug discovery is a multivariable optimization problem that can be carried out on supercomputers using a reliable scoring function, which is a measure of the binding affinity or inhibition potential of the drug-like compound. The major problem is that the number of compounds in chemical space is huge, making computational drug discovery very demanding. However, it is cheaper and less time-consuming than experimental high-throughput screening. As the problem is to find the most stable (global) minima for numerous protein–ligand complexes (on the order of 10^6 to 10^12), parallel implementations of in silico virtual screening can be exploited to complete drug discovery in an affordable time. In this review, we discuss such implementations of parallelization algorithms in virtual screening programs. The nature of different scoring functions and search algorithms is discussed, together with a performance analysis of several docking programs ported to high-performance computing architectures.
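The parallelization idea in this abstract can be illustrated with a minimal sketch. It assumes that each ligand can be scored independently of the others, so the library can be split across a pool of workers. The names `toy_score` and `screen`, the SMILES-like strings, and the scoring formula are all hypothetical placeholders and do not come from the review or from any docking program.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_score(ligand: str) -> float:
    """Hypothetical stand-in for a docking score (lower is better).

    A real scoring function would evaluate a protein-ligand pose;
    this formula is purely illustrative.
    """
    return -0.1 * len(ligand) - 0.5 * ligand.count("N")


def screen(ligands, score_fn=toy_score, workers=4, top_k=3):
    """Score every ligand independently and return the top_k lowest-scoring ones."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so scores line up with ligands.
        scores = list(pool.map(score_fn, ligands))
    ranked = sorted(zip(ligands, scores), key=lambda pair: pair[1])
    return ranked[:top_k]


if __name__ == "__main__":
    # Toy library of SMILES-like strings (illustrative only).
    library = ["CCO", "CCN", "c1ccccc1", "CC(=O)N", "NCCN"]
    for ligand, score in screen(library):
        print(ligand, round(score, 2))
```

A thread pool is used here only to keep the sketch short. For CPU-bound scoring written in pure Python, a process pool or calls to an external docking engine would be needed to obtain a real speed-up.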


Author(s):  
Diana M. Herrera-Ibatá

Recently, different authors have reported Perturbation Theory (PT) methods combined with machine learning (ML) to obtain PTML (PT + ML) models. They have applied PTML models to the study of different biological systems. Here we present a state-of-the-art review of the different applications of PTML models in Organic Synthesis, Medicinal Chemistry, Protein Research, and Technology. The aim of the models is to find relations between molecular descriptors and biological characteristics in order to predict key properties of new compounds. One area where ML has been very useful is the drug discovery process. The entire process of drug discovery generates large amounts of data, and it is also costly and time-consuming. ML offers the opportunity to analyze large amounts of chemical data and to identify potential drug candidates.


Author(s):  
Suresh Kumar
Samiyara Begum
Hemant Kumar Srivastava

Computational techniques are important in the field of drug discovery. These techniques are generally categorized into two classes, namely ‘structure-based’ and ‘ligand-based’ methods. The present review discusses the theory of the most important methods, recent successful applications, pharmacophore modeling and quantitative structure-activity relationship (QSAR) studies. A brief introduction to molecular docking methods and their development and applications in the drug discovery process is also included. Basic theories and fundamental techniques, including sampling algorithms and scoring functions, are discussed.

