Abstract 21: LiP-MS, a machine learning-based chemoproteomic approach to identify drug targets in complex proteomes

Background: Analysis of atomic coordinates of protein-ligand complexes can provide three-dimensional data to generate computational models to evaluate binding affinity and thermodynamic state functions. Application of machine learning techniques can create models to assess protein-ligand potential energy and binding affinity. These methods show superior predictive performance when compared with classical scoring functions available in docking programs. Objective: Our purpose here is to review the development and application of the program SAnDReS. We describe the creation of machine learning models to assess the binding affinity of protein-ligand complexes. Method: SAnDReS implements machine learning methods available in the scikit-learn library. This program is available for download at https://github.com/azevedolab/sandres. SAnDReS uses crystallographic structures, binding, and thermodynamic data to create targeted scoring functions. Results: Recent applications of the program SAnDReS to drug targets such as Coagulation factor Xa, cyclin-dependent kinases, and HIV-1 protease were able to create targeted scoring functions to predict inhibition of these proteins. These targeted models outperform classical scoring functions. Conclusion: Here, we reviewed the development of machine learning scoring functions to predict binding affinity through the application of the program SAnDReS. Our studies show the superior predictive performance of the SAnDReS-developed models when compared with classical scoring functions available in the programs such as AutoDock4, Molegro Virtual Docker, and AutoDock Vina.
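
The idea of a targeted scoring function described in this abstract can be illustrated with a minimal sketch. It is not SAnDReS code and does not use scikit-learn: it fits a plain least-squares linear model in standard-library Python so the example stays self-contained. The two energy terms, their values, and the "true" weights are hypothetical toy data built from known coefficients so that the fit can be checked.

```python
def fit_linear(X, y):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    rows = [[1.0] + list(r) for r in X]  # prepend an intercept column
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(n):
        p = max(range(c, n), key=lambda k: abs(A[k][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for k in range(n):
            if k != c:
                factor = A[k][c]
                A[k] = [vk - factor * vc for vk, vc in zip(A[k], A[c])]
    return [A[i][n] for i in range(n)]  # [intercept, w1, w2, ...]


def predict(w, X):
    """Apply the fitted weights to new feature rows."""
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], r)) for r in X]


# Hypothetical energy terms per complex: (van der Waals, hydrogen bond)
X = [(-8.0, -2.0), (-6.5, -1.0), (-9.2, -3.5),
     (-5.0, -0.5), (-7.1, -2.8), (-4.2, -1.9)]
# Toy affinities generated from known weights (assumption, not real data)
y = [2.0 - 0.5 * a - 0.3 * b for a, b in X]

weights = fit_linear(X, y)
predicted = predict(weights, X)
```

Because the toy affinities are noise-free, the fit recovers the generating weights. With real crystallographic and binding data the weights would be specific to one protein system, which is the sense in which such a function is "targeted".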

Machine Learning for Prediction of Drug Targets in Microbe Associated Cardiovascular Diseases by Incorporating Host‐pathogen Interaction Network Parameters

Molecular Informatics ◽

10.1002/minf.202100115 ◽

2021 ◽

pp. 2100115

Author(s):

Nirupma Singh ◽

Sonika Bhatnagar

Keyword(s):

Machine Learning ◽

Cardiovascular Diseases ◽

Drug Targets ◽

Interaction Network ◽

Host Pathogen Interaction ◽

Network Parameters ◽

Host Pathogen

Compound2Drug – a Machine/deep Learning Tool for Predicting the Bioactivity of PubChem Compounds

10.26434/chemrxiv.13052951 ◽

2020 ◽

Author(s):

Ben Geoffrey A S ◽

Pavan Preetham Valluri ◽

Akhil Sanker ◽

Rafal Madaj ◽

Host Antony Davidd ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Molecular Docking ◽

Drug Target ◽

Drug Targets ◽

Learning Algorithms ◽

Network Data ◽

Ligand Interaction ◽

Pubchem Compound ◽

Protein Ligand Interaction

Network data is composed of nodes and edges. Machine learning/deep learning algorithms have been applied successfully to network data for node classification and link prediction in social networks, where they are used to offer highly customized suggestions to users. Similarly, one can apply machine learning/deep learning algorithms to biological network data to generate scientifically useful predictions. In the present work, a compound-drug target interaction data set from BindingDB has been used to train machine learning/deep learning algorithms, which are then used to predict the drug targets for any PubChem compound queried by the user. The user inputs the PubChem Compound ID (CID) of the compound of interest, and the tool outputs the RCSB PDB IDs of the predicted drug targets. The tool also incorporates a feature to perform automated in silico modelling for the compound and the predicted drug targets to uncover their protein-ligand interaction profiles. The program fetches the structures of the compound and the predicted drug targets, prepares them for molecular docking using the standard AutoDock scripts that are part of MGLTools, performs molecular docking and protein-ligand interaction profiling of the targets and the compound, and stores the visualized results in the user's working folder. The program is hosted, supported and maintained at the following GitHub repository: https://github.com/bengeof/Compound2Drug

The Application of Machine Learning Techniques in Protein Drugs and Drug Targets Recognition

Current Drug Metabolism ◽

10.2174/138920022003190424105144 ◽

2019 ◽

Vol 20 (3) ◽

pp. 168-169

Author(s):

Hui Ding

Keyword(s):

Machine Learning ◽

Drug Targets ◽

Machine Learning Techniques ◽

Protein Drugs ◽

Learning Techniques

A systematic approach to prioritize drug targets using machine learning, a molecular descriptor-based classification model, and high-throughput screening of plant derived molecules: a case study in oral cancer

Molecular BioSystems ◽

10.1039/c5mb00468c ◽

2015 ◽

Vol 11 (12) ◽

pp. 3362-3377 ◽

Cited By ~ 4

Author(s):

Vinay Randhawa ◽

Anil Kumar Singh ◽

Vishal Acharya

Keyword(s):

Machine Learning ◽

Oral Cancer ◽

High Throughput ◽

High Throughput Screening ◽

Drug Targets ◽

Systematic Approach ◽

Molecular Descriptor ◽

Classification Model

Network-based and cheminformatics approaches identify novel lead molecules for CXCR4, a key gene prioritized in oral cancer.

Abstract 4647: Identifying drug targets in sarcoma using machine learning and cell phenotype-based compound screening

10.1158/1538-7445.am2018-4647 ◽

2018 ◽

Author(s):

Eric J. Lachacz ◽

Zhi Fen Wu ◽

John L. Bixby ◽

Vance P. Lemmon ◽

Sofia D. Merajver ◽

...

Keyword(s):

Machine Learning ◽

Drug Targets ◽

Cell Phenotype ◽

Compound Screening

Machine Learning Methods in Drug Discovery

Molecules ◽

10.3390/molecules25225277 ◽

2020 ◽

Vol 25 (22) ◽

pp. 5277

Author(s):

Lauv Patel ◽

Tripti Shukla ◽

Xiuzhen Huang ◽

David W. Ussery ◽

Shanzhi Wang

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Drug Discovery ◽

High Throughput Screening ◽

Drug Targets ◽

Learning Algorithms ◽

Machine Learning Techniques ◽

Online Information ◽

Drug Candidates ◽

Novel Drug

Advances in information technology and related processing techniques have created a fertile base for progress in many scientific fields and industries. In drug discovery and development, machine learning techniques have been used to develop novel drug candidates. Methods for designing drug targets and discovering novel drugs now routinely combine machine learning and deep learning algorithms to enhance the efficiency, efficacy, and quality of developed outputs. The generation and incorporation of big data, through technologies such as high-throughput screening and high-throughput computational analysis of databases used for both lead and target discovery, has increased the reliability of techniques that incorporate machine learning and deep learning. The use of virtual screening and encompassing online information has also been highlighted in developing lead synthesis pathways. In this review, machine learning and deep learning algorithms utilized in drug discovery and associated techniques will be discussed. The applications that produce promising results and methods will be reviewed.

Genomic landscape of metastatic hormone sensitive prostate cancer (mHSPC) vs. metastatic castration-refractory prostate cancer (mCRPC) by circulating tumor DNA (ctDNA).

Journal of Clinical Oncology ◽

10.1200/jco.2019.37.15_suppl.5043 ◽

2019 ◽

Vol 37 (15_suppl) ◽

pp. 5043-5043

Author(s):

Andrew W Hahn ◽

Edwin Lin ◽

John Esther ◽

Neysi Anderson ◽

Nityam Rathi ◽

...

Keyword(s):

Prostate Cancer ◽

Machine Learning ◽

Drug Targets ◽

Circulating Tumor Dna ◽

Erk Pathway ◽

Genomic Landscape ◽

Significant Difference ◽

Chi Squared ◽

Clinical Annotation ◽

Hormone Sensitive

Background: mCRPC carries a poor prognosis, and targeted therapies have had minimal success in mCRPC. Novel genomic targets could improve drug development. To date, large ctDNA studies in metastatic prostate cancer have been descriptive with limited or no clinical annotation. Herein, we hypothesize that profiles of genomic alterations (GAs) in ctDNA not only differ significantly between mCRPC and mHSPC, but can also be used to predict mCRPC vs. mHSPC. These findings could help identify new drug targets for mCRPC treatment. Methods: Men with mHSPC or mCRPC who underwent NGS of ctDNA using G360 (Guardant Health Inc.) at the Huntsman Cancer Institute were included. Men were classified as mCRPC or mHSPC (patients with current or no prior ADT). G360 detects somatic mutations in selected exons of 73 genes, amplifications in 18 genes, and selected fusions in 6 genes. A two-sided Student's t-test was used to compare the %cfDNA and total GAs. The chi-squared test was used to compare the frequency of each GA. Machine learning (ML) algorithms were trained on GAs and benchmarked by cross-validated performance. GAs contributing to mCRPC vs. mHSPC classification were measured by ML feature importance (e.g. odds ratios, regression coefficients). Results: Of the 259 men included, 119 men had mHSPC and 140 had mCRPC. Men with mCRPC had more GAs (4.5 vs. 1.86, p<0.0001) and higher %cfDNA (9.56% vs. 5.02%, p=0.02). In mHSPC, there was no significant difference in the number of GAs or %cfDNA between men on ADT and those who hadn’t yet started ADT. ML algorithms used GAs to predict mCRPC with 78.1% sensitivity, 64.0% specificity, 76.7% PPV, 65.1% NPV, and 70.3% overall accuracy. mCRPC was enriched with GAs in AR, ARID1A, BRAF, BRCA2, CCNE1, CTNNB1, EGFR, FGFR1, KIT, MET, MYC, PDGFRB, PIK3CA, and TP53. Of note, many of these genes are involved in MAP/ERK signaling. Conclusions: Men with mCRPC have more GAs, higher %cfDNA, and enrichment of GAs in the MAP/ERK pathway compared to men with mHSPC. The distinct GAs seen in mCRPC represent novel therapeutic targets, especially in the MAP/ERK pathway. We also show that machine learning can differentiate mHSPC and mCRPC based on GAs detected in ctDNA.
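
For readers unfamiliar with the performance measures quoted in this abstract, the sketch below shows how sensitivity, specificity, PPV, NPV, and accuracy follow from a binary confusion matrix. The counts are hypothetical and chosen for easy arithmetic; they are not the study's counts, which the abstract does not report and which were cross-validated.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: positive cases (here, mCRPC) predicted correctly / incorrectly
    tn/fp: negative cases (here, mHSPC) predicted correctly / incorrectly
    """
    return {
        "sensitivity": tp / (tp + fn),   # fraction of true positives found
        "specificity": tn / (tn + fp),   # fraction of true negatives found
        "ppv": tp / (tp + fp),           # precision of positive calls
        "npv": tn / (tn + fn),           # precision of negative calls
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts (assumption): 100 positive and 100 negative cases
metrics = classification_metrics(tp=80, fn=20, tn=60, fp=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that PPV and NPV depend on the class balance of the cohort, whereas sensitivity and specificity do not, so the former pair does not transfer directly to populations with a different proportion of mCRPC.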

Genetic analysis of coronary artery disease using tree-based automated machine learning informed by biology-based feature selection

10.1101/2021.03.23.436652 ◽

2021 ◽

Author(s):

Elisabetta Manduchi ◽

Trang T. Le ◽

Weixuan Fu ◽

Jason H. Moore

Keyword(s):

Machine Learning ◽

Coronary Artery Disease ◽

Coronary Artery ◽

Drug Targets ◽

Nucleotide Polymorphisms ◽

Individual Level ◽

Importance Analysis ◽

Artery Disease ◽

The Uk ◽

The Right

Abstract: Machine Learning (ML) approaches are increasingly being used in biomedical applications. Important challenges of ML include choosing the right algorithm and tuning the parameters for optimal performance. Automated ML (AutoML) methods, such as Tree-based Pipeline Optimization Tool (TPOT), have been developed to take some of the guesswork out of ML thus making this technology available to users from more diverse backgrounds. The goals of this study were to assess applicability of TPOT to genomics and to identify combinations of single nucleotide polymorphisms (SNPs) associated with coronary artery disease (CAD), with a focus on genes with high likelihood of being good CAD drug targets. We leveraged public functional genomic resources to group SNPs into biologically meaningful sets to be selected by TPOT. We applied this strategy to data from the UK Biobank, detecting a strikingly recurrent signal stemming from a group of 28 SNPs. Importance analysis of these uncovered functional relevance of the top SNPs to genes whose association with CAD is supported in the literature and other resources. Furthermore, we employed game-theory based metrics to study SNP contributions to individual level TPOT predictions and discover distinct clusters of well-predicted CAD cases. The latter indicates a promising approach towards precision medicine.
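
The abstract does not name the game-theory based metric it uses. Shapley values are one common choice for attributing a single prediction to individual features, so the sketch below assumes that reading. The SNP names and the toy value function are hypothetical; the exact enumeration shown is only practical for a handful of features.

```python
from itertools import permutations


def shapley_values(features, value):
    """Exact Shapley values: average each feature's marginal contribution
    over every ordering in which features can be added to the coalition."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        seen = frozenset()
        for f in order:
            with_f = seen | {f}
            totals[f] += value(with_f) - value(seen)
            seen = with_f
    return {f: t / len(orderings) for f, t in totals.items()}


def toy_risk(snps):
    """Hypothetical model output for one individual given a subset of SNPs.

    snp_a and snp_b each add risk and also interact; snp_c has no effect.
    """
    score = 0.0
    if "snp_a" in snps:
        score += 0.1
    if "snp_b" in snps:
        score += 0.2
    if "snp_a" in snps and "snp_b" in snps:
        score += 0.3  # interaction term, split equally by Shapley values
    return score


contributions = shapley_values(["snp_a", "snp_b", "snp_c"], toy_risk)
```

In this toy case the interaction term is shared equally between the two interacting SNPs, and the contributions sum to the full model output, which is the property that makes such per-individual attributions comparable across cases.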

Machine learning prediction of oncology drug targets based on protein and network properties

10.21203/rs.2.15798/v1 ◽

2019 ◽

Author(s):

Zoltan Dezso ◽

Michele Ceccarelli

Keyword(s):

Machine Learning ◽

Clinical Trial ◽

Drug Target ◽

Drug Targets ◽

Validation Dataset ◽

Learning Approach ◽

Biological Functions ◽

Machine Learning Approach ◽

Network Properties ◽

Trial Drug

Background: The selection and prioritization of drug targets is a central problem in drug discovery. Computational approaches can leverage the growing number of large-scale human genomics and proteomics data sets to perform in silico target identification, reducing the cost and the time needed. Results: We developed a machine learning approach that scores proteins to generate a druggability score for novel targets. In our model we incorporated 70 protein features, which included properties derived from the sequence, features characterizing protein functions, and network properties derived from the protein-protein interaction network. The advantage of this approach is that it is unbiased: even less studied proteins with limited information about their function can score well, as most of the features are independent of the accumulated literature. We built models on a training set consisting of targets with approved drugs and a negative set of non-drug targets. The machine learning techniques help to identify the most important combination of features differentiating validated targets from non-targets. We validated our predictions on an independent set of clinical trial drug targets, achieving a high accuracy characterized by an AUC of 0.89. Our most predictive features included biological function of proteins, network centrality measures, protein essentiality, tissue specificity, localization and solvent accessibility. Our predictions, based on a small set of 102 validated oncology targets, recovered the majority of known drug targets and identified a novel set of proteins as drug target candidates. Conclusions: We developed a machine learning approach to prioritize proteins according to their similarity to approved drug targets. We have shown that the proposed method is highly predictive on a validation dataset consisting of 277 targets of clinical trial drugs, confirming that our computational approach is an efficient and cost-effective tool for drug target discovery and prioritization. Our predictions were based on oncology targets and cancer-relevant biological functions, resulting in significantly higher scores for targets of oncology clinical trial drugs compared to the scores of targets of trial drugs for other indications. Our approach can be used to make indication-specific drug-target predictions by combining generic druggability features with indication-specific biological functions.
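
The AUC reported in this abstract can be read as the probability that a randomly chosen true target receives a higher score than a randomly chosen non-target. The sketch below computes it directly from that pairwise definition. The druggability scores are hypothetical and are not taken from the study.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve from the pairwise (Mann-Whitney) definition:
    the fraction of positive/negative pairs ranked correctly, counting ties
    as half a correct ranking."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical druggability scores (assumption)
target_scores = [0.9, 0.8, 0.4]      # known drug targets
non_target_scores = [0.7, 0.3, 0.2]  # proteins assumed to be non-targets

auc = roc_auc(target_scores, non_target_scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of proteins; for large validation sets a rank-based implementation gives the same value more efficiently.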
