Abstract 4647: Identifying drug targets in sarcoma using machine learning and cell phenotype-based compound screening

Author(s):  
Eric J. Lachacz ◽  
Zhi Fen Wu ◽  
John L. Bixby ◽  
Vance P. Lemmon ◽  
Sofia D. Merajver ◽  
...  
2020 ◽  
Vol 27 ◽  
Author(s):  
Gabriela Bitencourt-Ferreira ◽  
Camila Rizzotto ◽  
Walter Filgueira de Azevedo Junior

Background: Analysis of atomic coordinates of protein-ligand complexes can provide three-dimensional data to generate computational models to evaluate binding affinity and thermodynamic state functions. Application of machine learning techniques can create models to assess protein-ligand potential energy and binding affinity. These methods show superior predictive performance when compared with classical scoring functions available in docking programs. Objective: Our purpose here is to review the development and application of the program SAnDReS. We describe the creation of machine learning models to assess the binding affinity of protein-ligand complexes. Method: SAnDReS implements machine learning methods available in the scikit-learn library. This program is available for download at https://github.com/azevedolab/sandres. SAnDReS uses crystallographic structures, binding data, and thermodynamic data to create targeted scoring functions. Results: Recent applications of the program SAnDReS to drug targets such as coagulation factor Xa, cyclin-dependent kinases, and HIV-1 protease created targeted scoring functions that predict inhibition of these proteins. These targeted models outperform classical scoring functions. Conclusion: Here, we reviewed the development of machine learning scoring functions to predict binding affinity through the application of the program SAnDReS. Our studies show the superior predictive performance of the SAnDReS-developed models when compared with classical scoring functions available in programs such as AutoDock4, Molegro Virtual Docker, and AutoDock Vina.


2021 ◽  
Author(s):  
Nigel Beaton ◽  
Yuehan Feng ◽  
Roland Bruderer ◽  
Adam Hendricks ◽  
Ghaith Hamza ◽  
...  

2020 ◽  
Author(s):  
Ben Geoffrey A S ◽  
Pavan Preetham Valluri ◽  
Akhil Sanker ◽  
Rafal Madaj ◽  
Host Antony Davidd ◽  
...  

<p>Network data is composed of nodes and edges. Machine learning/deep learning algorithms have been applied successfully to network data for node classification and link prediction in the area of social networks, where they are used to offer highly customized suggestions to social network users. Similarly, one can attempt the use of machine learning/deep learning algorithms on biological network data to generate predictions of scientific usefulness. In the present work, a compound-drug target interaction data set from BindingDB has been used to train machine learning/deep learning algorithms, which are used to predict the drug targets for any PubChem compound queried by the user. The user is required to input the PubChem Compound ID (CID) of the compound whose predicted biological activity is of interest, and the tool outputs the RCSB PDB IDs of the predicted drug targets. The tool also incorporates a feature to perform automated <i>In Silico</i> modelling for the compound and the predicted drug targets to uncover their protein-ligand interaction profiles. The program fetches the structures of the compound and the predicted drug targets, prepares them for molecular docking using standard AutoDock scripts that are part of MGLTools, performs molecular docking and protein-ligand interaction profiling of the targets and the compound, and stores the visualized results in the working folder of the user. The program is hosted, supported and maintained at the following GitHub repository:</p> <p><a href="https://github.com/bengeof/Compound2Drug">https://github.com/bengeof/Compound2Drug</a></p>


2015 ◽  
Vol 11 (12) ◽  
pp. 3362-3377 ◽  
Author(s):  
Vinay Randhawa ◽  
Anil Kumar Singh ◽  
Vishal Acharya

Network-based and cheminformatics approaches identify novel lead molecules for CXCR4, a key gene prioritized in oral cancer.


Molecules ◽  
2020 ◽  
Vol 25 (22) ◽  
pp. 5277
Author(s):  
Lauv Patel ◽  
Tripti Shukla ◽  
Xiuzhen Huang ◽  
David W. Ussery ◽  
Shanzhi Wang

Advances in information technology and related processing techniques have created a fertile base for progress in many scientific fields and industries. In the fields of drug discovery and development, machine learning techniques have been used for the development of novel drug candidates. The methods for designing drug targets and novel drug discovery now routinely combine machine learning and deep learning algorithms to enhance the efficiency, efficacy, and quality of developed outputs. The generation and incorporation of big data, through technologies such as high-throughput screening and high-throughput computational analysis of databases used for both lead and target discovery, has increased the reliability of techniques that incorporate machine learning and deep learning. The use of virtual screening and comprehensive online information has also been highlighted in developing lead synthesis pathways. In this review, machine learning and deep learning algorithms utilized in drug discovery and associated techniques will be discussed. The applications and methods that produce promising results will be reviewed.


2019 ◽  
Vol 37 (15_suppl) ◽  
pp. 5043-5043
Author(s):  
Andrew W Hahn ◽  
Edwin Lin ◽  
John Esther ◽  
Neysi Anderson ◽  
Nityam Rathi ◽  
...  

Background: mCRPC carries a poor prognosis, and targeted therapies have had minimal success in mCRPC. Novel genomic targets could improve drug development. To date, large ctDNA studies in metastatic prostate cancer have been descriptive with limited or no clinical annotation. Herein, we hypothesize that profiles of genomic alterations (GAs) in ctDNA not only differ significantly between mCRPC and mHSPC, but can also be used to predict mCRPC vs. mHSPC. These findings could help identify new drug targets for mCRPC treatment. Methods: Men with mHSPC or mCRPC who underwent NGS of ctDNA using G360 (Guardant Health Inc.) at the Huntsman Cancer Institute were included. Men were classified as mCRPC or mHSPC (patients with current or no prior ADT). G360 detects somatic mutations in selected exons of 73 genes, amplifications in 18 genes, and selected fusions in 6 genes. A two-sided Student's t-test was used to compare the %cfDNA and total GAs. The chi-squared test was used to compare the frequency of each GA. Machine learning (ML) algorithms were trained on GAs and benchmarked by cross-validated performance. GAs contributing to mCRPC vs. mHSPC classification were measured by ML feature importance (e.g. odds ratios, regression coefficients). Results: Of the 259 men included, 119 men had mHSPC and 140 had mCRPC. Men with mCRPC had more GAs (4.5 vs. 1.86, p<0.0001) and higher %cfDNA (9.56% vs. 5.02%, p=0.02). In mHSPC, there was no significant difference in the number of GAs or %cfDNA between men on ADT and those who had not yet started ADT. ML algorithms used GAs to predict mCRPC with 78.1% sensitivity, 64.0% specificity, 76.7% PPV, 65.1% NPV, and 70.3% overall accuracy. mCRPC was enriched with GAs in AR, ARID1A, BRAF, BRCA2, CCNE1, CTNNB1, EGFR, FGFR1, KIT, MET, MYC, PDGFRB, PIK3CA, and TP53. Of note, many of these genes are involved in MAP/ERK signaling. Conclusions: Men with mCRPC have more GAs, higher %cfDNA, and enrichment of GAs in the MAP/ERK pathway compared to men with mHSPC. The distinct GAs seen in mCRPC represent novel therapeutic targets, especially in the MAP/ERK pathway. We also show that machine learning can differentiate mHSPC and mCRPC based on GAs detected in ctDNA.
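
The five performance figures reported above (sensitivity, specificity, PPV, NPV, accuracy) all derive from one confusion matrix. The following is a minimal sketch of those standard definitions, treating mCRPC as the positive class. The class name `ConfusionCounts`, the function `classification_metrics`, and the counts in `example` are hypothetical and chosen for illustration only; they are not the study's data, and the abstract does not publish its per-fold confusion matrices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier where 'positive' means mCRPC."""
    tp: int  # mCRPC correctly predicted as mCRPC
    fp: int  # mHSPC wrongly predicted as mCRPC
    tn: int  # mHSPC correctly predicted as mHSPC
    fn: int  # mCRPC wrongly predicted as mHSPC


def classification_metrics(c: ConfusionCounts) -> dict:
    """Return the standard summary metrics for a 2x2 confusion matrix."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of true mCRPC detected
        "specificity": c.tn / (c.tn + c.fp),  # share of true mHSPC detected
        "ppv": c.tp / (c.tp + c.fp),          # precision of mCRPC calls
        "npv": c.tn / (c.tn + c.fn),          # precision of mHSPC calls
        "accuracy": (c.tp + c.tn) / total,    # share of all calls that are right
    }


# Hypothetical counts for illustration only; NOT taken from the study.
example = ConfusionCounts(tp=80, fp=20, tn=60, fn=40)
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Sensitivity and specificity are conditioned on the true class, while PPV and NPV are conditioned on the predicted class, so the latter two shift with the mCRPC/mHSPC mix of the cohort even when the classifier itself is unchanged.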


2021 ◽  
Author(s):  
Elisabetta Manduchi ◽  
Trang T. Le ◽  
Weixuan Fu ◽  
Jason H. Moore

Machine Learning (ML) approaches are increasingly being used in biomedical applications. Important challenges of ML include choosing the right algorithm and tuning the parameters for optimal performance. Automated ML (AutoML) methods, such as Tree-based Pipeline Optimization Tool (TPOT), have been developed to take some of the guesswork out of ML, thus making this technology available to users from more diverse backgrounds. The goals of this study were to assess applicability of TPOT to genomics and to identify combinations of single nucleotide polymorphisms (SNPs) associated with coronary artery disease (CAD), with a focus on genes with high likelihood of being good CAD drug targets. We leveraged public functional genomic resources to group SNPs into biologically meaningful sets to be selected by TPOT. We applied this strategy to data from the UK Biobank, detecting a strikingly recurrent signal stemming from a group of 28 SNPs. Importance analysis of these uncovered functional relevance of the top SNPs to genes whose association with CAD is supported in the literature and other resources. Furthermore, we employed game-theory-based metrics to study SNP contributions to individual-level TPOT predictions and discover distinct clusters of well-predicted CAD cases. The latter indicates a promising approach towards precision medicine.
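
The abstract does not name the game-theory-based metric it uses. Shapley values are one well-known attribution of that kind, so the sketch below assumes them purely for illustration. It computes exact Shapley values by averaging each feature's marginal contribution over every ordering of the features. The function `shapley_values`, the toy scoring function `toy_value`, and the feature names `snp_a`, `snp_b`, `snp_c` are all hypothetical and do not come from the study.

```python
from itertools import permutations


def shapley_values(features, value):
    """Exact Shapley values: mean marginal contribution over all orderings.

    `value` maps a frozenset of present features to a model score.
    The cost grows factorially, so this is only usable for a few features.
    """
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = frozenset()
        for f in order:
            with_f = present | {f}
            # Credit f with the change in score caused by adding it.
            totals[f] += value(with_f) - value(present)
            present = with_f
    return {f: t / len(orderings) for f, t in totals.items()}


def toy_value(present):
    """Hypothetical risk score: one additive SNP plus one interacting pair."""
    score = 0.0
    if "snp_a" in present:
        score += 0.2  # snp_a contributes on its own
    if "snp_b" in present and "snp_c" in present:
        score += 0.4  # snp_b and snp_c only matter together
    return score


phi = shapley_values(["snp_a", "snp_b", "snp_c"], toy_value)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.3f}")
```

In this toy case the interacting pair splits its joint effect equally, and the three contributions sum to the score of the full feature set minus the score of the empty set. That additivity is what makes per-individual attributions comparable across cases.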


2020 ◽  
Author(s):  
Claire Molony ◽  
Damien King ◽  
Mariana Di Luca ◽  
Abidemi Olayinka ◽  
Roya Hakimjavadi ◽  
...  

A hallmark of subclinical atherosclerosis is the accumulation of vascular smooth muscle cell (SMC)-like cells leading to intimal thickening and lesion formation. While medial SMCs contribute to vascular lesions, the involvement of resident vascular stem cells (vSCs) remains unclear. We evaluated single cell photonics as a discriminator of cell phenotype in vitro before the presence of vSC within vascular lesions was assessed ex vivo using supervised machine learning and further validated using lineage tracing analysis. Using a novel lab-on-a-Disk (Load) platform, label-free single cell photonic emissions from normal and injured vessels ex vivo were interrogated and compared to freshly isolated aortic SMCs, cultured Movas SMCs, macrophages, B-cells, S100β+ mVSc, bone marrow derived mesenchymal stem cells (MSC) and their respective myogenic progeny across five broadband light wavelengths (λ465 - λ670 ± 20 nm). We found that profiles were of sufficient coverage, specificity, and quality to clearly distinguish medial SMCs from different vascular beds (carotid vs aorta), discriminate normal carotid medial SMCs from lesional SMC-like cells ex vivo following flow restriction, and identify SMC differentiation of a series of multipotent stem cells following treatment with transforming growth factor beta 1 (TGF-β1), the Notch ligand Jagged1, and Sonic Hedgehog using multivariate analysis, in part, due to photonic emissions from enhanced collagen III and elastin expression. Supervised machine learning supported genetic lineage tracing analysis of S100β+ vSCs and identified the presence of S100β+ vSC-derived myogenic progeny within vascular lesions. We conclude disease-relevant photonic signatures may have predictive value for vascular disease.

