TarDict: A RandomForestClassifier based software predicts drug-target interaction using SMILES

2021 ◽  
pp. bi202101
Author(s):  
Peter Habib ◽  
Alsamman Alsamman ◽  
Sameh Hassanein ◽  
Aladdin Hamwieh

The future of therapeutics depends on understanding the interaction between the chemical structure of a drug and the target protein that contributes to the etiology of the disease. Predicting the targets of investigational drugs from data on already characterized drugs is important both for understanding drug and molecular interactions and for developing new drugs. Using machine learning and published drug information, we designed an easy-to-use tool that predicts biological target proteins for medical drugs. TarDict is based on the simplified molecular-input line-entry system (SMILES), a line notation for chemical structures. It receives SMILES entries and returns a list of possible similar drugs as well as possible drug targets. TarDict uses 20442 drug entries that have well-known biological targets to construct a predictive computational model capable of predicting novel drug targets with an accuracy of 95%. We developed a machine learning approach to recommend target proteins to approved drug targets. We have shown that the proposed method is highly predictive on a testing dataset consisting of 4088 targets and 102 manually entered drugs. The proposed computational model is an efficient and cost-effective tool for drug target discovery and prioritization. Such a tool could be used to enhance drug design, predict potential targets, and identify combination therapy crossroads.
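The "similar drugs" lookup described above can be illustrated with a toy sketch. This is not TarDict's method: the abstract names a RandomForestClassifier but does not specify its features, so the character-bigram fingerprint, the Tanimoto ranking, the three-entry reference table, and all function names below are assumptions made purely for illustration. The drug-target pairs are textbook associations, not TarDict data.

```python
# Toy sketch (NOT the TarDict implementation): rank reference drugs by the
# similarity of their SMILES strings to a query and report their known targets.

def smiles_ngrams(smiles, n=2):
    """Return the set of character n-grams of a SMILES string (a crude fingerprint)."""
    if len(smiles) < n:
        return {smiles}
    return {smiles[i:i + n] for i in range(len(smiles) - n + 1)}


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of n-grams."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical reference table: SMILES -> (drug name, one known target gene).
REFERENCE = {
    "CC(=O)OC1=CC=CC=C1C(=O)O": ("aspirin", "PTGS1"),
    "CC(C)CC1=CC=C(C=C1)C(C)C(=O)O": ("ibuprofen", "PTGS2"),
    "CN1C=NC2=C1C(=O)N(C(=O)N2C)C": ("caffeine", "ADORA2A"),
}


def rank_similar(query, reference=REFERENCE, top_k=2):
    """Return the top_k (similarity, drug, target) tuples for a query SMILES."""
    q = smiles_ngrams(query)
    scored = [
        (tanimoto(q, smiles_ngrams(smiles)), name, target)
        for smiles, (name, target) in reference.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    for score, name, target in rank_similar("CC(=O)OC1=CC=CC=C1C(=O)O"):
        print(f"{name}\t{target}\t{score:.2f}")
```

A real pipeline would more likely compute chemically meaningful fingerprints with a cheminformatics toolkit and feed them to a trained classifier; the string-level similarity here only shows the shape of the input and output.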

2020 ◽  
Vol 21 (10) ◽  
pp. 790-803 ◽  
Author(s):  
Dongrui Gao ◽  
Qingyuan Chen ◽  
Yuanqi Zeng ◽  
Meng Jiang ◽  
Yongqing Zhang

Drug target discovery is a critical step in drug development. It is the basis of modern drug development because it determines the target molecules related to specific diseases in advance. Predicting drug targets by computational methods saves a great deal of financial and material resources compared to in vitro experiments. Therefore, several computational methods for drug target discovery have been designed. Recently, machine learning (ML) methods in biomedicine have developed rapidly. In this paper, we present an overview of drug target discovery methods based on machine learning. Considering that some machine learning methods integrate network analysis to predict drug targets, network-based methods are also introduced in this article. Finally, the challenges and future outlook of drug target discovery are discussed.


2020 ◽  
Vol 36 (16) ◽  
pp. 4490-4497
Author(s):  
Siqi Liang ◽  
Haiyuan Yu

Abstract. Motivation: In silico drug target prediction provides valuable information for drug repurposing, understanding of side effects as well as expansion of the druggable genome. In particular, discovery of actionable drug targets is critical to developing targeted therapies for diseases. Results: Here, we develop a robust method for drug target prediction by leveraging a class imbalance-tolerant machine learning framework with a novel training scheme. We incorporate novel features, including drug–gene phenotype similarity and gene expression profile similarity, that capture information orthogonal to other features. We show that our classifier achieves robust performance and is able to predict gene targets for new drugs as well as drugs that potentially target unexplored genes. By providing newly predicted drug–target associations, we uncover novel opportunities of drug repurposing that may benefit cancer treatment through action on either known drug targets or currently undrugged genes. Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Author(s):  
Siqi Liang ◽  
Haiyuan Yu

Abstract. In silico drug target prediction provides valuable information for drug repurposing, understanding of side effects as well as expansion of the druggable genome. In particular, discovery of actionable drug targets is critical to developing targeted therapies for diseases. Here, we develop a robust method for drug target prediction by leveraging a class imbalance-tolerant machine learning framework with a novel training scheme. We incorporate novel features, including drug-gene phenotype similarity and gene expression profile similarity, that capture information orthogonal to other features. We show that our classifier achieves robust performance and is able to predict gene targets for new drugs as well as drugs that target unexplored genes. By providing newly predicted drug-target associations, we uncover novel opportunities of drug repurposing that may benefit cancer treatment through action on either known drug targets or currently undrugged genes.


2021 ◽  
Vol 15 ◽  
pp. 117793222110091
Author(s):  
Badreddine Nouadi ◽  
Abdelkarim Ezaouine ◽  
Mariame El Messal ◽  
Mohamed Blaghen ◽  
Faiza Bennis ◽  
...  

The emerging pathogen SARS-CoV-2, which causes coronavirus disease 2019 (COVID-19), is a global public health challenge. To date, COVID-19 has affected more than 40 million people worldwide. The exploration and development of new bioactive compounds with cost-effective and specific anti-COVID-19 therapeutic power is the prime focus of current medical research. Molecular docking has thus become essential in the discovery and development of new drugs, to better understand drug-target interactions in their original environment. This work studies, through molecular docking, the binding affinity and the type of interactions between 54 compounds from Moroccan medicinal plants, dextran sulfate and heparin (compounds not derived from medicinal plants), and 3CLpro-SARS-CoV-2, ACE2, and the post-fusion core of the 2019-nCoV S2 subunit. The PDB files of the target proteins and the prepared herbal compounds (ligands) were submitted for docking to AutoDock Vina using UCSF Chimera, which provides a list of potential complexes based on the shape complementarity of the natural compounds and their binding affinities. The docking results revealed that Taxol, Rutin, Genkwanine, and Luteolin-glucoside have a high affinity for ACE2 and 3CLpro. Therefore, these natural compounds can have two effects at once: inhibiting 3CLpro and preventing recognition between the virus and ACE2. These compounds may have a potential therapeutic effect against SARS-CoV-2 and may therefore serve as natural anti-COVID-19 compounds.


2020 ◽  
Author(s):  
Ben Geoffrey A S ◽  
Pavan Preetham Valluri ◽  
Akhil Sanker ◽  
Rafal Madaj ◽  
Host Antony Davidd ◽  
...  

Network data is composed of nodes and edges. Machine learning and deep learning algorithms have been applied successfully to network data for node classification and link prediction in social networks, where they are used to offer highly customized suggestions to users. Similarly, one can apply machine learning and deep learning algorithms to biological network data to generate scientifically useful predictions. In the present work, a compound-drug target interaction data set from BindingDB has been used to train machine learning and deep learning algorithms that predict the drug targets for any PubChem compound queried by the user. The user inputs the PubChem Compound ID (CID) of the compound whose predicted biological activity is of interest, and the tool outputs the RCSB PDB IDs of the predicted drug targets. The tool also incorporates a feature to perform automated in silico modelling of the compound and the predicted drug targets to uncover their protein-ligand interaction profiles. The program fetches the structures of the compound and the predicted drug targets, prepares them for molecular docking using standard AutoDock scripts that are part of MGLTools, performs molecular docking and protein-ligand interaction profiling of the targets and the compound, and stores the visualized results in the user's working folder. The program is hosted, supported and maintained at the following GitHub repository: https://github.com/bengeof/Compound2Drug


2019 ◽  
Vol 21 (6) ◽  
pp. 1937-1953 ◽  
Author(s):  
Jussi Paananen ◽  
Vittorio Fortino

Abstract The drug discovery process starts with identification of a disease-modifying target. This critical step traditionally begins with manual investigation of scientific literature and biomedical databases to gather evidence linking molecular target to disease, and to evaluate the efficacy, safety and commercial potential of the target. The high-throughput and affordability of current omics technologies, allowing quantitative measurements of many putative targets (e.g. DNA, RNA, protein, metabolite), has exponentially increased the volume of scientific data available for this arduous task. Therefore, computational platforms identifying and ranking disease-relevant targets from existing biomedical data sources, including omics databases, are needed. To date, more than 30 drug target discovery (DTD) platforms exist. They provide information-rich databases and graphical user interfaces to help scientists identify putative targets and pre-evaluate their therapeutic efficacy and potential side effects. Here we survey and compare a set of popular DTD platforms that utilize multiple data sources and omics-driven knowledge bases (either directly or indirectly) for identifying drug targets. We also provide a description of omics technologies and related data repositories which are important for DTD tasks.


2018 ◽  
Vol 65 (2) ◽  
pp. 209-218 ◽  
Author(s):  
Tariq Ismail ◽  
Nighat Fatima ◽  
Syed Aun Muhammad ◽  
Syed Saoud Zaidi ◽  
Nisar Rehman ◽  
...  

Candida albicans (C. albicans) is one of the major sources of nosocomial infections in humans, which may prove fatal in 30% of cases. These hospital-acquired infections are very difficult to treat effectively because of drug-resistant pathogenic strains; therefore, there is a need to find alternative drug targets to cure this infection. An in silico computational framework was used to prioritize and establish antifungal drug targets of Candida albicans. The identification of putative drug targets was based on acquiring all 5090 annotated genes of Candida albicans from available databases, which were categorized into essential and non-essential genes. The results indicated that 9% of proteins were essential and could become potential candidates for intervention that might result in pathogen death. We studied clusters of orthologs and performed a subtractive genomic analysis of these essential proteins against the human genome as a reference to minimize side effects. It was seen that 14% of Candidal proteins were evolutionarily related to human proteins, while 86% were non-human homologs. In the next step, for the selection of compatible drug targets, the non-human homologs were sequentially compared to human microbiome data to minimize potential effects against gut flora, which accumulated to 38% of the essential genome. The subcellular localization of these candidate proteins in fungal cellular systems showed that 80% are cytoplasmic, 10% are mitochondrial, and the remaining 10% are associated with the cell wall. The role of these non-human and non-gut flora putative target proteins in Candidal biological pathways was studied, and on the basis of their integrated and critical role, four proteins were selected for molecular modeling. For drug design and development, five high-quality, reliable protein models with more than 70% homology were constructed. Our study provides an effective framework for drug target identification in pathogenic microbial strains and for the development of new therapies against these infections.
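The subtractive filtering steps described above (essential proteins, minus human homologs, minus gut-flora homologs) can be sketched as a simple set-based filter. This is only an illustration of the idea: the protein identifiers, the homolog sets, and the function name are hypothetical, and a real analysis would derive the homolog sets from sequence-similarity searches rather than from hand-written lists.

```python
# Minimal sketch of subtractive filtering, assuming the homolog sets have
# already been computed elsewhere. All identifiers are hypothetical.

def subtractive_filter(essential, human_homologs, gut_flora_homologs):
    """Keep essential pathogen proteins with no human or gut-microbiome homolog."""
    # Step 1: drop proteins with a human homolog (to limit host side effects).
    non_human = [p for p in essential if p not in human_homologs]
    # Step 2: drop proteins with a homolog in the gut microbiome.
    return [p for p in non_human if p not in gut_flora_homologs]


essential = ["P1", "P2", "P3", "P4", "P5"]  # hypothetical essential proteins
human_homologs = {"P2"}                      # hypothetical human homologs
gut_flora_homologs = {"P4", "P5"}            # hypothetical gut-flora homologs

candidates = subtractive_filter(essential, human_homologs, gut_flora_homologs)
print(candidates)  # ['P1', 'P3']
```

Applying the filters in sequence mirrors the order given in the abstract and preserves the input ordering of the surviving candidates.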


2021 ◽  
Author(s):  
Shengya Cao ◽  
Nadia Martinez-Martin

Technological improvements in unbiased screening have accelerated drug target discovery. In particular, membrane-embedded and secreted proteins have gained attention because of their ability to orchestrate intercellular communication. Dysregulation of their extracellular protein–protein interactions (ePPIs) underlies the initiation and progression of many human diseases. Practically, ePPIs are also accessible for modulation by therapeutics since they operate outside of the plasma membrane. Therefore, it is unsurprising that while these proteins make up about 30% of human genes, they encompass the majority of drug targets approved by the FDA. Even so, most secreted and membrane proteins remain uncharacterized in terms of binding partners and cellular functions. To address this, a number of approaches have been developed to overcome challenges associated with membrane protein biology and ePPI discovery. This chapter will cover recent advances that use high-throughput methods to move towards the generation of a comprehensive network of ePPIs in humans for future targeted drug discovery.


2019 ◽  
Author(s):  
Zoltan Dezso ◽  
Michele Ceccarelli

Abstract. Background: The selection and prioritization of drug targets is a central problem in drug discovery. Computational approaches can leverage the growing number of large-scale human genomics and proteomics data sets for in silico target identification, reducing the cost and the time needed. Results: We developed a machine learning approach that scores proteins to generate a druggability score for novel targets. Our model incorporates 70 protein features, including properties derived from the sequence, features characterizing protein functions, and network properties derived from the protein-protein interaction network. The advantage of this approach is that it is unbiased: even less-studied proteins with limited information about their function can score well, as most of the features are independent of the accumulated literature. We built models on a training set consisting of targets with approved drugs and a negative set of non-drug targets. The machine learning techniques help to identify the most important combination of features differentiating validated targets from non-targets. We validated our predictions on an independent set of clinical trial drug targets, achieving a high accuracy characterized by an AUC of 0.89. Our most predictive features included biological function of proteins, network centrality measures, protein essentiality, tissue specificity, localization and solvent accessibility. Our predictions, based on a small set of 102 validated oncology targets, recovered the majority of known drug targets and identified a novel set of proteins as drug target candidates. Conclusions: We developed a machine learning approach to prioritize proteins according to their similarity to approved drug targets. We have shown that the proposed method is highly predictive on a validation dataset consisting of 277 targets of clinical trial drugs, confirming that our computational approach is an efficient and cost-effective tool for drug target discovery and prioritization. Our predictions were based on oncology targets and cancer-relevant biological functions, resulting in significantly higher scores for targets of oncology clinical trial drugs compared to the scores of targets of trial drugs for other indications. Our approach can be used to make indication-specific drug-target predictions by combining generic druggability features with indication-specific biological functions.
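For readers unfamiliar with the AUC figure quoted above, the following sketch computes the area under the ROC curve as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The labels and scores are invented toy values, not data from the paper, and the function name is an assumption.

```python
# Rank-based ROC AUC on toy data (illustrative only, not the paper's data).

def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = validated target, 0 = non-target; scores are made up.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(roc_auc(labels, scores), 2))  # 0.89 (8 of 9 pairs ranked correctly)
```

The pairwise loop is quadratic in the number of examples, which is fine for a demonstration; library implementations typically sort the scores once instead.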


2020 ◽  
Vol 23 (3) ◽  
pp. 253-268
Author(s):  
Shreya Bhattacharya ◽  
Puja Ghosh ◽  
Debasmita Banerjee ◽  
Arundhati Banerjee ◽  
Sujay Ray

Aim and Objective: One of the challenges to conventional therapies against Mycobacterium tuberculosis is the development of multi-drug resistant pathogenic strains. This study was undertaken to explore new therapeutic targets for antivirulence therapy using the pathogen’s essential hypothetical proteins that serve as virulence factors, which is the essential first step in novel drug design. Methods: Functional annotation of essential hypothetical proteins from Mycobacterium tuberculosis (H37Rv strain) was performed through domain annotation, Gene Ontology analysis, physicochemical characterization and prediction of subcellular localization. Virulence factors among the essential hypothetical proteins were predicted, among which pathogen-specific drug target candidates, non-homologous to human and gut microbiota, were identified. This was followed by druggability and spectrum analysis of the identified targets. Results and conclusion: The study successfully assigned functions to 83 essential hypothetical proteins of Mycobacterium tuberculosis, among which 25 were identified as virulence factors. Of these 25, 12 virulence factors were identified as potential pathogen-specific drug target candidates. Nine potential targets had druggable properties and the remaining three were considered novel targets. Exploration of these targets will provide new insights into future drug development. Characterization of subcellular localization revealed that most of the predicted targets were cytoplasmic, which could be ideal for intracellular drugs, while two drug targets were membrane-bound, ideal for vaccines. Spectrum analysis identified one broad-spectrum and 11 narrow-spectrum targets. This study should therefore stimulate the design of novel therapeutics for antivirulence therapy, which has the potential to replace conventional antibiotic therapies and overcome the lethality of antibiotic-resistant strains.

