Drug–target prediction utilizing heterogeneous bio-linked network embeddings

Author(s):  
Nansu Zong ◽  
Rachael Sze Nga Wong ◽  
Yue Yu ◽  
Andrew Wen ◽  
Ming Huang ◽  
...  

Abstract: To enable modularization for network-based prediction, we reviewed known methods for the subtasks involved in building a drug–target prediction framework and benchmarked them to determine the highest-performing approaches. Our contributions are as follows: (i) from a network perspective, we benchmarked the association-mining performance of 32 distinct subnetwork permutations drawn from a comprehensive heterogeneous biomedical network derived from 12 repositories; (ii) from a methodological perspective, we identified the best prediction strategy by reviewing combinations of off-the-shelf classification, inference and graph embedding methods. Our benchmarking strategy consisted of two series of experiments, totaling six distinct tasks across the two perspectives, to determine the best prediction strategy. We demonstrated that the proposed method outperformed existing network-based methods, and showed how combinations of networks and methodologies can influence prediction. In addition, we conducted disease-specific prediction tasks for 20 distinct diseases and showed the reliability of the strategy by predicting 75 novel drug–target associations, validated against DrugBank 5.1.0. In particular, we revealed a connection between the network topology and the biological explanations for the predictions for the diseases ‘Asthma’, ‘Hypertension’ and ‘Dementia’. Our benchmarking produced knowledge on a network-based prediction framework that modularizes feature selection and association prediction and can be easily adapted and extended to other feature sources or machine learning algorithms, as well as a baseline for comprehensively evaluating the utility of incorporating varying data sources.

2019 ◽  
Author(s):  
Nansu Zong ◽  
Rachael Sze Nga Wong ◽  
Victoria Ngo ◽  
Yue Yu ◽  
Ning Li

Abstract

Motivation: Although existing classification- and inference-based machine learning methods show promising results in drug-target prediction, they have inherent limitations: 1) classification-based methods often yield biased results because they lack negative samples, and 2) inference-based methods cannot explore novel drug-target associations involving new (or isolated) drugs/targets. As big data continues to grow, there is a need for a scalable, robust and accurate solution that can process large heterogeneous datasets and yield valuable predictions.

Results: We introduce a drug-target prediction method that improves on our previously proposed method in three respects: 1) we constructed a heterogeneous network that incorporates 12 repositories and includes 7 types of biomedical entities (20,119 entities, 194,296 associations); 2) we enhanced the feature learning method with Node2Vec, a scalable state-of-the-art feature learning method; and 3) we integrated the originally proposed inference-based model with a classification model, which is further fine-tuned by a negative sample selection algorithm. The proposed method achieves better drug–target association prediction than existing methods, with a 95.3% AUC ROC score in 10-fold cross-validation tests. We studied biased learning/testing in network-based pairwise prediction and identified the best training strategy. Finally, we conducted a disease-specific prediction task based on 20 diseases. New drug-target associations were successfully predicted with an average AUC ROC of 97.2% (validated against DrugBank 5.1.0). The experiments showed the reliability of the proposed method in predicting novel drug-target associations for disease treatment.
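As a rough illustration of the kind of feature learning Node2Vec performs, the sketch below generates one second-order biased random walk over a toy heterogeneous graph. It is not the authors' implementation: the graph, the node names, and the values of the return parameter `p` and in-out parameter `q` are all assumptions chosen for the example. In a full pipeline, many such walks would be fed to a skip-gram model to obtain node embeddings.

```python
import random


def node2vec_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Generate one second-order biased random walk in the style of Node2Vec.

    adj    -- dict mapping each node to the set of its neighbours (undirected)
    p      -- return parameter: a small p makes stepping back more likely
    q      -- in-out parameter: a small q makes moving outward more likely
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbours = sorted(adj[cur])  # sorted for reproducibility
        if not neighbours:
            break  # isolated node: the walk cannot continue
        if len(walk) == 1:
            # The first step has no previous node, so it is uniform.
            walk.append(rng.choice(neighbours))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbours:
            if nxt == prev:
                weights.append(1.0 / p)  # step back to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)      # stay at distance 1 from prev
            else:
                weights.append(1.0 / q)  # move further away from prev
        walk.append(rng.choices(neighbours, weights=weights, k=1)[0])
    return walk


# Toy heterogeneous graph; every node name here is hypothetical.
edges = [
    ("drug:A", "target:T1"),
    ("drug:B", "target:T1"),
    ("drug:B", "target:T2"),
    ("target:T2", "disease:D1"),
    ("drug:A", "disease:D1"),
]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

walk = node2vec_walk(adj, "drug:A", length=8, p=0.5, q=2.0,
                     rng=random.Random(0))
print(walk)
```

Because the walk crosses drug, target and disease nodes alike, the resulting sequences mix entity types, which is one plausible way a heterogeneous network can inform the features of a drug–target pair.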


2012 ◽  
Vol 2012 ◽  
pp. 1-10 ◽  
Author(s):  
Yong Wang ◽  
Zhongyang Liu ◽  
Chun Li ◽  
Dong Li ◽  
Yulin Ouyang ◽  
...  

In this paper, we present a case study of Qishenkeli (QSKL) to investigate the underlying molecular mechanism of traditional Chinese medicine (TCM), based on drug target prediction, analyses of TCM chemical components and subsequent experimental validation. First, after determining the constituent compounds of QSKL, we use drugCIPHER-CS to predict their potential drug targets. These potential targets are significantly enriched with known cardiovascular disease-related drug targets. We then find that these potential drug targets are significantly enriched in the biological processes of neuroactive ligand-receptor interaction, aminoacyl-tRNA biosynthesis, the calcium signaling pathway, glycine, serine and threonine metabolism, and the renin-angiotensin system (RAAS), among others. Next, an animal model of coronary heart disease (CHD) induced by left anterior descending coronary artery ligation is used to validate the predicted pathways. The RAAS pathway is selected as an example, and the results show that QSKL has an effect on both renin and the angiotensin II receptor (AT1R), which eventually downregulates angiotensin II (AngII). Bioinformatics combined with experimental verification can provide a credible and objective method for understanding the complicated multitarget mechanisms of Chinese herbal formulas.


2021 ◽  
Vol 22 (10) ◽  
pp. 5118
Author(s):  
Matthieu Najm ◽  
Chloé-Agathe Azencott ◽  
Benoit Playe ◽  
Véronique Stoven

Identification of the protein targets of hit molecules is essential in the drug discovery process. Target prediction with machine learning algorithms can help accelerate this search, limiting the number of required experiments. However, the drug-target interaction databases used for training present high statistical bias, leading to a high number of false positives and thus increasing the time and cost of experimental validation campaigns. To minimize the number of false positives among predicted targets, we propose a new scheme for choosing negative examples, so that each protein and each drug appears an equal number of times in positive and negative examples. We artificially reproduce the process of target identification for three specific drugs, and more globally for 200 approved drugs. For the three detailed drug examples, and for the larger set of 200 drugs, training with the proposed scheme for choosing negative examples improved target prediction results: the average number of false positives among the top-ranked predicted targets decreased, and overall, the rank of the true targets improved. Our method corrects the databases’ statistical bias and reduces the number of false positive predictions, and therefore the number of useless experiments potentially undertaken.
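The balancing idea described above can be sketched with a simple greedy procedure. This is only an illustrative heuristic under stated assumptions, not the authors' algorithm: the function name `balanced_negatives`, the toy pairs, and the deterministic tie-breaking are all hypothetical, and a greedy pass may fall short of exact balance on denser interaction sets.

```python
from collections import Counter


def balanced_negatives(positives):
    """Greedily pick negative (drug, protein) pairs so that each drug and
    each protein appears at most as often in negatives as in positives."""
    pos = set(positives)
    drug_quota = Counter(d for d, _ in pos)  # negatives owed per drug
    prot_quota = Counter(t for _, t in pos)  # negatives still owed per protein
    chosen = set()
    negatives = []
    # Handle the drugs with the most positives first; ties broken by name.
    for drug in sorted(drug_quota, key=lambda d: (-drug_quota[d], d)):
        for _ in range(drug_quota[drug]):
            candidates = [
                t for t in prot_quota
                if prot_quota[t] > 0
                and (drug, t) not in pos
                and (drug, t) not in chosen
            ]
            if not candidates:
                break  # no valid protein left: this drug stays under quota
            # Prefer the protein that still needs the most negatives.
            target = min(candidates, key=lambda t: (-prot_quota[t], t))
            chosen.add((drug, target))
            negatives.append((drug, target))
            prot_quota[target] -= 1
    return negatives


# Hypothetical known interactions: one target per drug.
positives = [("d1", "t1"), ("d2", "t2"), ("d3", "t3"), ("d4", "t4")]
negatives = balanced_negatives(positives)
print(negatives)
```

On this toy input every drug and every protein ends up with one positive and one negative example. Sampling negatives uniformly at random instead would tend to over-represent rarely annotated proteins among the negatives, which is the kind of bias the balanced scheme aims to avoid.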


2010 ◽  
Vol 21 (4) ◽  
pp. 511-516 ◽  
Author(s):  
Edda Klipp ◽  
Rebecca C Wade ◽  
Ursula Kummer

2020 ◽  
Vol 12 (1) ◽  
Author(s):  
Noé Sturm ◽  
Andreas Mayr ◽  
Thanh Le Van ◽  
Vladimir Chupakhin ◽  
Hugo Ceulemans ◽  
...  

2012 ◽  
Vol 28 (18) ◽  
pp. i611-i618 ◽  
Author(s):  
M. Takarabe ◽  
M. Kotera ◽  
Y. Nishimura ◽  
S. Goto ◽  
Y. Yamanishi
