SSnet: A Deep Learning Approach for Protein-Ligand Interaction Prediction

2021 ◽  
Vol 22 (3) ◽  
pp. 1392
Author(s):  
Niraj Verma ◽  
Xingming Qu ◽  
Francesco Trozzi ◽  
Mohamed Elsaied ◽  
Nischal Karki ◽  
...  

Computational prediction of Protein-Ligand Interaction (PLI) is an important step in the modern drug discovery pipeline as it mitigates the cost, time, and resources required to screen novel therapeutics. Deep Neural Networks (DNN) have recently shown excellent performance in PLI prediction. However, the performance is highly dependent on protein and ligand features utilized for the DNN model. Moreover, in current models, the deciphering of how protein features determine the underlying principles that govern PLI is not trivial. In this work, we developed a DNN framework named SSnet that utilizes secondary structure information of proteins extracted as the curvature and torsion of the protein backbone to predict PLI. We demonstrate the performance of SSnet by comparing against a variety of currently popular machine and non-Machine Learning (ML) models using various metrics. We visualize the intermediate layers of SSnet to show a potential latent space for proteins, in particular to extract structural elements in a protein that the model finds influential for ligand binding, which is one of the key features of SSnet. We observed in our study that SSnet learns information about locations in a protein where a ligand can bind, including binding sites, allosteric sites and cryptic sites, regardless of the conformation used. We further observed that SSnet is not biased to any specific molecular interaction and extracts the protein fold information critical for PLI prediction. Our work forms an important gateway to the general exploration of secondary structure-based Deep Learning (DL), which is not just confined to protein-ligand interactions, and as such will have a large impact on protein research, while being readily accessible for de novo drug designers as a standalone package.
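The abstract above describes encoding the protein backbone by its curvature and torsion. As a rough illustration of what those two quantities are, the sketch below estimates them for an ordered list of 3D points (for example, C-alpha coordinates) using central finite differences and the standard formulas κ = |r′ × r″| / |r′|³ and τ = (r′ × r″) · r‴ / |r′ × r″|². This is an assumption-laden sketch, not the SSnet procedure: the function name `curvature_torsion` and the finite-difference scheme are illustrative choices, and the abstract does not say how the authors compute these values.

```python
import math


def _cross(a, b):
    """Cross product of two 3D vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    """Dot product of two 3D vectors given as tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def curvature_torsion(points):
    """Estimate (curvature, torsion) at each interior point of a 3D polyline.

    `points` is an ordered sequence of (x, y, z) tuples. Values are returned
    for indices 2 .. len(points) - 3, because the third-derivative stencil
    needs two neighbours on each side. Hypothetical helper, for illustration.
    """
    result = []
    for i in range(2, len(points) - 2):
        pm2, pm1, p0, pp1, pp2 = points[i - 2:i + 3]
        # Central differences with unit parameter step (the point index).
        d1 = tuple((pp1[k] - pm1[k]) / 2 for k in range(3))
        d2 = tuple(pp1[k] - 2 * p0[k] + pm1[k] for k in range(3))
        d3 = tuple((pp2[k] - 2 * pp1[k] + 2 * pm1[k] - pm2[k]) / 2
                   for k in range(3))
        c = _cross(d1, d2)
        c_sq = _dot(c, c)
        speed = math.sqrt(_dot(d1, d1))
        # Degenerate cases (repeated points, straight segments) map to 0.
        kappa = math.sqrt(c_sq) / speed ** 3 if speed > 0 else 0.0
        tau = _dot(c, d3) / c_sq if c_sq > 0 else 0.0
        result.append((kappa, tau))
    return result
```

A convenient sanity check is a finely sampled circular helix (cos t, sin t, c·t) of radius r = 1, whose analytical curvature and torsion are r / (r² + c²) and c / (r² + c²). Real backbone coordinates are sampled far more coarsely than that, so in practice a smoothing fit through the atoms would probably be needed before differentiating.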

Author(s):  
Niraj Verma ◽  
Xingming Qu ◽  
Francesco Trozzi ◽  
Mohamed Elsaied ◽  
Nischal Karki ◽  
...  

Computational prediction of Protein-Ligand Interaction (PLI) is an important step in the modern drug discovery pipeline as it mitigates the cost, time, and resources required to screen novel therapeutics. Deep Neural Networks (DNN) have recently shown excellent performance in PLI prediction. However, the performance is highly dependent on protein and ligand features utilized for the DNN model. Moreover, in current models, the deciphering of how protein features determine the underlying principles that govern PLI is not trivial. In this work, we developed a DNN framework named SSnet that utilizes secondary structure information of proteins extracted as the curvature and torsion of the protein backbone to predict PLI. We demonstrate the performance of SSnet by comparing against a variety of currently popular machine and non-machine learning models using various metrics. We visualize the intermediate layers of SSnet to show a potential latent space for proteins, in particular to extract structural elements in a protein that the model finds influential for ligand binding, which is one of the key features of SSnet. We observed in our study that SSnet learns information about locations in a protein where a ligand can bind, including binding sites, allosteric sites and cryptic sites, regardless of the conformation used. We further observed that SSnet is not biased to any specific molecular interaction and extracts the protein fold information critical for PLI prediction. Our work forms an important gateway to the general exploration of secondary structure-based deep learning, which is not just confined to protein-ligand interactions, and as such will have a large impact on protein research.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Fan Hu ◽  
Jiaxin Jiang ◽  
Dongqi Wang ◽  
Muchun Zhu ◽  
Peng Yin

The assessment of protein–ligand interactions is critical at the early stage of drug discovery. Computational approaches for efficiently predicting such interactions facilitate drug development. Recently, methods based on deep learning, including structure- and sequence-based models, have achieved impressive performance on several different datasets. However, their application still suffers from a generalizability issue because of insufficient data, especially for structure-based models, as well as a heterogeneity problem because of different label measurements and varying proteins across datasets. Here, we present an interpretable multi-task model to evaluate protein–ligand interaction (Multi-PLI). The model can run classification (binding or not) and regression (binding affinity) tasks concurrently by unifying different datasets. The model outperforms traditional docking and machine learning on both binary classification and regression tasks and achieves competitive results compared with some structure-based deep learning methods, even with the same training set size. Furthermore, combined with the proposed occlusion algorithm, the model can predict the important amino acids of proteins that are crucial for binding, thus providing a biological interpretation.
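The abstract above mentions an occlusion algorithm for finding amino acids that matter for binding but gives no details. The sketch below shows the generic idea of occlusion sensitivity: mask a window of residues, re-score, and attribute the drop in the score to the masked positions. It is only an illustration under stated assumptions. The names `occlusion_importance`, `score_fn`, `window`, and `mask_char` are hypothetical, and `score_fn` stands in for any trained model that maps a sequence to a binding score; none of this is taken from Multi-PLI itself.

```python
def occlusion_importance(sequence, score_fn, window=1, mask_char="X"):
    """Per-residue importance from sliding-window occlusion.

    `score_fn` is any callable mapping a sequence string to a float score
    (a placeholder for a trained model). Each residue's importance is the
    mean drop in score over all windows that cover it.
    """
    baseline = score_fn(sequence)
    totals = [0.0] * len(sequence)
    counts = [0] * len(sequence)
    for start in range(len(sequence) - window + 1):
        # Replace the window with a neutral mask symbol and re-score.
        masked = sequence[:start] + mask_char * window + sequence[start + window:]
        drop = baseline - score_fn(masked)
        for j in range(start, start + window):
            totals[j] += drop
            counts[j] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


def toy_score(sequence):
    """Toy stand-in for a model: fraction of histidine residues."""
    return sequence.count("H") / len(sequence)
```

With the toy scorer, masking the single histidine in a short sequence is the only occlusion that changes the score, so that position receives all of the importance. With a real model, the choice of mask symbol and window size is a design decision that can noticeably change the resulting map.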


2014 ◽  
Vol 2014 ◽  
pp. 1-11 ◽  
Author(s):  
Tzu-Chieh Hung ◽  
Wen-Yuan Lee ◽  
Kuen-Bao Chen ◽  
Yueh-Chiu Chan ◽  
Calvin Yu-Chian Chen

An important report on liver tumorigenesis was published in 2013. In that report, Ras and Rho were linked to liver tumorigenesis. The traditional Chinese medicine (TCM) database was screened, using molecular docking and molecular dynamics simulations, for compounds that may regulate Ras and liver tumorigenesis. Saussureamine C, S-allylmercaptocysteine, and tryptophan were selected because they had higher docking scores than the other TCM compounds. Molecular dynamics is helpful in the analysis and detection of protein-ligand interactions. Based on the docking poses, hydrophobic interactions, and hydrogen bond variations, this research surmises the main regions of important amino acids in Ras. In addition to assessing the efficacy of the TCM compounds, we suggest that Saussureamine C is better than the others for protein-ligand interaction.


2020 ◽  
Vol 44 (42) ◽  
pp. 18250-18255
Author(s):  
Lunxi Duan ◽  
Hongliang Yao ◽  
Yong Xie ◽  
Ke Pan

Label-free fluorescence monitoring of protein–ligand interaction based on binding-induced enzymatic cleavage protection.


2017 ◽  
Vol 73 (3) ◽  
pp. 195-202 ◽  
Author(s):  
Daria A. Beshnova ◽  
Joana Pereira ◽  
Victor S. Lamzin

Macromolecular X-ray crystallography is one of the main experimental techniques to visualize protein–ligand interactions. The high complexity of the ligand universe, however, has delayed the development of efficient methods for the automated identification, fitting and validation of ligands in their electron-density clusters. The identification and fitting are primarily based on the density itself and do not take into account the protein environment, which is a step that is only taken during the validation of the proposed binding mode. Here, a new approach, based on the estimation of the major energetic terms of protein–ligand interaction, is introduced for the automated identification of crystallographic ligands in the indicated binding site with ARP/wARP. The applicability of the method to the validation of protein–ligand models from the Protein Data Bank is demonstrated by the detection of models that are 'questionable' and the pinpointing of unfavourable interatomic contacts.


2012 ◽  
Vol 27 ◽  
pp. 373-379 ◽  
Author(s):  
Olga V. Stepanenko ◽  
Olesya V. Stepanenko ◽  
Alexander V. Fonin ◽  
Vladislav V. Verkhusha ◽  
Irina M. Kuznetsova ◽  
...  

In this paper we studied peculiarities of protein-ligand interaction under different conditions. We have shown that guanidine hydrochloride (GdnHCl)-induced unfolding-refolding of GGBP in the presence of glucose (Glc) is reversible, but the equilibrium curves of complex refolding-unfolding were attained only after a 10-day incubation of GGBP/Glc in the presence of GdnHCl. This effect was not observed for heat-induced GGBP/Glc denaturation. The slow equilibration between the native protein in the GGBP/Glc complex and the unfolded state of the protein in the presence of GdnHCl is connected with the increased viscosity of the solution at moderate and high GdnHCl concentrations, which interferes with the diffusion of glucose molecules. Thus, the limiting step of the unfolding-refolding process of the GGBP/Glc complex is the disruption/tuning of the configuration fit between the protein in the native state and the ligand.


2014 ◽  
Vol 2014 ◽  
pp. 1-12 ◽  
Author(s):  
Tzu-Chieh Hung ◽  
Wen-Yuan Lee ◽  
Kuen-Bao Chen ◽  
Yueh-Chiu Chan ◽  
Calvin Yu-Chian Chen

An important report on breast cancer was published in 2013. In that report, estrogen receptor (ESR1) was linked to hormone-caused breast cancer. The traditional Chinese medicine (TCM) database was screened, using molecular docking and molecular dynamics simulations, for compounds that may regulate ESR1. S-Allylmercaptocysteine and 5-hydroxy-L-tryptophan were selected because they had higher docking scores than the other TCM compounds and Raloxifene (control). Molecular dynamics simulation is helpful in analyzing and detecting protein-ligand interactions. After comparing the control and the apo form, and based on the docking poses, hydrophobic interactions, hydrogen bonds, and structure variations, this research postulates that S-allylmercaptocysteine may be more appropriate than the other compounds for protein-ligand interaction.


2020 ◽  
Author(s):  
Ben Geoffrey A S ◽  
Pavan Preetham Valluri ◽  
Akhil Sanker ◽  
Rafal Madaj ◽  
Host Antony Davidd ◽  
...  

Network data is composed of nodes and edges. Machine learning/deep learning algorithms have been successfully applied to network data for node classification and link prediction in the area of social networks, where they are used to offer highly customized suggestions to social network users. Similarly, one can attempt to use machine learning/deep learning algorithms on biological network data to generate predictions of scientific usefulness. In the present work, a compound-drug target interaction data set from BindingDB has been used to train machine learning/deep learning algorithms, which are used to predict the drug targets for any PubChem compound queried by the user. The user inputs the PubChem Compound ID (CID) of the compound whose predicted biological activity is of interest, and the tool outputs the RCSB PDB IDs of the predicted drug targets. The tool also incorporates a feature to perform automated in silico modelling for the compound and the predicted drug targets to uncover their protein-ligand interaction profiles. The program fetches the structures of the compound and the predicted drug targets, prepares them for molecular docking using standard AutoDock scripts that are part of MGLTools, performs molecular docking and protein-ligand interaction profiling of the targets and the compound, and stores the visualized results in the working folder of the user. The program is hosted, supported and maintained at the following GitHub repository: https://github.com/bengeof/Compound2Drug


2019 ◽  
Vol 21 (1) ◽  
pp. 24 ◽  
Author(s):  
Dmitry Karasev ◽  
Boris Sobolev ◽  
Alexey Lagunin ◽  
Dmitry Filimonov ◽  
Vladimir Poroikov

The affinity of different drug-like ligands to multiple protein targets reflects general chemical–biological interactions. Computational methods estimating such interactions analyze the available information about the structure of the targets, ligands, or both. Prediction of protein–ligand interactions based on pairwise sequence alignment provides reasonable accuracy if the ligands’ specificity coincides well with the phylogenetic taxonomy of the proteins. Methods using multiple alignment require an accurate match of functionally significant residues. Such conditions may not be met in the case of diverged protein families. To overcome these limitations, we propose an approach based on the analysis of local sequence similarity within the set of analyzed proteins. The positional scores, calculated by sequence fragment comparisons, are used as input data for the Bayesian classifier. Our approach provides a prediction accuracy comparable to or exceeding that of other methods. This was demonstrated on the popular Gold Standard test sets, which differ in sequence heterogeneity and range from a group including different protein families to more specific groups. A reasonable prediction accuracy was also found for protein kinases, which display weak relationships between sequence phylogeny and inhibitor specificity. Thus, our method can be applied to the broad area of protein–ligand interactions.


Author(s):  
Lieyang Chen ◽  
Anthony Cruz ◽  
Steven Ramsey ◽  
Callum J. Dickson ◽  
José S. Duca ◽  
...  

Recently much effort has been invested in using convolutional neural network (CNN) models trained on 3D structural images of protein-ligand complexes to distinguish binding from non-binding ligands for virtual screening. However, the dearth of reliable protein-ligand X-ray structures and binding affinity data has required the use of constructed datasets for the training and evaluation of CNN molecular recognition models. Here, we outline various sources of bias in one such widely used dataset, the Directory of Useful Decoys: Enhanced (DUD-E). We have constructed and performed tests to investigate whether CNN models developed using DUD-E are properly learning the underlying physics of molecular recognition, as intended, or are instead learning biases inherent in the dataset itself. We find that superior enrichment efficiency in CNN models can be attributed to the analogue and decoy bias hidden in the DUD-E dataset rather than successful generalization of the pattern of protein-ligand interactions. Comparing additional deep learning models trained on PDBbind datasets, we found that their enrichment performances using DUD-E are not superior to the performance of the docking program AutoDock Vina. Together, these results suggest that biases that could be present in constructed datasets should be thoroughly evaluated before applying them to machine learning-based methodology development.
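The abstract above compares models by their enrichment performance in virtual screening without naming the exact metric. One commonly used measure is the enrichment factor: the hit rate among the top-ranked fraction of compounds divided by the hit rate in the whole library. The sketch below computes it under that assumption. The function name `enrichment_factor` and its parameters are illustrative, and this may not be the metric the authors report.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor of a ranked screen at the given top fraction.

    `scores`: predicted scores, higher meaning more likely to bind.
    `labels`: 1 for known actives, 0 for decoys, aligned with `scores`.
    `fraction`: share of the ranked list to inspect (0.01 = top 1%).
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and aligned")
    total_actives = sum(labels)
    if total_actives == 0:
        raise ValueError("at least one active is required")
    n = len(scores)
    n_top = max(1, int(round(fraction * n)))
    # Rank compounds from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    actives_in_top = sum(label for _, label in ranked[:n_top])
    return (actives_in_top / n_top) / (total_actives / n)
```

A value of 1 corresponds to random ranking, and larger values mean actives are concentrated near the top. As the abstract cautions, a high value on a constructed benchmark can reflect dataset bias rather than learned molecular recognition, so the number is only as informative as the decoy set behind it.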

