Supporting Binding-Sites Discovery via Iterative Database Processing

Author(s):  
Ran Tel-Nir ◽  
Roy Gelbard ◽  
Israel Spiegler

Recognition of binding sites common to a set of protein structures is important for applications such as drug design. Common binding-site recognition methods are based on heuristic algorithms that use summarized spatial data and superimposition techniques. However, these computational operations generally do not store intermediate data for further calculation and information extraction. The current study presents an alternative approach to binding calculation by introducing a binary representation scheme for three-dimensional molecule data and a fast iterative algorithm that obviates the need to calculate and resolve spatial transformations in the binding-site extraction process. This is achieved by using relational database indexing methods and an efficient iterative model. This general-purpose iterative algorithm was tested for binding small molecules. The results show that the method can be applied efficiently for binding-site extraction and bio-information extraction. The binary representation improves performance by reducing processing time by 31% compared to typical representations.

2008 ◽  
Vol 06 (02) ◽  
pp. 335-345 ◽  
Author(s):  
ALEKSANDAR POLEKSIC ◽  
MARK FIENUP ◽  
JOSEPH F. DANZER ◽  
DEREK A. DEBE

Measuring the accuracy of protein three-dimensional structures is one of the most important problems in protein structure prediction. For structure-based drug design, the accuracy of the binding site is far more important than the accuracy of any other region of the protein. We have developed an automated method for assessing the quality of a protein model by focusing on the set of residues in the small molecule binding site. Small molecule binding sites typically involve multiple regions of the protein coming together in space, and their accuracy has been observed to be sensitive to even small alignment errors. In addition, ligand binding sites contain the critical information required for drug design, making their accuracy particularly important. We analyzed the accuracy of the binding sites on two sets of protein models: the predictions submitted by the top-performing CASP7 groups, and the models generated by four widely used homology modeling packages. The results of our CASP7 analysis significantly differ from the previous findings, implying that the binding site measure does not correlate with the traditional model quality measures used in the structure prediction benchmarks. For the modeling programs, the resolution of binding sites is extremely sensitive to the degree of sequence homology between the query and the template, even when the most accurate alignments are used in the homology modeling process.


2006 ◽  
Vol 398 (3) ◽  
pp. 393-398 ◽  
Author(s):  
Thomas Gossas ◽  
U. Helena Danielson

Matrix metallopeptidase-12 (MMP-12) binds three calcium ions and a zinc ion, in addition to the catalytic zinc ion. These ions are thought to have a structural role, stabilizing the active conformation of the enzyme. To characterize the importance of Ca2+ binding for MMP-12 activity and the properties of the different Ca2+ sites, the activity as a function of [Ca2+] and the effect of pH were investigated. The enzymatic activity was directly correlated to calcium binding, and a Langmuir isotherm for three binding sites described the activity as a function of [Ca2+]. The affinities for two of the binding sites were quantified at several pH values. At pH 7.5, the KD was 0.1 mM for the high-affinity binding site, 5 mM for the intermediate-affinity binding site and >100 mM for the low-affinity binding site. For all three sites, the affinity for calcium decreased with reduced pH, in accordance with the loss of interactions upon protonation of the calcium-co-ordinating aspartate and glutamate carboxylates at acidic pH. The pKa values of the calcium binding sites with the highest and intermediate affinities were determined to be 4.3 and 6.5 respectively. Optimal pH for catalysis was above 7.5. The low-, intermediate- and high-affinity binding sites were assigned on the basis of analysis of three-dimensional structures of MMP-12. The strong correlation between MMP-12 activity and calcium binding for the physiologically relevant [Ca2+] and pH ranges studied suggests that Ca2+ may be involved in controlling the activity of MMP-12.
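
As a rough illustration of the single-site Langmuir isotherm that underlies the three-site description above, the sketch below computes the fractional occupancy [Ca2+] / (KD + [Ca2+]) for the two sites whose KD values are quoted at pH 7.5. The abstract does not say how the three sites were combined into an activity curve, so the function name and the treatment of each site as independent are assumptions made here for illustration only.

```python
def langmuir_occupancy(conc_mM: float, kd_mM: float) -> float:
    """Fractional occupancy of one binding site: [L] / (KD + [L])."""
    if conc_mM < 0 or kd_mM <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return conc_mM / (kd_mM + conc_mM)


# KD values quoted in the abstract for pH 7.5 (in mM).
KD_HIGH_AFFINITY_MM = 0.1
KD_INTERMEDIATE_AFFINITY_MM = 5.0

if __name__ == "__main__":
    # Hypothetical calcium concentrations chosen only to show the trend.
    for conc in (0.1, 1.0, 5.0):
        high = langmuir_occupancy(conc, KD_HIGH_AFFINITY_MM)
        mid = langmuir_occupancy(conc, KD_INTERMEDIATE_AFFINITY_MM)
        print(f"[Ca2+] = {conc} mM: high-affinity {high:.2f}, intermediate {mid:.2f}")
```

A site is half occupied when the concentration equals its KD, which is why the high-affinity site is already half saturated at 0.1 mM while the intermediate site needs about 5 mM.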


2013 ◽  
Vol 368 (1632) ◽  
pp. 20130029 ◽  
Author(s):  
Harendra Guturu ◽  
Andrew C. Doxey ◽  
Aaron M. Wenger ◽  
Gill Bejerano

Mapping the DNA-binding preferences of transcription factor (TF) complexes is critical for deciphering the functions of cis-regulatory elements. Here, we developed a computational method that compares co-occurring motif spacings in conserved versus unconserved regions of the human genome to detect evolutionarily constrained binding sites of rigid TF complexes. Structural data were used to estimate TF complex physical plausibility, explore overlapping motif arrangements seldom tackled by non-structure-aware methods, and generate and analyse three-dimensional models of the predicted complexes bound to DNA. Using this approach, we predicted 422 physically realistic TF complex motifs at 18% false discovery rate, the majority of which (326, 77%) contain some sequence overlap between binding sites. The set of mostly novel complexes is enriched in known composite motifs, predictive of binding site configurations in TF–TF–DNA crystal structures, and supported by ChIP-seq datasets. Structural modelling revealed three cooperativity mechanisms: direct protein–protein interactions, potentially indirect interactions and ‘through-DNA’ interactions. Indeed, 38% of the predicted complexes were found to contain four or more bases in which TF pairs appear to synergize through overlapping binding to the same DNA base pairs in opposite grooves or strands. Our TF complex and associated binding site predictions are available as a web resource at http://bejerano.stanford.edu/complex.


2021 ◽  
Author(s):  
Vineeth Chelur ◽  
U. Deva Priyakumar

Protein-drug interactions play important roles in many biological processes and therapeutics. Prediction of the active binding site of a protein helps discover and optimise these interactions leading to the design of better ligand molecules. The tertiary structure of a protein determines the binding sites available to the drug molecule. A quick and accurate prediction of the binding site from sequence alone without utilising the three-dimensional structure is challenging. Deep Learning has been used in a variety of biochemical tasks and has been hugely successful. In this paper, a Residual Neural Network (leveraging skip connections) is implemented to predict a protein's most active binding site. An Annotated Database of Druggable Binding Sites from the Protein DataBank, sc-PDB, is used for training the network. Features extracted from the Multiple Sequence Alignments (MSAs) of the protein generated using DeepMSA, such as Position-Specific Scoring Matrix (PSSM), Secondary Structure (SS3), and Relative Solvent Accessibility (RSA), are provided as input to the network. A weighted binary cross-entropy loss function is used to counter the substantial imbalance in the two classes of binding and non-binding residues. The network performs very well on single-chain proteins, providing a pocket that has good interactions with a ligand.
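
The weighted binary cross-entropy mentioned above can be sketched as follows. This is a minimal pure-Python illustration rather than the authors' implementation: the function name, the choice to weight only the positive (binding) class, and the ratio-based weight are assumptions, since the abstract does not give the exact weighting scheme.

```python
import math


def weighted_bce(labels, probs, pos_weight, eps=1e-7):
    """Mean binary cross-entropy with the positive-class term scaled by pos_weight.

    labels: sequence of 0/1 values (1 = binding residue).
    probs:  predicted probabilities of the positive class, same length as labels.
    """
    if len(labels) != len(probs) or len(labels) == 0:
        raise ValueError("labels and probs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


if __name__ == "__main__":
    # Hypothetical toy data: one binding residue among five.
    labels = [0, 0, 0, 0, 1]
    probs = [0.1, 0.2, 0.1, 0.3, 0.4]
    # One common (assumed) choice: ratio of negative to positive examples.
    pos_weight = labels.count(0) / labels.count(1)
    print(weighted_bce(labels, probs, pos_weight))
```

Scaling the positive term makes a missed binding residue cost more than a false alarm on a non-binding residue, which offsets the scarcity of binding residues in the training data.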


Author(s):  
Igor Kozlovskii ◽  
Petr Popov

Identification of novel protein binding sites expands the “druggable genome” and opens new opportunities for drug discovery. Generally, the presence or absence of a binding site depends on the three-dimensional conformation of a protein, which makes binding site identification resemble the object detection problem in computer vision. Here we introduce a computational approach for the large-scale detection of protein binding sites, named BiteNet, that treats protein conformations as 3D images, binding sites as the objects to detect in these images, and conformational ensembles of proteins as 3D videos to analyze. BiteNet is suitable for spatiotemporal detection of hard-to-spot allosteric binding sites, as we showed for a conformation-specific binding site of the epidermal growth factor receptor, an oligomer-specific binding site of an ion channel, and binding sites in G protein-coupled receptors. BiteNet outperforms state-of-the-art methods in terms of both accuracy and speed, taking about 1.5 minutes to analyze 1000 conformations of a protein with 2000 atoms. BiteNet is available at https://github.com/i-Molecule/bitenet.


2022 ◽  
Author(s):  
Adam Zemla ◽  
Jonathan E. Allen ◽  
Dan Kirshner ◽  
Felice C. Lightstone

We present a structure-based method for finding and evaluating structural similarities in protein regions relevant to ligand binding. PDBspheres comprises an exhaustive library of protein structure regions (spheres) adjacent to complexed ligands derived from the Protein Data Bank (PDB), along with methods to find and evaluate structural matches between a protein of interest and spheres in the library. Currently, the PDBspheres library contains more than 2 million spheres, organized to facilitate searches by sequence and/or structure similarity of protein-ligand binding sites or interfaces between interacting molecules. PDBspheres uses the LGA structure alignment algorithm as the main engine for detecting structure similarities between the protein of interest and library spheres. An all-atom structure similarity metric ensures that sidechain placement is taken into account in the PDBspheres primary assessment of confidence in structural matches. In this paper, we (1) describe the PDBspheres method, (2) demonstrate how PDBspheres can be used to detect and characterize binding sites in protein structures, (3) compare the use of PDBspheres for binding site prediction with seven other binding site prediction methods using a curated dataset of 2,528 ligand-bound and ligand-free crystal structures, and (4) use PDBspheres to cluster pockets and assess structural similarities among protein binding sites of the 4,876 structures in the refined set of the PDBbind 2019 dataset. The PDBspheres library is publicly available for download at https://proteinmodel.org/AS2TS/PDBspheres


2021 ◽  
Author(s):  
Rishal Aggarwal ◽  
Akash Gupta ◽  
Vineeth Chelur ◽  
C. V. Jawahar ◽  
U. Deva Priyakumar

A structure-based drug design pipeline involves the development of potential drug molecules or ligands that form stable complexes with a given receptor at its binding site. A prerequisite to this is finding druggable and functionally relevant binding sites on the 3D structure of the protein. Although several methods for detecting binding sites have been developed, a majority of them surprisingly fail to identify and rank binding sites accurately. The rapid adoption and success of deep learning algorithms in various areas of structural biology invite the use of such algorithms for accurate binding site detection. As a combination of geometry-based software and deep learning, we report a novel framework, DeepPocket, that utilises 3D convolutional neural networks for the rescoring of pockets identified by Fpocket and further segments these identified cavities on the protein surface. Apart from this, we also propose another dataset, SC6K, containing protein structures submitted to the Protein Data Bank (PDB) from January 2018 to February 2020 for ligand binding site (LBS) detection. DeepPocket's results on various binding site datasets and SC6K highlight its better performance over current state-of-the-art methods and good generalization ability over novel structures.


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Mingjian Jiang ◽  
Zhen Li ◽  
Yujie Bian ◽  
Zhiqiang Wei

Background: Binding sites are the pockets of proteins that can bind drugs; the discovery of these pockets is a critical step in drug design. Computational prediction of protein pockets can save manpower and financial resources. Results: In this paper, a novel protein descriptor for the prediction of binding sites is proposed. Information on non-bonded interactions in the three-dimensional structure of a protein is captured by a combination of geometry-based and energy-based methods. Moreover, taking advantage of the rapid development of deep learning, all binding features are extracted to generate three-dimensional grids that are fed into a convolutional neural network. Two datasets were used in the experiment. The sc-PDB dataset was used for descriptor extraction and binding site prediction, and the PDBbind dataset was used only for testing and verification of the generalization of the method. The comparison with previous methods shows that the proposed descriptor is effective in predicting the binding sites. Conclusions: A new protein descriptor is proposed for the prediction of the drug binding sites of proteins. This method combines the three-dimensional structure of a protein and non-bonded interactions with small molecules to capture important factors influencing the formation of binding sites. Analysis of the experiments indicates that the descriptor is robust for site prediction.
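
The step of turning a structure into a three-dimensional grid can be pictured with the sketch below, which bins atom coordinates into a regular voxel grid of atom counts. It is only an assumed, simplified stand-in: the descriptor in the abstract combines geometry-based and energy-based features whose details are not given here, and the function name, grid parameters and coordinates are hypothetical.

```python
import math


def voxelize(coords, origin, spacing, shape):
    """Count atoms per voxel on a regular grid.

    coords:  iterable of (x, y, z) atom positions.
    origin:  (x, y, z) of the grid corner with the smallest coordinates.
    spacing: edge length of one cubic voxel, in the same units as coords.
    shape:   (nx, ny, nz) number of voxels along each axis.
    Atoms falling outside the grid are ignored.
    """
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    nx, ny, nz = shape
    grid = [[[0] * nz for _ in range(ny)] for _ in range(nx)]
    for x, y, z in coords:
        i = math.floor((x - origin[0]) / spacing)
        j = math.floor((y - origin[1]) / spacing)
        k = math.floor((z - origin[2]) / spacing)
        if 0 <= i < nx and 0 <= j < ny and 0 <= k < nz:
            grid[i][j][k] += 1
    return grid


if __name__ == "__main__":
    # Hypothetical coordinates; a real pipeline would read them from a structure file.
    atoms = [(0.5, 0.5, 0.5), (1.5, 0.5, 0.5), (9.0, 9.0, 9.0)]
    grid = voxelize(atoms, origin=(0.0, 0.0, 0.0), spacing=1.0, shape=(2, 2, 2))
    print(grid)
```

In practice each voxel would hold several feature channels instead of a single count, giving the convolutional network a multi-channel 3D input.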


2010 ◽  
Vol 2010 ◽  
pp. 1-9 ◽  
Author(s):  
Adeel Malik ◽  
Ahmad Firoz ◽  
Vivekanand Jha ◽  
Shandar Ahmad

Understanding the three-dimensional structures of proteins that interact with carbohydrates covalently (glycoproteins) as well as noncovalently (protein-carbohydrate complexes) is essential to many biological processes and plays a significant role in normal and disease-associated functions. It is important to have a central repository of knowledge about these protein-carbohydrate complexes as well as preprocessed data of predicted structures. Such a repository can be significantly enhanced by de novo tools that predict carbohydrate-binding sites for proteins when no experimentally determined binding-site structure is available. PROCARB is an open-access database comprising three independently working components, namely, (i) the Core PROCARB module, consisting of three-dimensional structures of protein-carbohydrate complexes taken from the Protein Data Bank (PDB), (ii) the Homology Models module, consisting of manually developed three-dimensional models of N-linked and O-linked glycoproteins of unknown three-dimensional structure, and (iii) the CBS-Pred prediction module, consisting of web servers to predict carbohydrate-binding sites using a single sequence or a server-generated PSSM. Several precomputed structural and functional properties of complexes are also included in the database for quick analysis. In particular, information about function, secondary structure, solvent accessibility, hydrogen bonds, literature references, and so forth, is included. In addition, each protein in the database is mapped to UniProt, Pfam, PDB, and so forth.


2004 ◽  
Vol 48 (6) ◽  
pp. 2214-2222 ◽  
Author(s):  
Michael Korsinczky ◽  
Katja Fischer ◽  
Nanhua Chen ◽  
Joanne Baker ◽  
Karl Rieckmann ◽  
...  

Sulfadoxine is predominantly used in combination with pyrimethamine, commonly known as Fansidar, for the treatment of Plasmodium falciparum. This combination is usually less effective against Plasmodium vivax, probably due to the innate refractoriness of parasites to the sulfadoxine component. To investigate this mechanism of resistance by P. vivax to sulfadoxine, we cloned and sequenced the P. vivax dhps (pvdhps) gene. The protein sequence was determined, and three-dimensional homology models of dihydropteroate synthase (DHPS) from P. vivax as well as P. falciparum were created. The docking of sulfadoxine to the two DHPS models allowed us to compare contact residues in the putative sulfadoxine-binding site in both species. The predicted sulfadoxine-binding sites between the species differ by one residue, V585 in P. vivax, equivalent to A613 in P. falciparum. V585 in P. vivax is predicted by energy minimization to cause a reduction in binding of sulfadoxine to DHPS in P. vivax compared to P. falciparum. Sequencing dhps genes from a limited set of geographically different P. vivax isolates revealed that V585 was present in all of the samples, suggesting that V585 may be responsible for innate resistance of P. vivax to sulfadoxine. Additionally, amino acid mutations were observed in some P. vivax isolates in positions known to cause resistance in P. falciparum, suggesting that, as in P. falciparum, these mutations are responsible for acquired increases in resistance of P. vivax to sulfadoxine.

