Tripeptide loop closure: a detailed study of reconstructions based on Ramachandran distributions

2021 ◽  
Author(s):  
Timothee O'Donnell ◽  
Charles H. Robert ◽  
Frederic Cazals

Tripeptide loop closure (TLC) is a standard procedure for reconstructing protein backbone conformations by solving a zero-dimensional polynomial system that yields up to 16 solutions. In this work, we first show that multiprecision is required in a TLC solver to guarantee the existence and accuracy of solutions. We then compare the solutions produced by the TLC solver against tripeptides from the Protein Data Bank. We show that these solutions are geometrically diverse (up to 3 Å RMSD with respect to the data) and sound in terms of potential energy. Finally, we compare the Ramachandran distributions of data and reconstructions for the three amino acids. The distribution of reconstructions in the second angular space (φ2, ψ2) stands out, with a rather uniform distribution leaving a central void. We anticipate that these insights, coupled with our robust implementation in the Structural Bioinformatics Library (https://sbl.inria.fr/doc/Tripeptide_loop_closure-user-manual.html), will increase interest in TLC for structural modeling in general, and for the generation of conformations of flexible loops in particular.
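The Ramachandran angles (φ, ψ) mentioned in this abstract are backbone dihedral angles: φ is defined by the atoms C(i-1), N(i), Cα(i), C(i), and ψ by N(i), Cα(i), C(i), N(i+1). The following is a minimal sketch of how such angles can be computed from atomic coordinates. It is not taken from the paper or from the Structural Bioinformatics Library; the function names `dihedral` and `phi_psi` are hypothetical, and the coordinates in the usage note are toy values, not real protein data.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points (hypothetical helper).

    Uses the usual convention: 0 for cis, +/-180 for trans, positive when
    the far bond is rotated clockwise looking along the central bond p1->p2.
    """
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)   # central bond
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane (p0, p1, p2)
    n2 = _cross(b1, b2)  # normal of the plane (p1, p2, p3)
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    return math.degrees(math.atan2(y, x))

def phi_psi(c_prev, n, ca, c, n_next):
    """Return (phi, psi) in degrees for one residue (hypothetical helper)."""
    phi = dihedral(c_prev, n, ca, c)
    psi = dihedral(n, ca, c, n_next)
    return phi, psi
```

For instance, `dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))` evaluates to a quarter turn under this convention. Collecting `phi_psi` values over many residues gives the kind of two-dimensional distribution that a Ramachandran plot displays.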

Molecules ◽  
2020 ◽  
Vol 25 (7) ◽  
pp. 1522 ◽  
Author(s):  
Mikhail Yu. Lobanov ◽  
Ilya V. Likhachev ◽  
Oxana V. Galzitskaya

We created a new library of disordered patterns and disordered residues in the Protein Data Bank (PDB). To obtain such datasets, we clustered the PDB and obtained the groups of chains with different identities and marked disordered residues. We elaborated a new procedure for finding disordered patterns and created a new version of the library. This library includes three sets of patterns: unique patterns, patterns consisting of two kinds of amino acids, and homo-repeats. Using this database, the user can: (1) find homologues in the entire Protein Data Bank; (2) perform a statistical analysis of disordered residues in protein structures; (3) search for disordered patterns and homo-repeats; (4) search for disordered regions in different chains of the same protein; (5) download clusters of protein chains with different identity from our database and library of disordered patterns; and (6) observe 3D structure interactively using MView. A new library of disordered patterns will help improve the accuracy of predictions for residues that will be structured or unstructured in a given region.


2019 ◽  
Author(s):  
Sean M. Cascarina ◽  
Mikaela R. Elder ◽  
Eric D. Ross

Abstract

A variety of studies have suggested that low-complexity domains (LCDs) tend to be intrinsically disordered and are relatively rare within structured proteins in the Protein Data Bank (PDB). Although LCDs are often treated as a single class, we previously found that LCDs enriched in different amino acids can exhibit substantial differences in protein metabolism and function. Therefore, we wondered whether the structural conformations of LCDs are likewise dependent on which specific amino acids are enriched within each LCD. Here, we directly examined relationships between enrichment of individual amino acids and secondary structure preferences across the entire PDB proteome. Secondary structure preferences varied as a function of the identity of the amino acid enriched and its degree of enrichment. Furthermore, divergence in secondary structure profiles often occurred for LCDs enriched in physicochemically similar amino acids (e.g. valine vs. leucine), indicating that LCDs composed of related amino acids can have distinct secondary structure preferences. Comparison of LCD secondary structure preferences with numerous pre-existing secondary structure propensity scales resulted in relatively poor correlations for certain types of LCDs, indicating that these scales may not capture secondary structure preferences as sequence complexity decreases. Collectively, these observations provide a highly resolved view of structural preferences among LCDs parsed by the nature and magnitude of single amino acid enrichment.

Author Summary

The structures that proteins adopt are directly related to their amino acid sequences. Low-complexity domains (LCDs) in protein sequences are unusual regions made up of only a few different types of amino acids. Although this is the key feature that classifies sequences as LCDs, the physical properties of LCDs will differ based on the types of amino acids that are found in each domain. For example, the sequences “AAAAAAAAAA”, “EEEEEEEEEE”, and “EEKRKEEEKE” will have very different properties, even though they would all be classified as LCDs by traditional methods. In a previous study, we developed a new method to further divide LCDs into categories that more closely reflect the differences in their physical properties. In this study, we apply that approach to examine the structures of LCDs when sorted into different categories based on their amino acids. This allowed us to define relationships between the types of amino acids in the LCDs and their corresponding structures. Since protein structure is closely related to protein function, this has important implications for understanding the basic functions and properties of LCDs in a variety of proteins.
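To make the three example sequences concrete, the sketch below computes the fractional amino acid composition of a sequence and reports its most enriched residue. This is only an illustration of the idea of single amino acid enrichment, assuming a plain frequency count; it is not the authors' classification method, and the function names `composition` and `most_enriched` are hypothetical.

```python
from collections import Counter

def composition(seq):
    """Fraction of each amino acid (one-letter code) in a sequence."""
    counts = Counter(seq)
    n = len(seq)
    return {aa: c / n for aa, c in counts.items()}

def most_enriched(seq):
    """Return (amino_acid, fraction) for the most frequent residue.

    Illustrative only: a real LCD classifier would also use windowing
    and enrichment thresholds, which are not modeled here.
    """
    comp = composition(seq)
    aa = max(comp, key=comp.get)
    return aa, comp[aa]

for s in ("AAAAAAAAAA", "EEEEEEEEEE", "EEKRKEEEKE"):
    aa, frac = most_enriched(s)
    print(s, aa, round(frac, 2))
```

All three sequences are dominated by a single residue type, yet the dominant residue and its fraction differ, which is the distinction the summary draws between LCDs that share a label but not their physical properties.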


Author(s):  
Luciano Andres Abriata

Protein X-ray structures with non-corrin cobalt(II)-containing sites, either natural or substituting another native ion, were downloaded from the Protein Data Bank and explored to (i) describe which amino acids are involved in their first ligand shells and (ii) analyze cobalt(II)–donor bond lengths in comparison with previously reported target distances, CSD data and EXAFS data. The set of amino acids involved in Co(II) binding is similar to that observed for catalytic Zn(II) sites, i.e. with a large fraction of carboxylate O atoms from aspartate and glutamate and aromatic N atoms from histidine. The computed Co(II)–donor bond lengths were found to depend strongly on structure resolution, an artifact previously detected for other metal–donor distances. Small corrections are suggested for the target bond lengths to the aromatic N atoms of histidines and the O atoms of water and hydroxide. The available target distance for cysteine (Scys) is confirmed; those for backbone O and other donors remain uncertain and should be handled with caution in refinement and modeling protocols. Finally, a relationship between both Co(II)–O bond lengths in bidentate carboxylates is quantified.


2002 ◽  
Vol 58 (s1) ◽  
pp. c214-c214
Author(s):  
W. F. Bluhm ◽  
T. Battistuz ◽  
E. Clingman ◽  
N. Deshpande ◽  
W. Fleri ◽  
...  

2021 ◽  
pp. 166900
Author(s):  
Alexander Miguel Monzon ◽  
Paolo Bonato ◽  
Marco Necci ◽  
Silvio C.E. Tosatto ◽  
Damiano Piovesan

2016 ◽  
Vol 72 (10) ◽  
pp. 1110-1118 ◽  
Author(s):  
Wouter G. Touw ◽  
Bart van Beusekom ◽  
Jochem M. G. Evers ◽  
Gert Vriend ◽  
Robbie P. Joosten

Many crystal structures in the Protein Data Bank contain zinc ions in a geometrically distorted tetrahedral complex with four Cys and/or His ligands. A method is presented to automatically validate and correct these zinc complexes. Analysis of the corrected zinc complexes shows that the average Zn–Cys distances and Cys–Zn–Cys angles are a function of the number of cysteines and histidines involved. The observed trends can be used to develop more context-sensitive targets for model validation and refinement.


2018 ◽  
Vol 47 (D1) ◽  
pp. D520-D528 ◽  
Author(s):  
Stephen K Burley ◽  
Helen M Berman ◽  
Charmi Bhikadiya ◽  
Chunxiao Bi ◽  
...  
