A database of calculated solution parameters for the recently released AlphaFold predicted protein structures

Author(s):  
Emre Brookes ◽  
Mattia Rocco

Abstract Recent spectacular advances by AI programs in 3D structure predictions from protein sequences have revolutionized the field in terms of accuracy and speed. The resulting "folding frenzy" has already produced predicted protein structure databases for the entire human proteome and for those of other organisms. However, rapidly ascertaining a predicted structure's reliability based on measured properties in solution should be considered. Shape-sensitive hydrodynamic parameters such as the diffusion and sedimentation coefficients (D0t(20,w), s0(20,w)) and the intrinsic viscosity ([η]) can provide a rapid assessment of the overall likelihood of a structure, and SAXS would yield the structure-related pair-wise distance distribution function p(r) vs. r. Using the extensively validated UltraScan SOlution MOdeler (US-SOMO) suite, we have calculated from the AlphaFold structures the corresponding D0t(20,w), s0(20,w), [η], p(r) vs. r, and other parameters. Circular dichroism spectra were also computed. The resulting US-SOMO-AF database should aid in rapidly evaluating the consistency in solution of AlphaFold predicted protein structures.
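As a rough illustration of what a pair-wise distance distribution p(r) vs. r represents, the sketch below histograms all inter-point distances of a coordinate set. It is not the US-SOMO procedure: the function name, the bin width, and the toy coordinates are assumptions made for illustration, and every pair is weighted equally, whereas a SAXS-derived p(r) weights pairs by scattering contrast.

```python
import math
from itertools import combinations


def pair_distance_distribution(coords, bin_width=1.0):
    """Return (r, p): bin centres and the normalized histogram of all
    pairwise distances between the given 3D points.

    Unweighted sketch: every pair counts equally (an assumption; a real
    SAXS p(r) weights pairs by their scattering contrast).
    """
    if len(coords) < 2:
        raise ValueError("need at least two points")
    counts = {}
    for a, b in combinations(coords, 2):
        d = math.dist(a, b)            # Euclidean distance
        k = int(d // bin_width)        # index of the bin containing d
        counts[k] = counts.get(k, 0) + 1
    n_bins = max(counts) + 1
    total = sum(counts.values())
    r = [(i + 0.5) * bin_width for i in range(n_bins)]
    p = [counts.get(i, 0) / total for i in range(n_bins)]
    return r, p


# Hypothetical toy coordinates (not taken from any real structure).
toy = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
r, p = pair_distance_distribution(toy, bin_width=1.0)
```

In practice the coordinates would come from a predicted structure file, and the resulting curve would be compared with a p(r) derived from measured scattering data.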

2017 ◽  
Vol 19 (48) ◽  
pp. 32381-32388 ◽  
Author(s):  
Anna G. Matveeva ◽  
Vyacheslav M. Nekrasov ◽  
Alexander G. Maryasov

The model-free approach used does not introduce systematic distortions in the computed distance distribution function between two spins and appears to result in noise grouping in the short distance range.


1998 ◽  
Vol 54 (6) ◽  
pp. 1085-1094 ◽  
Author(s):  
Helge Weissig ◽  
Ilya N. Shindyalov ◽  
Philip E. Bourne

Databases containing macromolecular structure data provide a crystallographer with important tools for use in solving, refining and understanding the functional significance of their protein structures. Given this importance, this paper briefly summarizes past progress by outlining the features of the significant number of relevant databases developed to date. One recent database, PDB+, containing all current and obsolete structures deposited with the Protein Data Bank (PDB) is discussed in more detail. PDB+ has been used to analyze the self-consistency of the current (1 January 1998) corpus of over 7000 structures. A summary of those findings is presented (a full discussion will appear elsewhere) in the form of global and temporal trends within the data. These trends indicate that challenges exist if crystallographers are to provide the community with complete and consistent structural results in the future. It is argued that better information management practices are required to meet these challenges.


Author(s):  
Zhenlu Li ◽  
Matthias Buck

Of 20,000 or so canonical human protein sequences, as of July 2020, 6,747 proteins have had their full or partial medium- to high-resolution structures determined by x-ray crystallography or other methods. Which of these proteins dominate the protein database (the PDB) and why? In this paper, we list the 272 top protein structures based on the number of their PDB depositions. This set of proteins accounts for more than 40% of all available human PDB entries and represents past trends and the current status of protein science. We briefly discuss the relationship that some of the prominent protein structures have with protein biophysics research and mention their relevance to human diseases. The information may inspire researchers who are new to protein science, but it also provides a year-2020 snapshot of the state of protein science.


2021 ◽  
Vol 7 ◽  
Author(s):  
Castrense Savojardo ◽  
Matteo Manfredi ◽  
Pier Luigi Martelli ◽  
Rita Casadio

Solvent accessibility (SASA) is a key feature of proteins for determining their folding and stability. SASA is computed from protein structures with different algorithms, and from protein sequences with machine-learning based approaches trained on solved structures. Here we ask to what extent the solvent exposure of residues can be associated with the pathogenicity of a variation. In this way, the SASA of the wild-type residue acquires a role in the context of functional annotation of protein single-residue variations (SRVs). By mapping variations on a curated database of human protein structures, we found that residues targeted by disease-related SRVs are less accessible to solvent than residues involved in polymorphisms. The disease association is not evenly distributed among the different residue types: SRVs targeting glycine, tryptophan, tyrosine, and cysteine are more frequently disease associated than others. For all residues, the proportion of disease-related SRVs largely increases when the wild-type residue is buried and decreases when it is exposed. The extent of the increase depends on the residue type. With the aid of a predictor developed in house, based on a deep learning procedure and performing at the state of the art, we are able to confirm the above tendency by analyzing a large data set of residues subjected to variations and occurring in some 12,494 human protein sequences still lacking three-dimensional structure (derived from HUMSAVAR). Our data support the notion that solvent accessible surface area is a distinguishing property of residues that undergo variation and that pathogenicity is more frequently associated with buried residues than with exposed ones.


Parasitology ◽  
2013 ◽  
Vol 141 (2) ◽  
pp. 241-253 ◽  
Author(s):  
IVONE De ANDRADE ROSA ◽  
MARJOLLY BRIGIDO CARUSO ◽  
SILAS PESSINI RODRIGUES ◽  
REINALDO BARROS GERALDO ◽  
LUIZA WILGES KIST ◽  
...  

SUMMARY Tritrichomonas foetus is a protist that causes bovine trichomoniasis and presents a well-developed Golgi. There are very few studies concerning the Golgi in trichomonads. In this work, monoclonal antibodies were raised against the Golgi of T. foetus and used as a tool in morphologic and biochemical studies of this organelle. Among the antibodies produced, one was named mAb anti-Golgi 20.3, which specifically recognized the Golgi complex by fluorescence and electron microscopy. By immunoblotting, this antibody recognized two proteins of 60 and 66 kDa that were identified as putative beta-tubulin and adenosine triphosphatase, respectively. The mAb 20.3 also recognized the Golgi complex of Trichomonas vaginalis, a human parasite. In addition, the nucleotide coding sequences of these proteins were identified and included in the T. foetus database, and the 3D structure of the proteins was predicted. In conclusion, this study indicated: (1) adenosine triphosphatase is present in the Golgi, (2) ATPase is conserved between T. foetus and T. vaginalis, (3) there is new information concerning the nucleic acid sequences and protein structures of adenosine triphosphatase and beta-tubulin from T. foetus and (4) the mAb anti-Golgi 20.3 is a good Golgi marker and can be used in future studies.


2005 ◽  
Vol 277-279 ◽  
pp. 272-277
Author(s):  
Sung Hee Park ◽  
Keun Ho Ryu

Comparing structural similarity is a complex and computationally expensive problem. The first step toward solving it in 3D structure databases is to develop fast methods for structural similarity comparison. We therefore propose a new method of comparing structural similarity in protein structure databases by using topological patterns of proteins. In our approach, the geometry of secondary structure elements (SSEs) in 3D space is represented by spatial data types and is indexed using R-trees. Topological patterns are discovered by spatial topology relations based on the R-tree index join. An algorithm for a similarity search compares topological patterns of a query protein with those of proteins in structure databases by the intersection frequency of SSEs. Our experimental results show that our method runs three times faster than the generally known method DALITE. Our method can generate small candidate sets for more accurate alignment tools such as DALI and SSAP.
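To make the idea of a spatial topology relation between secondary structure elements concrete, the sketch below wraps each element in an axis-aligned bounding box and lists the pairs whose boxes intersect. This is only an assumed reading of the abstract, not the authors' algorithm: a brute-force pairwise test stands in for the R-tree index join, and the function names, element labels, and coordinates are all hypothetical.

```python
from itertools import combinations


def bounding_box(points):
    """Axis-aligned bounding box of 3D points as (min_corner, max_corner)."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def boxes_intersect(a, b):
    """True if two axis-aligned boxes overlap (or touch) on all three axes."""
    (amin, amax), (bmin, bmax) = a, b
    return all(amin[i] <= bmax[i] and bmin[i] <= amax[i] for i in range(3))


def intersecting_pairs(sses):
    """Name pairs of elements whose bounding boxes intersect.

    Brute-force O(n^2) comparison, used here in place of an R-tree join
    (an assumption made to keep the sketch dependency-free).
    """
    boxes = {name: bounding_box(pts) for name, pts in sses.items()}
    return [
        (m, n)
        for m, n in combinations(sorted(boxes), 2)
        if boxes_intersect(boxes[m], boxes[n])
    ]


# Hypothetical elements, each given by two end points.
toy = {
    "H1": [(0, 0, 0), (4, 1, 1)],
    "E1": [(3, 0, 0), (6, 2, 2)],
    "E2": [(10, 10, 10), (12, 11, 11)],
}
pairs = intersecting_pairs(toy)
```

A spatial index would avoid testing every pair, which matters once a whole structure database is searched rather than a single protein.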


2019 ◽  
Vol 21 (1) ◽  
pp. 213
Author(s):  
Federico Norbiato ◽  
Flavio Seno ◽  
Antonio Trovato ◽  
Marco Baiesi

Many native structures of proteins accommodate complex topological motifs such as knots, lassos, and other geometrical entanglements. How proteins can fold quickly even in the presence of such topological obstacles is a debated question in structural biology. Recently, the hypothesis that energetic frustration might be a mechanism to avoid topological frustration has been put forward based on the empirical observation that loops involved in entanglements are stabilized by weak interactions between amino acids at their extrema. To verify this idea, we use a toy lattice model for the folding of proteins into two almost identical structures, one entangled and one not. As expected, the folding time is longer when random sequences fold into the entangled structure. This also holds under an evolutionary pressure simulated by optimizing the folding time. It turns out that optimized protein sequences in the entangled structure are in fact characterized by frustrated interactions at the closures of entangled loops. This phenomenon is much less pronounced in the control case where the entanglement is not present. Our findings, which are in agreement with experimental observations, corroborate the idea that an evolutionary pressure shapes the folding funnel to avoid topological and kinetic traps.

