Validation of carbohydrate structures: not just nomenclature

2014 ◽  
Vol 70 (a1) ◽  
pp. C1481-C1481
Author(s):  
Jon Agirre ◽  
Kevin Cowtan

Despite the key implications carbohydrates have in a multitude of pathological processes, a large number of the sugar-containing structures deposited into the Protein Data Bank (PDB) show nomenclature errors [1] that persist even after the remediation of the PDB archive [2]. Here we present the results from a systematic study of the conformation and ring distortion of cyclic carbohydrate models for which structure factors have been deposited into the PDB. These models have also been scored using a real-space correlation coefficient calculated between model and experimental electron density. The results have enabled us to produce a database of well-refined carbohydrate structures for use in the framework of an automated sugar-detecting software, to be announced shortly.
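
The abstract does not say how ring conformation and distortion were quantified. One widely used description for six-membered sugar rings is the set of Cremer-Pople puckering parameters, and the sketch below is offered only as an illustration of that general technique, not as the authors' method. The function name `cremer_pople_6` and the idealized chair geometry in the example are assumptions made for this sketch.

```python
import math

def cremer_pople_6(coords):
    """Cremer-Pople puckering parameters for a six-membered ring.

    coords: six (x, y, z) tuples listed in ring order.
    Returns (Q, theta_deg, phi_deg): total puckering amplitude and
    the two polar angles. phi is ill-defined for a perfect chair.
    """
    if len(coords) != 6:
        raise ValueError("expected exactly six ring atoms")
    n = 6
    # Move the geometric centre of the ring to the origin.
    centre = [sum(p[k] for p in coords) / n for k in range(3)]
    ring = [tuple(p[k] - centre[k] for k in range(3)) for p in coords]
    # R' and R'' define the mean plane; its normal is R' x R''.
    r1 = [sum(p[k] * math.sin(2 * math.pi * j / n) for j, p in enumerate(ring))
          for k in range(3)]
    r2 = [sum(p[k] * math.cos(2 * math.pi * j / n) for j, p in enumerate(ring))
          for k in range(3)]
    nx = r1[1] * r2[2] - r1[2] * r2[1]
    ny = r1[2] * r2[0] - r1[0] * r2[2]
    nz = r1[0] * r2[1] - r1[1] * r2[0]
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    normal = (nx / norm, ny / norm, nz / norm)
    # Out-of-plane displacement of every ring atom.
    z = [sum(p[k] * normal[k] for k in range(3)) for p in ring]
    # m = 2 component (boat/twist-boat character) and m = 3 component (chair).
    a = math.sqrt(2 / n) * sum(zj * math.cos(4 * math.pi * j / n)
                               for j, zj in enumerate(z))
    b = -math.sqrt(2 / n) * sum(zj * math.sin(4 * math.pi * j / n)
                                for j, zj in enumerate(z))
    q2 = math.hypot(a, b)
    q3 = math.sqrt(1 / n) * sum((-1) ** j * zj for j, zj in enumerate(z))
    total_q = math.hypot(q2, q3)
    theta = math.degrees(math.atan2(q2, q3))
    phi = math.degrees(math.atan2(b, a)) % 360.0
    return total_q, theta, phi

# Idealized chair (hypothetical geometry): atoms on a circle of radius 1.4,
# displaced alternately by +/-0.25 along the ring normal.
chair = [(1.4 * math.cos(math.pi * j / 3),
          1.4 * math.sin(math.pi * j / 3),
          0.25 * (-1) ** j) for j in range(6)]
Q, theta, phi = cremer_pople_6(chair)
```

For a chair, theta lies at one of the poles (0° or 180°), while boat-like and twist-boat-like rings sit near the equator (theta close to 90°). A validation tool could flag rings whose parameters fall far from the expected low-energy conformation, subject to thresholds that this sketch does not attempt to define.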

2019 ◽  
Author(s):  
Sen Yao ◽  
Hunter N.B. Moseley

Abstract: High-quality three-dimensional structural data is of great value for the functional interpretation of biomacromolecules, especially proteins; however, structural quality varies greatly across the entries in the worldwide Protein Data Bank (wwPDB). Since 2008, the wwPDB has required the inclusion of structure factors with the deposition of x-ray crystallographic structures to support the independent evaluation of structures with respect to the underlying experimental data used to derive those structures. However, interpreting the discrepancies between the structural model and its underlying electron density data is difficult, since derived electron density maps use arbitrary electron density units which are inconsistent between maps from different wwPDB entries. Therefore, we have developed a method that converts electron density values into units of electrons. With this conversion, we have developed new methods that can evaluate specific regions of an x-ray crystallographic structure with respect to a physicochemical interpretation of its corresponding electron density map. We have systematically compared all deposited x-ray crystallographic protein models in the wwPDB with their underlying electron density maps, if available, and characterized the electron density in terms of expected numbers of electrons based on the structural model. The methods generated coherent evaluation metrics throughout all PDB entries with associated electron density data, which are consistent with visualization software that would normally be used for manual quality assessment. To our knowledge, this is the first attempt to derive units of electrons directly from electron density maps without the aid of the underlying structure factors. These new metrics are biochemically informative and can be extremely useful for filtering out low-quality structural regions from inclusion into systematic analyses that span large numbers of PDB entries.
Furthermore, these new metrics will improve the ability of non-crystallographers to evaluate regions of interest within PDB entries, since only the PDB structure and the associated electron density maps are needed. These new methods are available as a well-documented Python package on GitHub and the Python Package Index under a modified Clear BSD open source license.

Author summary: Electron density maps are very useful for validating the x-ray structure models in the Protein Data Bank (PDB). However, it is often daunting for non-crystallographers to use electron density maps, as it requires a lot of prior knowledge. This study provides methods that can infer chemical information solely from the electron density maps available from the PDB to interpret the electron density and electron density discrepancy values in terms of units of electrons. It also provides methods to evaluate regions of interest in terms of the number of missing or excess electrons, so that a broader audience, such as biologists or bioinformaticians, can also make better use of the electron density information available in the PDB, especially for quality control purposes.

Software and full results available at:
https://github.com/MoseleyBioinformaticsLab/pdb_eda (software on GitHub)
https://pypi.org/project/pdb-eda/ (software on PyPI)
https://pdb-eda.readthedocs.io/en/latest/ (documentation on ReadTheDocs)
https://doi.org/10.6084/m9.figshare.7994294 (code and results on FigShare)


2019 ◽  
Author(s):  
Dmytro Guzenko ◽  
Stephen K. Burley ◽  
Jose M. Duarte

Abstract: Detection of protein structure similarity is a central challenge in structural bioinformatics. Comparisons are usually performed at the polypeptide chain level; however, the functional form of a protein within the cell is often an oligomer. This fact, together with recent growth of oligomeric structures in the Protein Data Bank (PDB), demands more efficient approaches to oligomeric assembly alignment/retrieval. Traditional methods use atom-level information, which can be complicated by the presence of topological permutations within a polypeptide chain and/or subunit rearrangements. These challenges can be overcome by comparing electron density volumes directly. However, brute-force alignment of 3D data is a compute-intensive search problem. We developed a 3D Zernike moment normalization procedure to orient electron density volumes and assess similarity with unprecedented speed. Similarity searching with this approach enables real-time retrieval of proteins/protein assemblies resembling a target, from PDB or user input, together with resulting alignments (http://shape.rcsb.org).

Author summary: Protein structures possess wildly varied shapes, but patterns at different levels are frequently reused by nature. Finding and classifying these similarities is fundamental to understanding evolution. Given the continued growth in the number of known protein structures in the Protein Data Bank, the task of comparing them to find the common patterns is becoming increasingly complicated. This is especially true when considering complete protein assemblies with several polypeptide chains, where the large sizes further complicate the issue. Here we present a novel method that can detect similarity between protein shapes and that works equally fast for any size of proteins or assemblies. The method looks at proteins as volumes of density distribution, departing from what is more usual in the field: similarity assessment based on atomic coordinates and chain connectivity. A volumetric function can be decomposed with a mathematical tool known as 3D Zernike polynomials, resulting in a compact description as vectors of Zernike moments. The tool was introduced in the 1990s, when it was suggested that the moments could be normalized to be invariant to rotations without losing information. Here we demonstrate that this normalization is in fact possible and that it offers a much more accurate method for assessing similarity between shapes than previous attempts.


2012 ◽  
Vol 68 (4) ◽  
pp. 454-467 ◽  
Author(s):  
Ian J. Tickle

The commonly used validation metrics for the local agreement of a structure model with the observed electron density, namely the real-space R (RSR) and the real-space correlation coefficient (RSCC), are reviewed. It is argued that the primary goal of all validation techniques is to verify the accuracy of the model, since precision is an inherent property of the crystal and the data. It is demonstrated that the principal weakness of both of the above metrics is their inability to distinguish the accuracy of the model from its precision. Furthermore, neither of these metrics in its usual implementation indicates the statistical significance of the result. The statistical properties of electron-density maps are reviewed and an improved alternative likelihood-based metric is suggested. This leads naturally to a χ² significance test of the difference density using the real-space difference density Z score (RSZD). This is a metric purely of the local model accuracy, as required for effective model validation and structure optimization by practising crystallographers prior to submission of a structure model to the PDB. A new real-space observed density Z score (RSZO) is also proposed; this is a metric purely of the model precision, as a substitute for other precision metrics such as the B factor.
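
For readers unfamiliar with the two reviewed metrics, the following minimal sketch shows how RSR and RSCC are conventionally computed from observed and model-calculated density sampled at the same grid points around a residue. The function names and the toy density values are assumptions made for illustration; the sketch does not implement the RSZD or RSZO scores proposed in the paper, and real programs differ in how they select grid points and scale the maps.

```python
import math

def real_space_r(rho_obs, rho_calc):
    """Real-space R: sum|obs - calc| / sum|obs + calc| over grid points."""
    num = sum(abs(o - c) for o, c in zip(rho_obs, rho_calc))
    den = sum(abs(o + c) for o, c in zip(rho_obs, rho_calc))
    return num / den

def real_space_cc(rho_obs, rho_calc):
    """Real-space correlation coefficient: Pearson correlation of the
    observed and calculated density values over the same grid points."""
    n = len(rho_obs)
    mean_o = sum(rho_obs) / n
    mean_c = sum(rho_calc) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(rho_obs, rho_calc))
    var_o = sum((o - mean_o) ** 2 for o in rho_obs)
    var_c = sum((c - mean_c) ** 2 for c in rho_calc)
    return cov / math.sqrt(var_o * var_c)

# Toy values (hypothetical): the calculated density has the same shape
# as the observed density but is on twice the scale.
obs = [1.0, 2.0, 3.0, 4.0]
calc = [2.0, 4.0, 6.0, 8.0]
rsr = real_space_r(obs, calc)    # 10 / 30
rscc = real_space_cc(obs, calc)  # 1.0, since the two are linearly related
```

The toy example illustrates one practical difference between the two metrics: RSCC is unchanged by a linear rescaling of either map, whereas RSR depends on the maps being on a common scale.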


2002 ◽  
Vol 58 (4) ◽  
pp. 632-639 ◽  
Author(s):  
Vladimir G. Tsirelson

It is demonstrated that the approximate kinetic energy density calculated using the second-order gradient expansion with parameters of the multipole model fitted to experimental structure factors reproduces the main features of this quantity in a molecular or crystal position space. The use of the local virial theorem provides an appropriate derivation of approximate potential energy density and electronic energy density from the experimental (model) electron density and its derivatives. Consideration of these functions is not restricted by the critical points in the electron density and provides a comprehensive characterization of bonding in molecules and crystals.


2008 ◽  
Vol 73 (5) ◽  
pp. 608-615 ◽  
Author(s):  
Petr Kolenko ◽  
Tereza Skálová ◽  
Jan Dohnálek ◽  
Jindřich Hašek

Glycosylation of IgG-Fc plays an important role in the activation of the immune system response. Effector functions are modulated by different degrees of deglycosylation of IgG-Fc. However, the geometry of oligosaccharides covalently bound to IgG-Fc does not seem to be in good agreement with electron density in most of the structures deposited in the Protein Data Bank. Our study of the correlation between oligosaccharide geometry, connectivity, and electron density shows several discrepancies, mainly for L-fucose. Revised refinement of the two Fc-fragment structures solved at the highest resolution provides clear evidence for α-L-fucosylation instead of β-L-fucosylation, as was claimed in most of the deposited structures in the Protein Data Bank containing the Fc-fragment, and also in the original structures selected for re-refinement. Our revised refinement results in a decrease in R factors, better agreement with electron density, meaningful contacts, and acceptable geometry of L-fucose.


2008 ◽  
Vol 41 (3) ◽  
pp. 659-659 ◽  
Author(s):  
Luca Jovine ◽  
Ekaterina Morgunova ◽  
Rudolf Ladenstein

It is suggested that it would be useful if raw X-ray diffraction images could be included in data depositions with the Protein Data Bank.


2009 ◽  
Vol 65 (6) ◽  
pp. 715-723 ◽  
Author(s):  
Jacob Overgaard ◽  
Jamie A. Platts ◽  
Bo B. Iversen

Details of the complex bonding environment present in the molecular centre of an alkyne-bridged dicobalt complex have been examined using a combination of experimental and theoretical charge-density modelling for two compounds which share a central Co2C2 tetrahedral moiety as their common motif. Topological analysis of the experimental electron density illustrates the problem of separating the Co—C bond-critical points (b.c.p.s) from the intervening ring-critical point (r.c.p.), due largely to the flat nature of the electron density in the CoC2 triangles. Such a separation of critical points is immediately obtained from a topological analysis of the theoretical electron density as well as from the multipole-projected theoretical density; however, the addition of random noise to the theoretical structure factors prior to multipole modelling leads to a failure in consistently distinguishing two b.c.p.s and one r.c.p. in such close proximity within the particular environment of this Co2C2 centre.


2019 ◽  
Author(s):  
Sen Yao ◽  
Hunter N.B. Moseley

Abstract: As the number of macromolecular structures in the worldwide Protein Data Bank (wwPDB) continues to grow rapidly, more attention is being paid to the quality of its data, especially for use in aggregated structural and dynamics analyses. In this study, we systematically analyzed 3.5 Å regions around all metal ions across all PDB entries with supporting electron density maps available from the PDB in Europe. All resulting metal ion-centric regions were evaluated with respect to four quality-control criteria involving electron density resolution, atom occupancy, symmetry atom exclusion, and regional electron density discrepancy. The resulting list of metal binding sites passing all four criteria possesses high regional structural quality and should be beneficial to a wide variety of downstream analyses. This study demonstrates an approach for the pan-PDB evaluation of metal binding site structural quality with respect to underlying x-ray crystallographic experimental data represented in available electron density maps of proteins. For non-crystallographers in particular, we hope to change the focus and discussion of structural quality from a global evaluation to a regional evaluation, since all structural entries in the wwPDB appear to have regions of both high and low structural quality.

