A mosaic bulk-solvent model improves density maps and the fit between model and data

Bulk solvent is a major component of bio-macromolecular crystals and therefore contributes significantly to diffraction intensities. Accurate modeling of the bulk-solvent region has been recognized as important for many crystallographic calculations, from computing R-factors and density maps to model building and refinement. Owing to its simplicity and its computational and modeling power, the flat (mask-based) bulk-solvent model introduced by Jiang & Brunger (1994) is used by most modern crystallographic software packages to account for disordered solvent. In this manuscript we describe further developments of the mask-based model that improve the fit between the model and the data and aid in map interpretation. The new algorithm, here referred to as the mosaic bulk-solvent model, considers solvent variation across the unit cell. The mosaic model is implemented in the Computational Crystallography Toolbox (cctbx) and can be used in Phenix in most contexts where accounting for bulk solvent is required. It has been optimized and validated using a sufficiently large subset of the Protein Data Bank entries that have crystallographic data available.
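As a rough illustration of what "accounting for bulk solvent" means numerically, the sketch below combines atomic-model and solvent-mask structure factors and scores the result with an R-factor. The flat form F_model = k_total (F_calc + k_mask F_mask) is the commonly used mask-based expression; the mosaic form shown here, with one mask contribution and one scale per solvent region, is only an assumption about how region-wise solvent variation could be written and is not taken from the abstract. All function names are hypothetical.

```python
def flat_model(f_calc, f_mask, k_total, k_mask):
    """Flat mask-based model: F_model = k_total * (F_calc + k_mask * F_mask)."""
    return [k_total * (fc + k_mask * fm) for fc, fm in zip(f_calc, f_mask)]


def mosaic_model(f_calc, f_masks, k_total, k_masks):
    """Assumed mosaic form: each solvent region j has its own mask
    structure factors f_masks[j] and its own scale k_masks[j]."""
    out = []
    for i, fc in enumerate(f_calc):
        solvent = sum(k * fm[i] for k, fm in zip(k_masks, f_masks))
        out.append(k_total * (fc + solvent))
    return out


def r_factor(f_obs, f_model):
    """R = sum | |F_obs| - |F_model| | / sum |F_obs| over all reflections."""
    num = sum(abs(fo - abs(fm)) for fo, fm in zip(f_obs, f_model))
    return num / sum(f_obs)


# Toy data: two reflections, complex structure factors (illustrative values only).
f_calc = [10 + 0j, 0 + 5j]
f_mask = [-2 + 0j, 0 - 1j]
f_obs = [10.0, 5.0]

flat = flat_model(f_calc, f_mask, k_total=1.0, k_mask=0.5)
print(r_factor(f_obs, flat))
```

With equal scales for every region, the assumed mosaic form reduces to the flat model; letting the scales differ is what would allow the solvent contribution to vary across the unit cell.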

Use of Patterson-based methods automatically to determine the structures of heavy-atom-containing proteins with up to 6000 non-hydrogen atoms in the asymmetric unit

Journal of Applied Crystallography ◽

10.1107/s0021889806028548 ◽

2006 ◽

Vol 39 (5) ◽

pp. 728-734 ◽

Cited By ~ 6

Author(s):

Maria Cristina Burla ◽

Rocco Caliandro ◽

Benedetta Carrozzini ◽

Giovanni Luca Cascarano ◽

Liberato De Caro ◽

...

Keyword(s):

Crystal Structures ◽

Protein Data Bank ◽

Heavy Atom ◽

Data Bank ◽

Atomic Resolution ◽

Asymmetric Unit ◽

Hydrogen Atoms ◽

Heavy Atoms ◽

Density Maps ◽

Resolution Data

The Patterson superposition methods described by Burla et al. [J. Appl. Cryst. (2006), 39, 527–535], based on the use of the `multiple implication functions', have been enriched by supplementary filtering techniques based on some general (resolution-dependent) features of both the Patterson and the electron density maps. The method has been implemented in a modified version of the program SIR2004 and tested using a set of 20 crystal structures selected from the Protein Data Bank, having a number of non-hydrogen atoms in the asymmetric unit larger than 2000, atomic resolution data and some heavy atoms (equal to or heavier than Ca). The new phasing procedure is able to solve most of the test structures, among which there are two proteins with more than 6000 non-hydrogen atoms in the asymmetric unit, so extending by far the complexity today commonly considered as the limit for Patterson-based methods (i.e. about 2000 non-hydrogen atoms).

A molecular graphics suite of programs for a microcomputer to display molecules from Cambridge crystallographic data files and the alpha-carbon backbone of proteins from Protein Data Bank crystal files

Computers & Chemistry ◽

10.1016/0097-8485(88)85007-1 ◽

1988 ◽

Vol 12 (1) ◽

pp. 65-82 ◽

Cited By ~ 4

Author(s):

Brian Clarke

Keyword(s):

Protein Data Bank ◽

Data Bank ◽

Crystallographic Data ◽

Molecular Graphics ◽

Carbon Backbone ◽

Data Files ◽

Alpha Carbon

A chemical interpretation of protein electron density maps in the worldwide protein data bank

10.1101/613109 ◽

2019 ◽

Cited By ~ 3

Author(s):

Sen Yao ◽

Hunter N.B. Moseley

Keyword(s):

Protein Data Bank ◽

Electron Density ◽

Structural Model ◽

Data Bank ◽

X Ray ◽

New Methods ◽

Link Type ◽

Density Maps ◽

Structure Factors ◽

Python Package

High-quality three-dimensional structural data is of great value for the functional interpretation of biomacromolecules, especially proteins; however, structural quality varies greatly across the entries in the worldwide Protein Data Bank (wwPDB). Since 2008, the wwPDB has required the inclusion of structure factors with the deposition of x-ray crystallographic structures to support the independent evaluation of structures with respect to the underlying experimental data used to derive those structures. However, interpreting the discrepancies between the structural model and its underlying electron density data is difficult, since derived electron density maps use arbitrary electron density units which are inconsistent between maps from different wwPDB entries. Therefore, we have developed a method that converts electron density values into units of electrons. With this conversion, we have developed new methods that can evaluate specific regions of an x-ray crystallographic structure with respect to a physicochemical interpretation of its corresponding electron density map. We have systematically compared all deposited x-ray crystallographic protein models in the wwPDB with their underlying electron density maps, if available, and characterized the electron density in terms of expected numbers of electrons based on the structural model. The methods generated coherent evaluation metrics throughout all PDB entries with associated electron density data, which are consistent with visualization software that would normally be used for manual quality assessment. To our knowledge, this is the first attempt to derive units of electrons directly from electron density maps without the aid of the underlying structure factors. These new metrics are biochemically informative and can be extremely useful for filtering out low-quality structural regions from inclusion into systematic analyses that span large numbers of PDB entries. Furthermore, these new metrics will improve the ability of non-crystallographers to evaluate regions of interest within PDB entries, since only the PDB structure and the associated electron density maps are needed. These new methods are available as a well-documented Python package on GitHub and the Python Package Index under a modified Clear BSD open source license.

Author summary: Electron density maps are very useful for validating the x-ray structure models in the Protein Data Bank (PDB). However, it is often daunting for non-crystallographers to use electron density maps, as it requires a lot of prior knowledge. This study provides methods that can infer chemical information solely from the electron density maps available from the PDB to interpret the electron density and electron density discrepancy values in terms of units of electrons. It also provides methods to evaluate regions of interest in terms of the number of missing or excess electrons, so that a broader audience, such as biologists or bioinformaticians, can also make better use of the electron density information available in the PDB, especially for quality control purposes.

Software and full results are available at:
https://github.com/MoseleyBioinformaticsLab/pdb_eda (software on GitHub)
https://pypi.org/project/pdb-eda/ (software on PyPI)
https://pdb-eda.readthedocs.io/en/latest/ (documentation on ReadTheDocs)
https://doi.org/10.6084/m9.figshare.7994294 (code and results on FigShare)
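The idea of expressing arbitrary-unit density in electrons can be pictured with a small sketch. It assumes, purely for illustration, that a per-map conversion ratio is estimated as the median of (summed density around an atom) / (electrons expected for that atom), and that any other density sum is then divided by this ratio. The function names and numbers are hypothetical and do not reproduce the pdb_eda API or its actual procedure.

```python
from statistics import median


def density_electron_ratio(atom_density_sums, atom_electrons):
    """Median of per-atom (density sum / expected electrons).

    The median is used so that a few poorly modeled atoms do not
    dominate the estimate (an assumption of this sketch)."""
    ratios = [d / e for d, e in zip(atom_density_sums, atom_electrons)]
    return median(ratios)


def to_electrons(density_sum, ratio):
    """Convert a summed density value (arbitrary units) into electrons."""
    return density_sum / ratio


# Toy example: C, N and O atoms with 6, 7 and 8 electrons.
ratio = density_electron_ratio([3.0, 3.5, 4.0], [6, 7, 8])

# A negative difference-density blob, read as a number of missing electrons.
print(to_electrons(-1.0, ratio))
```

Under these toy numbers a difference-density sum of -1.0 corresponds to two missing electrons, which is the kind of statement the abstract describes as accessible to non-crystallographers.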

Homology-based loop modeling yields more complete crystallographic protein structures

IUCrJ ◽

10.1107/s2052252518010552 ◽

2018 ◽

Vol 5 (5) ◽

pp. 585-594 ◽

Cited By ~ 14

Author(s):

Bart van Beusekom ◽

Krista Joosten ◽

Maarten L. Hekkelman ◽

Robbie P. Joosten ◽

Anastassis Perrakis

Keyword(s):

Protein Function ◽

Model Building ◽

Protein Structures ◽

Structural Models ◽

Data Bank ◽

Loop Modeling ◽

X Ray ◽

Density Maps ◽

Complete Protein ◽

Automated Procedures

Inherent protein flexibility, poor or low-resolution diffraction data or poorly defined electron-density maps often inhibit the building of complete structural models during X-ray structure determination. However, recent advances in crystallographic refinement and model building often allow completion of previously missing parts. This paper presents algorithms that identify regions missing in a certain model but present in homologous structures in the Protein Data Bank (PDB), and `graft' these regions of interest. These new regions are refined and validated in a fully automated procedure. Including these developments in the PDB-REDO pipeline has enabled the building of 24 962 missing loops in the PDB. The models and the automated procedures are publicly available through the PDB-REDO databank and webserver. More complete protein structure models enable a higher quality public archive but also a better understanding of protein function, better comparison between homologous structures and more complete data mining in structural bioinformatics projects.

Bond-valence analyses of the crystal structures of FeMo/V cofactors in FeMo/V proteins

Acta Crystallographica Section D Structural Biology ◽

10.1107/s2059798320003952 ◽

2020 ◽

Vol 76 (5) ◽

pp. 428-437 ◽

Cited By ~ 1

Author(s):

Wan-Ting Jin ◽

Min Yang ◽

Shuang-Shuang Zhu ◽

Zhao-Hui Zhou

Keyword(s):

Error Analysis ◽

Crystal Structures ◽

Protein Data Bank ◽

Data Bank ◽

Bond Valence ◽

Crystallographic Data ◽

Data Sets ◽

Electron Configuration ◽

The Individual ◽

Bond Valence Method

The bond-valence method has been used for valence calculations of FeMo/V cofactors in FeMo/V proteins using 51 crystallographic data sets of FeMo/V proteins from the Protein Data Bank. The calculations show molybdenum(III) to be present in MoFe7S9C(Cys)(HHis)[R-(H)homocit] (where H4homocit is homocitric acid, HCys is cysteine and HHis is histidine) in FeMo cofactors, while vanadium(III) with a more reduced iron complement is obtained for FeV cofactors. Using an error analysis of the calculated valences, it was found that in FeMo cofactors Fe1, Fe6 and Fe7 can be unambiguously assigned as iron(III), while Fe2, Fe3, Fe4 and Fe5 show different degrees of mixed valences for the individual Fe atoms. For the FeV cofactors in PDB entry 5n6y, Fe4, Fe5 and Fe6 correspond to iron(II), iron(II) and iron(III), respectively, while Fe1, Fe2, Fe3 and Fe7 exhibit strongly mixed valences. Special situations such as CO-bound and selenium-substituted FeMo cofactors and O(N)H-bridged FeV cofactors are also discussed and suggest rearrangement of the electron configuration on the substitution of the bridging S atoms.
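For readers unfamiliar with the bond-valence method, a minimal sketch follows. It uses the standard empirical relation s = exp((R0 - R) / b) with b = 0.37 Å and sums the bond valences around one atom. The reference length `r0` used in the example is a placeholder chosen for illustration, not a tabulated Fe-S or Mo-S parameter, and the function names are hypothetical.

```python
import math


def bond_valence(distance, r0, b=0.37):
    """Valence of one bond: s = exp((r0 - distance) / b), distances in Å."""
    return math.exp((r0 - distance) / b)


def bond_valence_sum(distances, r0, b=0.37):
    """Estimated oxidation state: sum of bond valences around the atom."""
    return sum(bond_valence(d, r0, b) for d in distances)


# Placeholder reference length (NOT a tabulated parameter).
R0_PLACEHOLDER = 2.15

# Three bonds exactly at the reference length contribute one unit each.
print(bond_valence_sum([2.15, 2.15, 2.15], R0_PLACEHOLDER))

# Longer bonds contribute less, so the estimated valence drops.
print(bond_valence_sum([2.30, 2.30, 2.30], R0_PLACEHOLDER))
```

Because the relation is exponential in the bond length, small coordinate errors shift the calculated valence noticeably, which is why an error analysis of the kind mentioned in the abstract matters when assigning oxidation states.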

Improved low-resolution crystallographic refinement with Phenix and Rosetta

Acta Crystallographica Section A Foundations and Advances ◽

10.1107/s2053273314092171 ◽

2014 ◽

Vol 70 (a1) ◽

pp. C782-C782

Author(s):

Nathaniel Echols ◽

Frank DiMaio ◽

Jeffrey Headd ◽

Thomas Terwilliger ◽

David Baker ◽

...

Keyword(s):

Structure Prediction ◽

Model Building ◽

New Method ◽

Low Resolution ◽

Final Model ◽

Software Packages ◽

Advanced Sampling ◽

Model Geometry ◽

Improved Model ◽

R Factors

Refinement of macromolecular structures against low-resolution crystallographic data is limited by the ability of current methods to arrive at a high-quality structure with realistic geometry. We have developed a new method for crystallographic refinement which combines the Rosetta sampling methodology and all-atom energy function with likelihood-based reciprocal-space refinement in Phenix. On a test set of difficult low-resolution refinement cases, models refined with the new method have significantly improved model geometry and, in most cases, lower free R factors and RMS deviation to the final model. Integration of the software packages additionally makes advanced sampling methods used in structure prediction and design available for crystallographic refinement and model building, and also provides a strategy for improving the Rosetta force field for better agreement with experimental data.

Making glycoproteins a little bit sweeter with PDB-REDO

Acta Crystallographica Section F Structural Biology Communications ◽

10.1107/s2053230x18004016 ◽

2018 ◽

Vol 74 (8) ◽

pp. 463-472 ◽

Cited By ~ 8

Author(s):

Bart van Beusekom ◽

Thomas Lütteke ◽

Robbie P. Joosten

Keyword(s):

Experimental Data ◽

Amino Acid ◽

Protein Data Bank ◽

Model Building ◽

Data Bank ◽

Structure Model ◽

Amino Acid Residues ◽

Post Translational Modification ◽

High Quality ◽

Glycoprotein Structure

Glycosylation is one of the most common forms of protein post-translational modification, but is also the most complex. Dealing with glycoproteins in structure model building, refinement, validation and PDB deposition is more error-prone than dealing with nonglycosylated proteins owing to limitations of the experimental data and available software tools. Also, experimentalists are typically less experienced in dealing with carbohydrate residues than with amino-acid residues. The results of the reannotation and re-refinement by PDB-REDO of 8114 glycoprotein structure models from the Protein Data Bank are analyzed. The positive aspects of 3620 reannotations and subsequent refinement, as well as the remaining challenges to obtaining consistently high-quality carbohydrate models, are discussed.

A chemical interpretation of protein electron density maps in the worldwide protein data bank

PLoS ONE ◽

10.1371/journal.pone.0236894 ◽

2020 ◽

Vol 15 (8) ◽

pp. e0236894

Author(s):

Sen Yao ◽

Hunter N. B. Moseley

Keyword(s):

Protein Data Bank ◽

Electron Density ◽

Data Bank ◽

Density Maps

Interactions of Aromatic Residues in Amyloids: A Survey of Protein Data Bank Crystallographic Data

Crystal Growth & Design ◽

10.1021/acs.cgd.7b01035 ◽

2017 ◽

Vol 17 (12) ◽

pp. 6353-6362 ◽

Cited By ~ 7

Author(s):

Ivana M. Stanković ◽

Dragana M. Božinovski ◽

Edward N. Brothers ◽

Milivoj R. Belić ◽

Michael B. Hall ◽

...

Keyword(s):

Protein Data Bank ◽

Data Bank ◽

Crystallographic Data ◽

Aromatic Residues

Finding high-quality metal ion-centric regions across the worldwide Protein Data Bank

10.1101/619809 ◽

2019 ◽

Author(s):

Sen Yao ◽

Hunter N.B. Moseley

Keyword(s):

Protein Data Bank ◽

Electron Density ◽

Metal Binding ◽

Metal Ion ◽

Data Bank ◽

Structural Quality ◽

Global Evaluation ◽

Density Maps ◽

Quality Control Criteria ◽

Control Criteria

As the number of macromolecular structures in the worldwide Protein Data Bank (wwPDB) continues to grow rapidly, more attention is being paid to the quality of its data, especially for use in aggregated structural and dynamics analyses. In this study, we systematically analyzed 3.5 Å regions around all metal ions across all PDB entries with supporting electron density maps available from the PDB in Europe. All resulting metal ion-centric regions were evaluated with respect to four quality-control criteria involving electron density resolution, atom occupancy, symmetry atom exclusion, and regional electron density discrepancy. The metal binding sites that pass all four criteria possess high regional structural quality and should be beneficial to a wide variety of downstream analyses. This study demonstrates an approach for the pan-PDB evaluation of metal binding site structural quality with respect to underlying x-ray crystallographic experimental data represented in available electron density maps of proteins. For non-crystallographers in particular, we hope to change the focus and discussion of structural quality from a global evaluation to a regional evaluation, since all structural entries in the wwPDB appear to have regions of both high and low structural quality.
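The region selection and four-way filter described above can be sketched as follows. The 3.5 Å radius and the four criteria come from the abstract; every threshold value, the `Atom` class and the function names are assumptions made for illustration and are not the cut-offs or code used in the study.

```python
import math
from dataclasses import dataclass


@dataclass
class Atom:
    name: str
    xyz: tuple                    # Cartesian coordinates in Å
    occupancy: float = 1.0
    from_symmetry: bool = False   # True if generated by a symmetry operation


def region_atoms(metal, atoms, radius=3.5):
    """Atoms (other than the metal itself) within `radius` Å of the metal."""
    return [a for a in atoms
            if a is not metal and math.dist(a.xyz, metal.xyz) <= radius]


def passes_all_criteria(metal, atoms, resolution, discrepancy,
                        max_resolution=2.5,    # hypothetical threshold (Å)
                        min_occupancy=1.0,     # hypothetical threshold
                        max_discrepancy=0.1):  # hypothetical threshold
    """Apply the four criteria to one metal ion-centric region."""
    region = region_atoms(metal, atoms)
    return (
        resolution <= max_resolution                            # resolution
        and metal.occupancy >= min_occupancy                    # occupancy
        and all(a.occupancy >= min_occupancy for a in region)
        and not any(a.from_symmetry for a in region)            # symmetry atoms
        and discrepancy <= max_discrepancy                      # density discrepancy
    )


zn = Atom("ZN", (0.0, 0.0, 0.0))
near = Atom("N", (2.0, 0.0, 0.0))
far = Atom("O", (4.0, 0.0, 0.0), occupancy=0.5)   # outside the 3.5 Å region
print(passes_all_criteria(zn, [zn, near, far], resolution=2.0, discrepancy=0.05))
```

Evaluating each region separately, rather than the entry as a whole, is what lets a single PDB entry contribute its well-determined metal sites even when other parts of the model are poorly supported.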
