Prediction of models for ordered solvent in macromolecular structures by a classifier based upon resolution-independent projections of local feature data

2019 ◽  
Vol 75 (8) ◽  
pp. 696-717
Author(s):  
Laurel Jones ◽  
Michael Tynes ◽  
Paul Smith

Current software tools for the automated building of models for macromolecular X-ray crystal structures are capable of assembling high-quality models for ordered macromolecule and small-molecule scattering components with minimal or no user supervision. Many of these tools also incorporate robust functionality for modelling the ordered water molecules that are found in nearly all macromolecular crystal structures. However, no current tools focus on differentiating these ubiquitous water molecules from other frequently occurring multi-atom solvent species, such as sulfate, or the automated building of models for such species. PeakProbe has been developed specifically to address the need for such a tool. PeakProbe predicts likely solvent models for a given point (termed a 'peak') in a structure based on analysis ('probing') of its local electron density and chemical environment. PeakProbe maps a total of 19 resolution-dependent features associated with electron density and two associated with the local chemical environment to a two-dimensional score space that is independent of resolution. Peaks are classified based on the relative frequencies with which four different classes of solvent (including water) are observed within a given region of this score space as determined by large-scale sampling of solvent models in the Protein Data Bank. Designed to classify peaks generated from difference density maxima, PeakProbe also incorporates functionality for identifying peaks associated with model errors or clusters of peaks likely to correspond to multi-atom solvent, and for the validation of existing solvent models using solvent-omit electron-density maps. When tasked with classifying peaks into one of four distinct solvent classes, PeakProbe achieves greater than 99% accuracy for both peaks derived directly from the atomic coordinates of existing solvent models and those based on difference density maxima.
While the program is still under development, a fully functional version is publicly available. PeakProbe makes extensive use of cctbx libraries, and requires a PHENIX licence and an up-to-date phenix.python environment for execution.
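
The frequency-based classification described in this abstract can be illustrated with a minimal sketch. It is not PeakProbe's actual implementation: the grid binning, the bin width, the function names (`bin_of`, `build_frequency_table`, `classify`) and the class labels are all assumptions chosen for illustration. The sketch assumes that each peak has already been projected to a two-dimensional score, and it assigns the class observed most often among labelled training peaks that fall in the same grid cell.

```python
from collections import Counter, defaultdict


def bin_of(score, bin_width=0.5):
    """Map a 2D score to integer grid-cell indices (hypothetical binning)."""
    x, y = score
    return (int(x // bin_width), int(y // bin_width))


def build_frequency_table(training, bin_width=0.5):
    """Count how often each solvent class is observed in each grid cell.

    `training` is an iterable of ((x, y), label) pairs, standing in for
    scores sampled from existing solvent models.
    """
    table = defaultdict(Counter)
    for score, label in training:
        table[bin_of(score, bin_width)][label] += 1
    return table


def classify(table, score, bin_width=0.5):
    """Return (most frequent class, relative frequencies) for one peak.

    Returns (None, {}) when no training peak fell in the same cell.
    """
    counts = table.get(bin_of(score, bin_width))
    if not counts:
        return None, {}
    total = sum(counts.values())
    freqs = {label: n / total for label, n in counts.items()}
    best = max(freqs, key=freqs.get)
    return best, freqs


# Toy training data; the labels are illustrative only.
training = [
    ((0.1, 0.1), "water"),
    ((0.2, 0.3), "water"),
    ((0.4, 0.2), "sulfate"),
    ((1.2, 1.3), "sulfate"),
]
table = build_frequency_table(training)
label, freqs = classify(table, (0.3, 0.3))
```

In this toy example the query score shares a cell with two water peaks and one sulfate peak, so `label` is `"water"`. A real classifier would need far denser sampling and some handling of sparsely populated regions, which this sketch simply reports as `None`.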

2014 ◽  
Vol 70 (10) ◽  
pp. 2533-2543 ◽  
Author(s):  
Thomas C. Terwilliger ◽  
Gerard Bricogne

Accurate crystal structures of macromolecules are of high importance in the biological and biomedical fields. Models of crystal structures in the Protein Data Bank (PDB) are in general of very high quality as deposited. However, methods for obtaining the best model of a macromolecular structure from a given set of experimental X-ray data continue to progress at a rapid pace, making it possible to improve most PDB entries after their deposition by re-analyzing the original deposited data with more recent software. This possibility represents a very significant departure from the situation that prevailed when the PDB was created, when it was envisioned as a cumulative repository of static contents. A radical paradigm shift for the PDB is therefore proposed, away from the static archive model towards a much more dynamic body of continuously improving results in symbiosis with continuously improving methods and software. These simultaneous improvements in methods and final results are made possible by the current deposition of processed crystallographic data (structure-factor amplitudes) and will be supported further by the deposition of raw data (diffraction images). It is argued that it is both desirable and feasible to carry out small-scale and large-scale efforts to make this paradigm shift a reality. Small-scale efforts would focus on optimizing structures that are of interest to specific investigators. Large-scale efforts would undertake a systematic re-optimization of all of the structures in the PDB, or alternatively the redetermination of groups of structures that are either related to or focused on specific questions. All of the resulting structures should be made generally available, along with the precursor entries, with various views of the structures being made available depending on the types of questions that users are interested in answering.


2008 ◽  
Vol 41 (4) ◽  
pp. 761-767 ◽  
Author(s):  
Eric N. Brown

Atomic structures of proteins determined via protein crystallography contain numerous solvent atoms. The experimental data for the determination of a water molecule's O-atom position is often a small contained blob of unidentified electron density. Unfortunately, the nature of crystallographic refinement lets poorly placed solvent atoms bias the future refined positions of all atoms in the crystal structure. This research article presents the technique of omit-maps applied to remove the bias introduced by poorly determined solvent atoms, enabling the identification of incorrectly placed water molecules in partially refined crystal structures. A total of 160 protein crystal structures with 45 912 distinct water molecules were processed using this technique. Most of the water molecules in the deposited structures were well justified. However, a few of the solvent atoms in this test data set changed appreciably in position, displacement parameter or electron density when fitted to the solvent omit-map, raising questions about how much experimental support exists for these solvent atoms.


2015 ◽  
Vol 71 (11) ◽  
pp. 2203-2216 ◽  
Author(s):  
Munenori Furuse ◽  
Jun Tamogami ◽  
Toshiaki Hosaka ◽  
Takashi Kikukawa ◽  
Naoko Shinya ◽  
...  

Although many crystal structures of microbial rhodopsins have been solved, those with sufficient resolution to identify the functional water molecules are very limited. In this study, the Acetabularia rhodopsin I (ARI) protein derived from the marine alga A. acetabulum was synthesized on a large scale by the Escherichia coli cell-free membrane-protein production method, and crystal structures of ARI were determined at the second highest (1.52–1.80 Å) resolution for a microbial rhodopsin, following bacteriorhodopsin (BR). Examinations of the photochemical properties of ARI revealed that the photocycle of ARI is slower than that of BR and that its proton-transfer reactions are different from those of BR. In the present structures, a large cavity containing numerous water molecules exists on the extracellular side of ARI, explaining the relatively low pKa of Glu206ARI, which cannot function as an initial proton-releasing residue at any pH. An interhelical hydrogen bond exists between Leu97ARI and Tyr221ARI on the cytoplasmic side, which facilitates the slow photocycle and regulates the pKa of Asp100ARI, a potential proton donor to the Schiff base, in the dark state.


IUCrJ ◽  
2014 ◽  
Vol 1 (3) ◽  
pp. 179-193 ◽  
Author(s):  
Zbigniew Dauter ◽  
Alexander Wlodawer ◽  
Wladek Minor ◽  
Mariusz Jaskolski ◽  
Bernhard Rupp

Whereas the vast majority of the more than 85 000 crystal structures of macromolecules currently deposited in the Protein Data Bank are of high quality, some suffer from a variety of imperfections. Although this fact has been pointed out in the past, it is still worth periodic updates so that the metadata obtained by global analysis of the available crystal structures, as well as the utilization of the individual structures for tasks such as drug design, should be based on only the most reliable data. Here, selected abnormal deposited structures have been analysed based on the Bayesian reasoning that the correctness of a model must be judged against both the primary evidence as well as prior knowledge. These structures, as well as information gained from the corresponding publications (if available), have emphasized some of the most prevalent types of common problems. The errors are often perfect illustrations of the nature of human cognition, which is frequently influenced by preconceptions that may lead to fanciful results in the absence of proper validation. Common errors can be traced to negligence and a lack of rigorous verification of the models against electron density, creation of non-parsimonious models, generation of improbable numbers, application of incorrect symmetry, illogical presentation of the results, or violation of the rules of chemistry and physics. Paying more attention to such problems, not only in the final validation stages but during the structure-determination process as well, is necessary not only in order to maintain the highest possible quality of the structural repositories and databases but most of all to provide a solid basis for subsequent studies, including large-scale data-mining projects. 
For many scientists PDB deposition is a rather infrequent event, so the need for proper training and supervision is emphasized, as well as the need for constant alertness of reason and critical judgment as absolutely necessary safeguarding measures against such problems. Ways of identifying more problematic structures are suggested so that their users may be properly alerted to their possible shortcomings.


1990 ◽  
Vol 68 (8) ◽  
pp. 1277-1282 ◽  
Author(s):  
Ivor Wharf ◽  
Michel G. Simard ◽  
Henry Lamparski

Tetrakis(p-methylsulphonylphenyl)tin(IV) and tetrakis(p-methylsulphinylphenyl)tin(IV) n-hydrate have been prepared and their spectra (ir 1350–400 cm⁻¹; nmr, ¹H, ¹³C, ¹¹⁹Sn) and X-ray crystal structures are reported. The first compound is monoclinic, space group C2/c, Z = 4, with a = 21.589(6), b = 6.207(3), c = 22.861(11) Å, β = 93.80(3)° (22 °C); the structure was solved by the direct method and refined by full-matrix least squares calculations to R = 0.043 for 2755 observed reflections. It has 2 molecular symmetry with the methyl group and one oxygen atom completely disordered in both CH3S(O2) groups in the asymmetric unit. The second compound is tetragonal, space group P4₂/n, Z = 2, with a = b = 15.408(6), c = 6.379(2) Å (−100 °C); the structure was solved by the Patterson method and refined by full-matrix least squares calculations to R = 0.060 for 1209 observed reflections. It has [Formula: see text] molecular symmetry with the whole asymmetric unit disordered. Water molecules occupy positions on parallel 4₂ axes but molecular packing requirements prevent all sites having 100% occupancy, giving n ~ 1 for the hydrate. Keywords: Tetra-aryltins, crystal structures, sulphone, sulphoxide, hydrogen-bonding.


1995 ◽  
Vol 50 (5) ◽  
pp. 828-832 ◽  
Author(s):  
Joachim Pickardt ◽  
Isabella Hoffmeister

Crystals of both complexes were obtained by evaporation of the ethanol solvent. The crystals of [{CuCl(C10N4H24)}2][CdCl4] are tetragonal, space group I4̄2d, Z = 4, a = b = 1784.1(11), c = 1101.1(8) pm. Each copper atom is bonded to one cyclam ligand and two chlorine atoms which are acting as bridging ligands and connect the copper atoms to chains of distorted octahedra. Distorted tetrahedra of CdCl4 are situated in cavities between these chains. The crystals of [Cu(C10N4H24)][CdCl3(H2O)2]Cl are monoclinic (b), space group C2/c, Z = 4, a = 1581.9(8), b = 1323.3(7), c = 924.0(5) pm, β = 94.31(5)°. Cadmium is coordinated to four chlorine atoms and two water molecules, while all of the chlorine atoms act as bridging ligands connecting every cadmium atom to two adjacent cadmium atoms and to two copper atoms which lie in plane with the N atoms.


2018 ◽  
Vol 33 (2) ◽  
pp. 98-107 ◽  
Author(s):  
James A. Kaduk

The crystal structures of calcium citrate hexahydrate, calcium citrate tetrahydrate, and anhydrous calcium citrate have been solved using laboratory and synchrotron X-ray powder diffraction data, and optimized using density functional techniques. Both the hexahydrate and tetrahydrate structures are characterized by layers of edge-sharing Ca coordination polyhedra, including triply chelated Ca. An additional isolated Ca is coordinated by water molecules, and two uncoordinated water molecules occur in the hexahydrate structure. The previously reported polymorph of the tetrahydrate contains the same layers, but only two H2O coordinated to the isolated Ca and two uncoordinated water molecules. Anhydrous calcium citrate has a three-dimensional network structure of Ca coordination polyhedra. The new polymorph of calcium citrate tetrahydrate is the major crystalline phase in several commercial calcium supplements.


2021 ◽  
Author(s):  
Gregory Wagner ◽  
Andre Souza ◽  
Adeline Hillier ◽  
Ali Ramadhan ◽  
Raffaele Ferrari

Parameterizations of turbulent mixing in the ocean surface boundary layer (OSBL) are key Earth System Model (ESM) components that modulate the communication of heat and carbon between the atmosphere and ocean interior. OSBL turbulence parameterizations are formulated in terms of unknown free parameters estimated from observational or synthetic data. In this work we describe the development and use of a synthetic dataset called the “LESbrary” generated by a large number of idealized, high-fidelity, limited-area large eddy simulations (LES) of OSBL turbulent mixing. We describe how the LESbrary design leverages a detailed understanding of OSBL conditions derived from observations and large scale models to span the range of realistically diverse physical scenarios. The result is a diverse library of well-characterized “synthetic observations” that can be readily assimilated for the calibration of realistic OSBL parameterizations in isolation from other ESM model components. We apply LESbrary data to calibrate free parameters, develop prior estimates of parameter uncertainty, and evaluate model errors in two OSBL parameterizations for use in predictive ESMs.


2016 ◽  
Vol 72 (10) ◽  
pp. 1110-1118 ◽  
Author(s):  
Wouter G. Touw ◽  
Bart van Beusekom ◽  
Jochem M. G. Evers ◽  
Gert Vriend ◽  
Robbie P. Joosten

Many crystal structures in the Protein Data Bank contain zinc ions in a geometrically distorted tetrahedral complex with four Cys and/or His ligands. A method is presented to automatically validate and correct these zinc complexes. Analysis of the corrected zinc complexes shows that the average Zn–Cys distances and Cys–Zn–Cys angles are a function of the number of cysteines and histidines involved. The observed trends can be used to develop more context-sensitive targets for model validation and refinement.

