Worldwide Protein Data Bank validation information: usage and trends

2018 ◽  
Vol 74 (3) ◽  
pp. 237-244 ◽  
Author(s):  
Oliver S. Smart ◽  
Vladimír Horský ◽  
Swanand Gore ◽  
Radka Svobodová Vařeková ◽  
Veronika Bendová ◽  
...  

Realising the importance of assessing the quality of the biomolecular structures deposited in the Protein Data Bank (PDB), the Worldwide Protein Data Bank (wwPDB) partners established Validation Task Forces to obtain advice on the methods and standards to be used to validate structures determined by X-ray crystallography, nuclear magnetic resonance spectroscopy and three-dimensional electron cryo-microscopy. The resulting wwPDB validation pipeline is an integral part of the wwPDB OneDep deposition, biocuration and validation system. The wwPDB Validation Service webserver (https://validate.wwpdb.org) can be used to perform checks prior to deposition. Here, it is shown how validation metrics can be combined to produce an overall score that allows the ranking of macromolecular structures and domains in search results. The ValTrendsDB database provides users with a convenient way to access and analyse validation information and other properties of X-ray crystal structures in the PDB, including investigating trends in and correlations between different structure properties and validation metrics.
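
As a rough illustration of how several validation metrics might be combined into a single ranking score, the sketch below averages the percentile ranks of a few metrics. The metric names, the sample values and the unweighted mean are all assumptions made for illustration; this is not the formula used by wwPDB or ValTrendsDB.

```python
from statistics import mean

def percentile_rank(value, population):
    """Percentage of the population that `value` is at least as good as.

    Assumes lower raw values are better (true for the hypothetical
    metrics used below: R-free, clashscore, Ramachandran outliers).
    """
    not_better = sum(1 for v in population if v >= value)
    return 100.0 * not_better / len(population)

def overall_score(entry, archive, metrics):
    """Unweighted mean of per-metric percentile ranks (an illustrative choice)."""
    ranks = [
        percentile_rank(entry[m], [other[m] for other in archive])
        for m in metrics
    ]
    return mean(ranks)

# Hypothetical entries; identifiers and values are invented for this sketch.
archive = [
    {"id": "A", "rfree": 0.20, "clashscore": 3.0, "rama_outliers": 0.0},
    {"id": "B", "rfree": 0.25, "clashscore": 10.0, "rama_outliers": 1.0},
    {"id": "C", "rfree": 0.30, "clashscore": 25.0, "rama_outliers": 4.0},
]
METRICS = ("rfree", "clashscore", "rama_outliers")

# Rank entries from best to worst by the combined score.
ranked = sorted(
    archive,
    key=lambda e: overall_score(e, archive, METRICS),
    reverse=True,
)
ranked_ids = [e["id"] for e in ranked]
```

Working in percentile ranks puts metrics with different units on a common 0 to 100 scale before they are combined, which is what makes a single sortable score possible.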

2012 ◽  
Vol 68 (4) ◽  
pp. 478-483 ◽  
Author(s):  
Swanand Gore ◽  
Sameer Velankar ◽  
Gerard J. Kleywegt

There is an increasing realisation that the quality of the biomacromolecular structures deposited in the Protein Data Bank (PDB) archive needs to be assessed critically using established and powerful validation methods. The Worldwide Protein Data Bank (wwPDB) organization has convened several Validation Task Forces (VTFs) to advise on the methods and standards that should be used to validate all of the entries already in the PDB as well as all structures that will be deposited in the future. The recommendations of the X-ray VTF are currently being implemented in a software pipeline. Here, ongoing work on this pipeline is briefly described as well as ways in which validation-related information could be presented to users of structural data.


2014 ◽  
Vol 70 (a1) ◽  
pp. C1478-C1478
Author(s):  
Swanand Gore ◽  
Pieter Hendrickx ◽  
Eduardo Sanz-Garcia ◽  
Sameer Velankar ◽  
Gerard Kleywegt

The Protein Data Bank (PDB) is the single global archive of 3D biomacromolecular structure data. The archive is managed by the Worldwide Protein Data Bank (wwPDB; wwpdb.org) organisation through its partners, the Research Collaboratory for Structural Bioinformatics (RCSB PDB), the Protein Data Bank Japan (PDBj), the Protein Data Bank in Europe (PDBe) and the Biological Magnetic Resonance Bank (BMRB). Analogously, the Electron Microscopy Data Bank (EMDB) is managed by the EMDataBank (emdatabank.org) organisation. A few years ago, realising the need and the opportunity to assess the quality of biomacromolecular structures deposited in the PDB, the wwPDB and EMDataBank partners established Validation Task Forces (VTFs) to advise them on up-to-date and community-agreed methods and standards to validate X-ray, NMR and 3DEM structures and data. All three VTFs have now published their recommendations (1, 2, 3) and these are being implemented as validation-software pipelines. The pipelines are integrated in the new joint wwPDB deposition and annotation system (http://deposit.wwpdb.org/deposition/). In addition, stand-alone servers are provided to allow practising structural biologists to validate models prior to publication and deposition (http://wwpdb.org/validation-servers.html). The validation pipelines and the output they produce (human-readable PDF reports and machine-readable XML files) will be described.


2018 ◽  
Vol 74 (3) ◽  
pp. 228-236 ◽  
Author(s):  
Oliver S. Smart ◽  
Vladimír Horský ◽  
Swanand Gore ◽  
Radka Svobodová Vařeková ◽  
Veronika Bendová ◽  
...  

Crystallographic studies of ligands bound to biological macromolecules (proteins and nucleic acids) play a crucial role in structure-guided drug discovery and design, and also provide atomic level insights into the physical chemistry of complex formation between macromolecules and ligands. The quality with which small-molecule ligands have been modelled in Protein Data Bank (PDB) entries has been, and continues to be, a matter of concern for many investigators. Correctly interpreting whether electron density found in a binding site is compatible with the soaked or co-crystallized ligand or represents water or buffer molecules is often far from trivial. The Worldwide PDB validation report (VR) provides a mechanism to highlight any major issues concerning the quality of the data and the model at the time of deposition and annotation, so the depositors can fix issues, resulting in improved data quality. The ligand-validation methods used in the generation of the current VRs are described in detail, including an examination of the metrics to assess both geometry and electron-density fit. It is found that the LLDF score currently used to identify ligand electron-density fit outliers can give misleading results and that better ligand-validation metrics are required.
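
To make the concern about relative density-fit scores concrete, here is a minimal sketch of an LLDF-style score. It assumes the score is a Z-score of the ligand's real-space R value (RSR) against the RSR values of nearby residues, with values above 2 treated as outliers. The function name, the sample numbers and the use of the population standard deviation are assumptions for illustration; the exact definition should be taken from the VR documentation.

```python
from statistics import mean, pstdev

OUTLIER_THRESHOLD = 2.0  # assumed cut-off for flagging a ligand

def lldf_like(ligand_rsr, neighbour_rsrs):
    """Z-score of a ligand's RSR relative to its neighbours' RSR values."""
    if len(neighbour_rsrs) < 2:
        raise ValueError("need at least two neighbouring residues")
    mu = mean(neighbour_rsrs)
    sigma = pstdev(neighbour_rsrs)
    if sigma == 0:
        raise ValueError("neighbour RSR values have zero spread")
    return (ligand_rsr - mu) / sigma

# The same hypothetical ligand RSR in two hypothetical environments.
well_ordered = lldf_like(0.15, [0.08, 0.09, 0.10, 0.09])  # tight neighbour spread
disordered = lldf_like(0.15, [0.05, 0.20, 0.10, 0.25])    # wide neighbour spread

flag_well_ordered = well_ordered > OUTLIER_THRESHOLD
flag_disordered = disordered > OUTLIER_THRESHOLD
```

In this sketch an identical ligand RSR is flagged in the well-ordered environment but not in the disordered one, which is one way a neighbour-relative score can behave counter-intuitively.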


2013 ◽  
Vol 69 (12) ◽  
pp. 2293-2295 ◽  
Author(s):  
Robbie P. Joosten ◽  
Hayssam Soueidan ◽  
Lodewyk F. A. Wessels ◽  
Anastassis Perrakis

Most of the macromolecular structures in the Protein Data Bank (PDB), which are used daily by thousands of educators and scientists alike, are determined by X-ray crystallography. It was examined whether the crystallographic models and data were deposited to the PDB at the same time as the publications that describe them were submitted for peer review. This condition is necessary to ensure pre-publication validation and the quality of the PDB public archive. It was found that a significant proportion of PDB entries were submitted to the PDB after peer review of the corresponding publication started, and many were only submitted after peer review had ended. It is argued that clear description of journal policies and effective policing are important for pre-publication validation, which is key in ensuring the quality of the PDB and of peer-reviewed literature.


2021 ◽  
Author(s):  
Bulat Faezov ◽  
Roland L. Dunbrack

The Protein Data Bank (PDB) was established at Brookhaven National Laboratories in 1971 as an archive for biological macromolecular crystal structures. In the beginning the archive held only seven structures, but as of early 2021 the database has more than 170,000 structures solved by X-ray crystallography, nuclear magnetic resonance, cryo-electron microscopy, and other methods. Many proteins have been studied under different conditions (e.g., binding partners such as ligands, nucleic acids, or other proteins; mutations and post-translational modifications), thus enabling comparative structure-function studies. However, these studies are made more difficult because authors are allowed by the PDB to number the amino acids in each protein sequence in any manner they wish. This results in the same protein being numbered differently in the available PDB entries. In addition to the coordinates, there are many fields that contain information regarding specific residues in the sequence of each protein in the entry. Here we provide a webserver and Python 3 application that fix the PDB sequence numbering problem by replacing the author numbering with numbering derived from the corresponding UniProt sequences. We obtain this correspondence from the SIFTS database from PDBe. The server and program can take a list of PDB entries and provide renumbered files in mmCIF format and the legacy PDB format for both asymmetric unit files and biological assembly files provided by PDBe. The server can also take a list of UniProt identifiers (“P04637” or “P53_HUMAN”) and return the desired files. Availability: Source code is freely available at https://github.com/Faezov/PDBrenum. The webserver is located at http://dunbrack3.fccc.edu/.
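
The core renumbering idea described above can be sketched as a lookup from author residue numbers to UniProt positions. The mapping below is hand-written hypothetical data standing in for a SIFTS record, and the `Residue` type and `renumber` function are illustrative names; none of this is PDBrenum's actual code.

```python
from typing import Dict, List, NamedTuple, Tuple

class Residue(NamedTuple):
    chain: str
    number: int  # author-assigned residue number
    name: str

def renumber(
    residues: List[Residue],
    author_to_uniprot: Dict[Tuple[str, int], int],
) -> Tuple[List[Residue], List[Residue]]:
    """Replace author numbering with UniProt numbering where a mapping exists.

    Returns the renumbered residues plus the residues that had no mapping
    (for example expression tags), which keep their author number here.
    """
    renumbered: List[Residue] = []
    unmapped: List[Residue] = []
    for res in residues:
        key = (res.chain, res.number)
        if key in author_to_uniprot:
            renumbered.append(res._replace(number=author_to_uniprot[key]))
        else:
            renumbered.append(res)
            unmapped.append(res)
    return renumbered, unmapped

# Hypothetical chain fragment and hypothetical mapping.
residues = [Residue("A", 1, "SER"), Residue("A", 2, "VAL"), Residue("A", 3, "HIS")]
mapping = {("A", 1): 94, ("A", 2): 95}

new_residues, unmapped = renumber(residues, mapping)
```

Leaving unmapped residues at their author number is the simplest choice for a sketch, but it can collide with UniProt positions, so a real tool needs an explicit policy for such residues.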


2020 ◽  
Vol 76 (12) ◽  
pp. 1184-1191
Author(s):  
Lum Wang ◽  
Holger Kruse ◽  
Oleg V. Sobolev ◽  
Nigel W. Moriarty ◽  
Mark P. Waller ◽  
...  

Electron cryo-microscopy (cryo-EM) is rapidly becoming a major competitor to X-ray crystallography, especially for large structures that are difficult or impossible to crystallize. While recent spectacular technological improvements have led to significantly higher resolution three-dimensional reconstructions, the average quality of cryo-EM maps is still at the low-resolution end of the range compared with crystallography. A long-standing challenge for atomic model refinement has been the production of stereochemically meaningful models for this resolution regime. Here, it is demonstrated that including accurate model geometry restraints derived from ab initio quantum-chemical calculations (HF-D3/6-31G) can improve the refinement of an example structure (chain A of PDB entry 3j63). The robustness of the procedure is tested for additional structures with up to 7000 atoms (PDB entry 3a5x and chain C of PDB entry 5fn5) using the less expensive semi-empirical (GFN1-xTB) model. The necessary algorithms enabling real-space quantum refinement have been implemented in the latest version of qr.refine and are described here.


2016 ◽  
Vol 62 (3) ◽  
pp. 286-297
Author(s):  
Zygmunt S. Derewenda

Macromolecular X-ray crystallography has undergone a dramatic and astonishing transformation since its inception in the mid-1950s, almost exclusively owing to developments in three other fields: computer science, synchrotron radiation and molecular biology. The process of structure solution from a single crystal, provided the quality of diffraction data is adequate, has been shortened from many years to hours, if not minutes. Yet, in spite of the exponential increase in the available structural information (~120,000 structures in the Protein Data Bank today), many fundamental problems continue to be the subject of scientific controversy. This article contains personal recollections of the author pertaining to two research projects, conducted nearly four decades apart, both of which touch upon one such long-standing controversy: the Monod-Wyman-Changeux theory of cooperativity (or ‘conformational selection’) vs the Koshland-Nemethy-Filmer theory of ‘induced fit’. It is dedicated to Dr. Alexander Wlodawer on his 70th birthday, with best wishes for continuing success.


2021 ◽  
Author(s):  
Zhe Wang ◽  
Ardan Patwardhan ◽  
Gerard J Kleywegt

The Electron Microscopy Data Bank (EMDB) is the central archive of the electron cryo-microscopy (cryo-EM) community for storing and disseminating volume maps and tomograms. With input from the community, EMDB has developed new resources for validation of cryo-EM structures, focussing on the quality of the volume data alone and that of the fit of any models, themselves archived in the Protein Data Bank (PDB), to the volume data. Based on recommendations from community experts, the validation resources are developed in a three-tiered system. Tier 1 covers an extensive and evolving set of validation metrics, including tried and tested as well as more experimental ones, which are calculated for all EMDB entries and presented in the Validation Analysis (VA) web resource. This system is particularly useful for cryo-EM experts, both to validate individual structures and to assess the utility of new validation metrics. Tier 2 comprises a subset of the validation metrics covered by the VA resource that have been subjected to extensive testing and are considered to be useful for specialists as well as non-specialists. These metrics are presented on the entry-specific web pages for the entire archive on the EMDB website. As more experience is gained with the metrics included in the VA resource, it is expected that consensus will emerge in the community regarding a subset that is suitable for inclusion in the tier 2 system. Tier 3, finally, consists of the validation reports and servers that are produced by the Worldwide Protein Data Bank (wwPDB) Consortium. Successful metrics from tier 2 will be proposed for inclusion in the wwPDB validation pipeline and reports. We describe the details of the new resource, with an emphasis on the tier 1 system. 
The output of all three tiers is publicly available, either through the EMDB website (tiers 1 and 2) or through the wwPDB FTP sites (tier 3), although the content of all three will evolve over time (fastest for tier 1 and slowest for tier 3). It is our hope that these validation resources will help the cryo-EM community to better understand the quality of cryo-EM structures in EMDB and PDB, and the best ways to assess it.


2020 ◽  
Vol 76 (5) ◽  
pp. 400-405 ◽  
Author(s):  
John H. Beale

The number of new X-ray crystallography-based submissions to the Protein Data Bank appears to be at the beginning of a decline, perhaps signalling an end to the era of the dominance of X-ray crystallography within structural biology. This letter, from the viewpoint of a young structural biologist, applies the Copernican method to the life expectancy of crystallography and asks whether the technique is still the mainstay of structural biology. A study of the rate of Protein Data Bank depositions allows a more nuanced analysis of the fortunes of macromolecular X-ray crystallography and shows that cryo-electron microscopy might now be outcompeting crystallography for new labour and talent, perhaps heralding a change in the landscape of the field.
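
For readers unfamiliar with the Copernican method, the sketch below assumes it refers to Gott's "delta t" argument: if a phenomenon is observed at a random moment in its lifetime, then with confidence c its remaining duration lies between t_past(1 - c)/(1 + c) and t_past(1 + c)/(1 - c). The function name and the 60-year age used in the example are illustrative assumptions, not figures taken from the letter.

```python
def copernican_interval(t_past, confidence=0.95):
    """Bounds on remaining lifetime given the elapsed lifetime `t_past`.

    With probability `confidence`, the observer is not within the first or
    last (1 - confidence) / 2 fraction of the total lifetime, which leads
    to the two bounds returned here.
    """
    if t_past <= 0:
        raise ValueError("t_past must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    lower = t_past * (1 - confidence) / (1 + confidence)
    upper = t_past * (1 + confidence) / (1 - confidence)
    return lower, upper

# Hypothetical elapsed lifetime of 60 years, at 95% confidence.
low_95, high_95 = copernican_interval(60.0)

# At 50% confidence the bounds are one third and three times the past age.
low_50, high_50 = copernican_interval(60.0, confidence=0.5)
```

The 95% interval spans a factor of 39 in each direction, so the argument constrains the future only loosely, which helps explain why the abstract turns to deposition rates for a more nuanced picture.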


Author(s):  
Michael Duszenko ◽  
Lars Redecke ◽  
Celestin Nzanzu Mudogo ◽  
Benjamin Philip Sommer ◽  
Stefan Mogk ◽  
...  

During the last decade, the number of three-dimensional structures solved by X-ray crystallography has increased dramatically. By 2014, it had crossed the landmark of 100 000 biomolecular structures deposited in the Protein Data Bank. This tremendous increase in successfully crystallized proteins is primarily owing to improvements in cloning strategies, the automation of the crystallization process and new innovative approaches to monitor crystallization. However, these improvements are mainly restricted to soluble proteins, while the crystallization and structural analysis of membrane proteins or proteins that undergo major post-translational modifications remains challenging. In addition, the need for relatively large crystals for conventional X-ray crystallography usually prevents the analysis of dynamic processes within cells. Thus, the advent of high-brilliance synchrotron and X-ray free-electron laser (XFEL) sources and the establishment of serial crystallography (SFX) have opened new avenues in structural analysis using crystals that were formerly unusable. The successful structure elucidation of cathepsin B, accomplished by the use of microcrystals obtained by in vivo crystallization in baculovirus-infected Sf9 insect cells, clearly proved that crystals grown intracellularly are very well suited for X-ray analysis. Here, methods by which in vivo crystals can be obtained, isolated and used for structural analysis by novel highly brilliant XFEL and synchrotron-radiation sources are summarized and discussed.

