THE RAMACHANDRAN MAP OF MORE THAN 6,500 PERFECT POLYPEPTIDE CHAINS

2007 ◽  
Vol 02 (03n04) ◽  
pp. 267-271
Author(s):  
ZOLTÁN SZABADKA ◽  
RAFAEL ÖRDÖG ◽  
VINCE GROLMUSZ

The Protein Data Bank (PDB) is the most important depository of protein structural information, containing more than 45,000 deposited entries today. Because of its inhomogeneous structure, its fully automated processing is almost impossible. In a previous work, we cleaned and re-structured the entries in the Protein Data Bank, and from the result we have built the RS-PDB database. Using the RS-PDB database, we draw a Ramachandran-plot from 6,593 "perfect" polypeptide chains found in the PDB, containing 1,192,689 residues. This is a more than tenfold increase in the size of data analyzed before this work. The density of the data points makes it possible to draw a logarithmic heat map enhanced Ramachandran map, showing the fine inner structure of the right-handed α-helix region.
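As a rough illustration of the kind of plot described above, the following minimal sketch computes a backbone dihedral angle from four atom positions and bins (φ, ψ) pairs into a log-scaled density grid. It is not the authors' code: the function names `dihedral` and `log_density_grid`, the 10° bin width, and the `log10(1 + count)` scaling are all assumptions chosen for illustration. It relies only on the standard definitions of φ (atoms C(i-1), N, CA, C) and ψ (atoms N, CA, C, N(i+1)).

```python
import math


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four 3D points (cis = 0 degrees)."""
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    u1, u2, u3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(u1, u2), cross(u2, u3)
    # atan2 form keeps the sign and stays stable near 0 and 180 degrees.
    y = math.sqrt(dot(u2, u2)) * dot(u1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def log_density_grid(angles, bin_deg=10):
    """Bin (phi, psi) pairs in degrees and return log10(1 + count) per cell.

    Rows are indexed by psi and columns by phi, both starting at -180 degrees.
    The log10(1 + count) scaling is an illustrative assumption.
    """
    n = 360 // bin_deg
    counts = [[0] * n for _ in range(n)]
    for phi, psi in angles:
        col = int((phi + 180) // bin_deg) % n  # +180 wraps into the -180 bin
        row = int((psi + 180) // bin_deg) % n
        counts[row][col] += 1
    return [[math.log10(1 + c) for c in r] for r in counts]


# Hypothetical (phi, psi) values, not data from the paper.
sample = [(-60.0, -45.0)] * 9 + [(-120.0, 130.0)]
grid = log_density_grid(sample)
```

The logarithmic scaling is what keeps sparsely populated regions visible next to the densely populated α-helix region; with a linear scale a single dominant peak would wash out everything else.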

1998 ◽  
Vol 54 (6) ◽  
pp. 1078-1084 ◽  
Author(s):  
Joel L. Sussman ◽  
Dawei Lin ◽  
Jiansheng Jiang ◽  
Nancy O. Manning ◽  
Jaime Prilusky ◽  
...  

The Protein Data Bank (PDB) at Brookhaven National Laboratory is a database containing experimentally determined three-dimensional structures of proteins, nucleic acids and other biological macromolecules, with approximately 8000 entries. Data are easily submitted via PDB's WWW-based tool AutoDep, in either mmCIF or PDB format, and are most conveniently examined via PDB's WWW-based tool 3DB Browser.


2015 ◽  
Vol 11 (1) ◽  
pp. 1-7 ◽  
Author(s):  
Michal Brylinski

The Protein Data Bank (PDB) undergoes an exponential expansion in terms of the number of macromolecular structures deposited every year. A pivotal question is how this rapid growth of structural information improves the quality of three-dimensional models constructed by contemporary bioinformatics approaches. To address this problem, we performed a retrospective analysis of the structural coverage of a representative set of proteins using remote homology detected by COMPASS and HHpred. We show that the number of proteins whose structures can be confidently predicted increased during a 9-year period between 2005 and 2014 on account of the PDB growth alone. Nevertheless, this encouraging trend slowed down noticeably around the year 2008 and has yielded insignificant improvements ever since. At the current pace, it is unlikely that the protein structure prediction problem will be solved in the near future using existing template-based modeling techniques. Therefore, further advances in experimental structure determination, qualitatively better approaches in fold recognition, and more accurate template-free structure prediction methods are desperately needed.


2019 ◽  
Vol 35 (20) ◽  
pp. 4165-4167 ◽  
Author(s):  
Jonathan Fine ◽  
Gaurav Chopra

Motivation: The Protein Data Bank (PDB) currently holds over 140 000 biomolecular structures and continues to release new structures on a weekly basis. The PDB is an essential resource for the structural bioinformatics community to develop software that mines, uses, categorizes and analyzes such data. New computational biology methods are evaluated using custom benchmarking sets derived as subsets of 3D experimentally determined structures and structural features from the PDB. Currently, such benchmarking features are manually curated with custom scripts in a non-standardized manner that results in slow distribution and updates with new experimental structures. Finally, there is a scarcity of standardized tools to rapidly query 3D descriptors of the entire PDB. Results: Our solution is the Lemon framework, a C++11 library with Python bindings, which provides a consistent workflow methodology for selecting biomolecular interactions based on user criteria and computing desired 3D structural features. This framework can parse and characterize the entire PDB in <10 min on modern, multithreaded hardware. The speed in parsing is obtained by using the recently developed MacroMolecule Transmission Format to reduce the computational cost of reading text-based PDB files. The use of C++ lambda functions and Python bindings provides extensive flexibility for analysis and categorization of the PDB by allowing the user to write custom functions to suit their objective. We think Lemon will become a one-stop-shop to quickly mine the entire PDB to generate desired structural biology features. Availability and implementation: The Lemon software is available as a C++ header library along with a PyPI package and example functions at https://github.com/chopralab/lemon. Supplementary information: Supplementary data are available at Bioinformatics online.


2018 ◽  
Author(s):  
Jonathan Fine ◽  
Gaurav Chopra

Motivation: The Protein Data Bank (PDB) currently holds over 140,000 biomolecular structures and continues to release new structures on a weekly basis. The PDB is an essential resource for the structural bioinformatics community to develop software that mines, uses, categorizes, and analyzes such data. New computational biology methods are evaluated using custom benchmarking sets derived as subsets of 3D experimentally determined structures and structural features from the PDB. Currently, such benchmarking features are manually curated with custom scripts in a non-standardized manner that results in slow distribution and updates with new experimental structures. Finally, there is a scarcity of standardized tools to rapidly query 3D descriptors of the entire PDB. Approach: Our solution is the Lemon framework, a C++11 library with Python bindings, which provides a consistent workflow methodology for selecting biomolecular interactions based on user criteria and computing desired 3D structural features. This framework can parse and characterize the entire PDB in less than ten minutes on modern, multithreaded hardware. The speed in parsing is obtained by using the recently developed MacroMolecule Transmission Format (MMTF) to reduce the computational cost of reading text-based PDB files. The use of C++ lambda functions and Python bindings provides extensive flexibility for analysis and categorization of the PDB by allowing the user to write custom functions to suit their objective. We think Lemon will become a one-stop-shop to quickly mine the entire PDB to generate desired structural biology features. The Lemon software is available as a C++ header library along with example functions at https://github.com/chopralab/lemon.


Author(s):  
Joel L. Sussman ◽  
Frances C. Bernstein ◽  
Jiansheng Jiang ◽  
Michael Libeson ◽  
Dawei Lin ◽  
...  

2021 ◽  
Author(s):  
Nicholas J Fowler ◽  
Adnan Sljoka ◽  
Mike P Williamson

We recently described a method, ANSURR, for measuring the accuracy of NMR protein structures. It is based on comparing residue-specific measures of rigidity from backbone chemical shifts via the random coil index, and from structures. Here, we report the use of ANSURR to analyse NMR ensembles within the Protein Data Bank (PDB). NMR structures cover a wide range of accuracy, which improved over time until about 2005, since when accuracy has not improved. Most structures have accurate secondary structure, but are too floppy, particularly in loops. There is a need for more experimental restraints in loops. The best current accuracy measures are Ramachandran distribution and number of NOE restraints per residue. The precision of structure ensembles correlates with accuracy, as does the number of hydrogen bond restraints per residue. If a structure contains additional components (such as additional polypeptide chains or ligands), then their inclusion improves accuracy. Analysis of over 7000 PDB NMR ensembles is available via our website ansurr.com.


2020 ◽  
Vol 27 (8) ◽  
pp. 763-769
Author(s):  
Oliviero Carugo

Background: Although lithium is not a biologically essential metallic element, its pharmacological properties are well known, and human exposure to lithium is increasingly possible because of its use in the aerospace industry and in batteries. Objective: Lithium-protein interactions are therefore of interest, and a survey of the structures of lithium-protein complexes is described in this paper. Methods: A high-quality non-redundant set of lithium-containing protein crystal structures was extracted from the Protein Data Bank, and the stereochemistry of the lithium first coordination sphere was examined in detail. Results: Four main observations were reported: (i) lithium interacts preferably with oxygen atoms; (ii) preferably with side-chain atoms; (iii) preferably with Asp or Glu carboxylates; (iv) the coordination number tends to be four, with stereochemical parameters similar to those observed in small molecules containing lithium. Conclusion: Although structural information on lithium-protein complexes available from the Protein Data Bank is relatively scarce, these trends appear to be so clear that one may suppose they will be confirmed by further data deposited in the Protein Data Bank in the future.

