Distance-based metrics for comparing conformational ensembles of intrinsically disordered proteins

2020 ◽  
Author(s):  
Tamas Lazar ◽  
Mainak Guharoy ◽  
Wim Vranken ◽  
Sarah Rauscher ◽  
Shoshana J. Wodak ◽  
...  

Abstract: Intrinsically disordered proteins (IDPs) are proteins whose native functional states represent ensembles of highly diverse conformations. Such ensembles are a challenge for quantitative structure comparisons as their conformational diversity precludes optimal superimposition of the atomic coordinates, necessary for deriving common similarity measures such as the root-mean-square deviation (RMSD) of these coordinates. Here we introduce superimposition-free metrics, which are based on computing matrices of Cα-Cα distance distributions within ensembles and comparing these matrices between ensembles. Differences between two matrices yield information on the similarity between specific regions of the polypeptide, whereas the global structural similarity is captured by the ens_dRMS, defined as the root-mean-square difference between the medians of the Cα-Cα distance distributions of two ensembles. Together, our metrics enable rigorous investigations of structure-function relationships in conformational ensembles of IDPs derived using experimental restraints or by molecular simulations, and for proteins containing both structured and disordered regions.

Statement of Significance: Important biological insight is obtained from comparing the high-resolution structures of proteins. Such comparisons commonly involve superimposing two protein structures and computing the residual root-mean-square deviation of the atomic positions. This approach cannot be applied to intrinsically disordered proteins (IDPs) because IDPs do not adopt well-defined 3D structures; rather, their native functional state is defined by ensembles of heterogeneous conformations that cannot be meaningfully superimposed. We report new measures that quantify the local and global similarity between different conformational ensembles by evaluating differences between the distributions of residue-residue distances and their statistical significance. Applying these measures to IDP ensembles and to a protein containing both structured and intrinsically disordered domains provides deeper insights into how structural features relate to function.
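The ens_dRMS definition in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each ensemble is a list of conformers, each conformer is a list of Cα (x, y, z) coordinates, and the mean inside the square root is taken over all residue pairs i < j. The function names and the toy coordinates are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def median_distance_matrix(ensemble):
    """Median Ca-Ca distance per residue pair (i < j) across all conformers."""
    n_res = len(ensemble[0])
    return {
        (i, j): median(math.dist(conf[i], conf[j]) for conf in ensemble)
        for i, j in combinations(range(n_res), 2)
    }


def ens_drms(ensemble_a, ensemble_b):
    """Root-mean-square difference between the two median distance matrices.

    Assumption: both ensembles describe the same residues, and the mean
    is taken over all pairs i < j.
    """
    med_a = median_distance_matrix(ensemble_a)
    med_b = median_distance_matrix(ensemble_b)
    squared_diffs = [(med_a[pair] - med_b[pair]) ** 2 for pair in med_a]
    return math.sqrt(sum(squared_diffs) / len(squared_diffs))


# Toy data (hypothetical): two 3-residue ensembles, two conformers each.
compact = [[(0, 0, 0), (3, 0, 0), (6, 0, 0)]] * 2
extended = [[(0, 0, 0), (4, 0, 0), (8, 0, 0)]] * 2

print(ens_drms(compact, extended))
```

No superimposition step appears anywhere in the sketch: only intra-conformer distances are used, which is what makes the measure applicable to ensembles that cannot be meaningfully aligned.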

2020 ◽  
Vol 117 (38) ◽  
pp. 23356-23364 ◽  
Author(s):  
Micayla A. Bowman ◽  
Joshua A. Riback ◽  
Anabel Rodriguez ◽  
Hongyu Guo ◽  
Jun Li ◽  
...  

Much attention is being paid to conformational biases in the ensembles of intrinsically disordered proteins. However, it is currently unknown whether or how conformational biases within the disordered ensembles of foldable proteins affect function in vivo. Recently, we demonstrated that water can be a good solvent for unfolded polypeptide chains, even those with a hydrophobic and charged sequence composition typical of folded proteins. These results run counter to the generally accepted model that protein folding begins with hydrophobicity-driven chain collapse. Here we investigate what other features, beyond amino acid composition, govern chain collapse. We found that local clustering of hydrophobic and/or charged residues leads to significant collapse of the unfolded ensemble of pertactin, a secreted autotransporter virulence protein from Bordetella pertussis, as measured by small-angle X-ray scattering (SAXS). Sequence patterns that lead to collapse also correlate with increased intermolecular polypeptide chain association and aggregation. Crucially, sequence patterns that support an expanded conformational ensemble enhance pertactin secretion to the bacterial cell surface. Similar sequence pattern features are enriched across the large and diverse family of autotransporter virulence proteins, suggesting sequence patterns that favor an expanded conformational ensemble are under selection for efficient autotransporter protein secretion, a necessary prerequisite for virulence. More broadly, we found that sequence patterns that lead to more expanded conformational ensembles are enriched across water-soluble proteins in general, suggesting protein sequences are under selection to regulate collapse and minimize protein aggregation, in addition to their roles in stabilizing folded protein structures.


2021 ◽  
Vol 1 (7) ◽  
Author(s):  
Federica Quaglia ◽  
Tamas Lazar ◽  
András Hatos ◽  
Peter Tompa ◽  
Damiano Piovesan ◽  
...  

2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Andrew T. McNutt ◽  
Paul Francoeur ◽  
Rishal Aggarwal ◽  
Tomohide Masuda ◽  
Rocco Meli ◽  
...  

Abstract: Molecular docking computationally predicts the conformation of a small molecule when binding to a receptor. Scoring functions are a vital piece of any molecular docking pipeline as they determine the fitness of sampled poses. Here we describe and evaluate the 1.0 release of the Gnina docking software, which utilizes an ensemble of convolutional neural networks (CNNs) as a scoring function. We also explore an array of parameter values for Gnina 1.0 to optimize docking performance and computational cost. Docking performance, as evaluated by the percentage of targets where the top pose is better than 2 Å root mean square deviation (Top1), is compared to AutoDock Vina scoring when utilizing explicitly defined binding pockets or whole protein docking. Gnina, utilizing a CNN scoring function to rescore the output poses, outperforms AutoDock Vina scoring on redocking and cross-docking tasks when the binding pocket is defined (Top1 increases from 58% to 73% and from 27% to 37%, respectively) and when the whole protein defines the binding pocket (Top1 increases from 31% to 38% and from 12% to 16%, respectively). The derived ensemble of CNNs generalizes to unseen proteins and ligands and produces scores that correlate well with the root mean square deviation to the known binding pose. We provide the 1.0 version of Gnina under an open source license for use as a molecular docking tool at https://github.com/gnina/gnina.
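The Top1 metric described in this abstract can be sketched in a few lines of Python. This is an illustration only, not part of Gnina or its evaluation scripts. It assumes the RMSD of each target's top-ranked pose is already available, and it reads "better than 2 Å" as strictly below the cutoff. The function name, target names and RMSD values are hypothetical.

```python
def top1_success_rate(top_pose_rmsds, cutoff=2.0):
    """Percentage of targets whose top-ranked pose has RMSD below `cutoff` (in Å).

    `top_pose_rmsds` maps a target identifier to the RMSD between its
    top-ranked docked pose and the known binding pose.
    """
    if not top_pose_rmsds:
        raise ValueError("at least one target is required")
    hits = sum(1 for rmsd in top_pose_rmsds.values() if rmsd < cutoff)
    return 100.0 * hits / len(top_pose_rmsds)


# Made-up RMSD values for four hypothetical targets.
example_rmsds = {
    "target_a": 0.8,
    "target_b": 1.9,
    "target_c": 2.4,
    "target_d": 5.1,
}

print(top1_success_rate(example_rmsds))  # 50.0
```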


2018 ◽  
Vol 19 (11) ◽  
pp. 3315 ◽  
Author(s):  
Rita Pancsa ◽  
Fruzsina Zsolyomi ◽  
Peter Tompa

Although improved strategies for the detection and analysis of evolutionary couplings (ECs) between protein residues already enable the prediction of protein structures and interactions, they are mostly restricted to conserved and well-folded proteins. Whereas intrinsically disordered proteins (IDPs) are central to cellular interaction networks, due to the lack of strict structural constraints, they undergo faster evolutionary changes than folded domains. This makes the reliable identification and alignment of IDP homologs difficult, which has led to IDPs being omitted from most large-scale residue co-variation analyses. By performing a dedicated analysis of phylogenetically widespread bacterial IDP–partner interactions, here we demonstrate that partner binding imposes constraints on IDP sequences that manifest in detectable interprotein ECs. These ECs were not detected for interactions mediated by short motifs, but rather for those with larger IDP–partner interfaces. Most identified coupled residue pairs reside close (<10 Å) to each other on the interface, with a third of them forming multiple direct atomic contacts. EC-carrying interfaces of IDPs are enriched in negatively charged residues, and the EC residues of both IDPs and partners preferentially reside in helices. Our analysis brings hope that IDP–partner interactions that are difficult to study could soon be successfully dissected through residue co-variation analysis.


2010 ◽  
Vol 88 (2) ◽  
pp. 269-290 ◽  
Author(s):  
Sarah Rauscher ◽  
Régis Pomès

Protein disorder is abundant in proteomes throughout all kingdoms of life and serves many biologically important roles. Disordered states of proteins are challenging to study experimentally due to their structural heterogeneity and tendency to aggregate. Computer simulations, which are not impeded by these properties, have recently emerged as a useful tool to characterize the conformational ensembles of intrinsically disordered proteins. In this review, we provide a survey of computational studies of protein disorder with an emphasis on the interdisciplinary nature of these studies. The application of simulation techniques to the study of disordered states is described in the context of experimental and bioinformatics approaches. Experimental data can be incorporated into simulations, and simulations can provide predictions for experiment. In this way, simulations have been integrated into the existing methodologies for the study of disordered state ensembles. We provide recent examples of simulations of disordered states from the literature and our own work. Throughout the review, we emphasize important predictions and biophysical understanding made possible through the use of simulations. This review is intended as both an overview and a guide for structural biologists and theoretical biophysicists seeking accurate, atomic-level descriptions of disordered state ensembles.


2020 ◽  
Vol 118 (3) ◽  
pp. 214a
Author(s):  
Saurabh Awasthi ◽  
Jared Houghtaling ◽  
Cuifeng Ying ◽  
Aziz Fennouri ◽  
Ivan Shorubalko ◽  
...  

2020 ◽  
Vol 221 (1) ◽  
pp. 651-664
Author(s):  
H Heydarizadeh Shali ◽  
D Sampietro ◽  
A Safari ◽  
M Capponi ◽  
A Bahroudi

SUMMARY The study of the discontinuity between crust and mantle beneath Iran is still an open issue in the geophysical community due to its various tectonic features created by the collision between the Iranian and Arabian Plates. For instance, in regions such as Zagros, Alborz or Makran, despite the number of studies performed exploiting either gravity or seismic data, the depth of the Moho and the interior structure are still highly uncertain. This is due to the complexity of the crust and to the presence of large short-wavelength signals in the Moho depth. GOCE observations are useful products for describing the Earth’s crust structure at either the regional or the global scale. Furthermore, it is plausible to retrieve important information regarding the structure of the Earth’s crust by combining the GOCE observations with seismic data and considering additional information. In the current study, we used as observation a grid of the second radial derivative of the anomalous gravitational potential, computed at an altitude of 221 km by means of the space-wise approach, to study the depth of the Moho. The observations have been reduced for the gravitational effects of topography, bathymetry and sediments. The residual gravity has been inverted according to a simple two-layer model. In particular, this guarantees the uniqueness of the solution of the inverse problem, which has been regularized by means of a collocation approach in the frequency domain. Although the results of this study show generally good agreement with seismically derived depths, with a root mean square deviation of 6 km, there are some discrepancies under the Alborz zone and the Oman Sea, with a root mean square deviation of up to 10 km for the former and an average difference of 3 km for the latter. Further comparisons with the natural features of the study area, for instance active faults, show that the resulting Moho features can be directly associated with geophysical and tectonic blocks.

