Exploring the Sequence Fitness Landscape of a Bridge Between Protein Folds

2020 ◽  
Author(s):  
Pengfei Tian ◽  
Robert B. Best

Abstract Most foldable protein sequences adopt only a single native fold. Recent protein design studies have, however, created protein sequences which fold into different structures upon changes of environment or a single point mutation, the best characterized example being the switch between the folds of the GA and GB binding domains of streptococcal protein G. To obtain further insight into the design of sequences which can switch folds, we have used a computational model for the fitness landscape of a single fold, built from the observed sequence variation of protein homologues. We have recently shown that such coevolutionary models can be used to design novel foldable sequences. By appropriately combining two of these models to describe the joint fitness landscape of GA and GB, we are able to describe the propensity of a given sequence for each of the two folds. We have successfully tested the combined model against the known series of designed GA/GB hybrids. Using Monte Carlo simulations on this landscape, we are able to identify pathways of mutations connecting the two folds. In the absence of a requirement for domain stability, the most frequent paths go via sequences in which neither domain is stably folded, reminiscent of the propensity for certain intrinsically disordered proteins to fold into different structures according to context. Even if the folded state is required to be stable, we find that there is nonetheless still a wide range of sequences which are close to the transition region and are therefore likely fold switches, consistent with recent estimates that fold switching may be more widespread than had been thought.

Author Summary While most proteins self-assemble (or “fold”) to a unique three-dimensional structure, a few have been identified that can fold into two distinct structures. These so-called “metamorphic” proteins that can switch folds have attracted a lot of recent interest, and it has been suggested that they may be much more widespread than currently appreciated. We have developed a computational model that captures the propensity of a given protein sequence to fold into either one of two specific structures (GA and GB), in order to investigate which sequences are able to fold to both GA and GB (“switch sequences”), versus just one of them. Our model predicts that there is a large number of switch sequences that could fold into both structures, but also that the most likely such sequences are those for which the folded structures have low stability, in agreement with available experimental data. This also suggests that intrinsically disordered proteins which can fold into different structures on binding may provide an evolutionary path in sequence space between protein folds.
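The idea of sampling mutations on a joint landscape built from two single-fold models can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say how the two models are combined, so the two-state log-sum-exp rule in `combined_energy`, the Potts-style form of `potts_energy`, the `temperature` parameter and all function names are hypothetical, not the authors' implementation.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def potts_energy(seq, fields, couplings):
    """Potts-style energy: E = -sum_i h_i(a_i) - sum_{i<j} J_ij(a_i, a_j).

    fields: list of dicts, one per position, mapping residue -> h value.
    couplings: dict mapping (i, j) -> dict of (res_i, res_j) -> J value.
    Missing entries count as zero.
    """
    energy = 0.0
    for i, residue in enumerate(seq):
        energy -= fields[i].get(residue, 0.0)
    for (i, j), table in couplings.items():
        energy -= table.get((seq[i], seq[j]), 0.0)
    return energy


def combined_energy(seq, model_a, model_b):
    """ASSUMED combination rule: two-state free energy -log(e^-Ea + e^-Eb)."""
    e_a = potts_energy(seq, *model_a)
    e_b = potts_energy(seq, *model_b)
    lowest = min(e_a, e_b)
    # Subtract the minimum before exponentiating for numerical stability.
    return lowest - math.log(math.exp(-(e_a - lowest)) + math.exp(-(e_b - lowest)))


def metropolis_step(seq, model_a, model_b, rng, temperature=1.0):
    """Propose one random point mutation and accept it by the Metropolis rule."""
    pos = rng.randrange(len(seq))
    trial = seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]
    delta = combined_energy(trial, model_a, model_b) - combined_energy(seq, model_a, model_b)
    if delta <= 0 or rng.random() < math.exp(-delta / temperature):
        return trial
    return seq
```

Calling `metropolis_step` repeatedly with a seeded `random.Random` yields a chain of single-point mutants; comparing `potts_energy` under each model along the chain indicates which fold a sequence favours in this toy setting.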

Molecules ◽  
2019 ◽  
Vol 24 (18) ◽  
pp. 3265 ◽  
Author(s):  
Vladimir N. Uversky

Cells are inhomogeneously crowded, possessing a wide range of intracellular liquid droplets abundantly present in the cytoplasm of eukaryotic and bacterial cells, in the mitochondrial matrix and nucleoplasm of eukaryotes, and in the chloroplast’s stroma of plant cells. These proteinaceous membrane-less organelles (PMLOs) not only represent a natural method of intracellular compartmentalization, which is crucial for successful execution of various biological functions, but also serve as important means for the processing of local information and rapid response to the fluctuations in environmental conditions. Since PMLOs, being complex macromolecular assemblages, possess many characteristic features of liquids, they represent highly dynamic (or fuzzy) protein–protein and/or protein–nucleic acid complexes. The biogenesis of PMLOs is controlled by specific intrinsically disordered proteins (IDPs) and hybrid proteins with ordered domains and intrinsically disordered protein regions (IDPRs), which, due to their highly dynamic structures and ability to facilitate multivalent interactions, serve as indispensable drivers of the biological liquid–liquid phase transitions (LLPTs) giving rise to PMLOs. In this article, the importance of the disorder-based supramolecular fuzziness for LLPTs and PMLO biogenesis is discussed.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Javier A. Iserte ◽  
Tamas Lazar ◽  
Silvio C. E. Tosatto ◽  
Peter Tompa ◽  
Cristina Marino-Buslje

Abstract Intrinsically disordered proteins/regions (IDPs/IDRs) are crucial components of the cell: they are highly abundant and participate ubiquitously in a wide range of biological functions, such as regulatory processes and cell signaling. Many of their important functions rely on protein interactions, by which they trigger or modulate different pathways. Sequence covariation, a powerful tool for protein contact prediction, has been applied successfully to predict protein structure and to identify protein–protein interactions, mostly of globular proteins. IDPs/IDRs also mediate a plethora of protein–protein interactions, highlighting the importance of addressing sequence covariation-based inter-protein contact prediction for this class of proteins. Despite their importance, a systematic approach to analyze the covariation phenomena of intrinsically disordered proteins and their complexes is still missing. Here we carry out a comprehensive critical assessment of coevolution-based contact prediction in IDP/IDR complexes and detail the challenges and possible limitations that emerge from their analysis. We found that the coevolutionary signal is faint in most of the complexes of disordered proteins but positively correlates with the interface size and binding affinity between partners. In addition, we discuss the state-of-the-art methodology through biological interpretation of the results, formulate evaluation guidelines and suggest future directions of development for the field.
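As a rough illustration of what a sequence covariation score measures, the sketch below computes the mutual information between two columns of a multiple sequence alignment. This is one of the simplest covariation measures and is chosen here only as an example; the abstract does not name the specific methods that were assessed, and the function name and input layout are assumptions.

```python
import math
from collections import Counter


def column_mutual_information(msa, i, j):
    """Mutual information (in nats) between alignment columns i and j.

    msa: list of equal-length aligned sequences (strings).
    High values mean the residues at the two positions vary together.
    """
    n = len(msa)
    pair_counts = Counter((seq[i], seq[j]) for seq in msa)
    counts_i = Counter(seq[i] for seq in msa)
    counts_j = Counter(seq[j] for seq in msa)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi
```

For inter-protein contact prediction, the two columns would come from a paired alignment in which each row concatenates the sequences of the two interacting partners from one organism. With the shallow or weakly conserved alignments that are common for disordered regions, such a score tends to be noisy.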


2019 ◽  
Vol 17 (01) ◽  
pp. 1950004 ◽  
Author(s):  
Chun Fang ◽  
Yoshitaka Moriwaki ◽  
Aikui Tian ◽  
Caihong Li ◽  
Kentaro Shimizu

Molecular recognition features (MoRFs) are key functional regions of intrinsically disordered proteins (IDPs), which play important roles in the molecular interaction network of cells and are implicated in many serious human diseases. Identifying MoRFs is essential for both functional studies of IDPs and drug design. This study adopts a deep machine learning method to develop a model for improving MoRF prediction. We propose a method named en_DCNNMoRF (ensemble deep convolutional neural network-based MoRF predictor). It combines the outcomes of two independent deep convolutional neural network (DCNN) classifiers that take advantage of different features. The first, DCNNMoRF1, employs a position-specific scoring matrix (PSSM) and 22 types of amino acid-related factors to describe protein sequences. The second, DCNNMoRF2, employs a PSSM and 13 types of amino acid indexes to describe protein sequences. For both single classifiers, a DCNN with a novel two-dimensional attention mechanism was adopted, and an average strategy was added to further process the output probabilities of each DCNN model. Finally, en_DCNNMoRF combined the two models by averaging their final scores. When compared with other well-known tools applied to the same datasets, the accuracy of the proposed method was comparable with that of state-of-the-art methods. The related web server can be accessed freely via http://vivace.bi.a.u-tokyo.ac.jp:8008/fang/en_MoRFs.php .
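The final combination step described above, post-processing each classifier's per-residue probabilities and then averaging the two sets of scores, can be sketched as follows. The abstract does not define the "average strategy", so the sliding-window mean in `smooth_scores`, its `window` size and both function names are assumptions made for illustration only.

```python
def smooth_scores(scores, window=5):
    """ASSUMED 'average strategy': centered sliding-window mean.

    Windows are truncated at the sequence ends, so terminal residues
    are averaged over fewer neighbours.
    """
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


def ensemble_average(scores_1, scores_2):
    """Average the per-residue scores of two classifiers position by position."""
    if len(scores_1) != len(scores_2):
        raise ValueError("both classifiers must score the same residues")
    return [(a + b) / 2 for a, b in zip(scores_1, scores_2)]
```

In use, the outputs of the two networks for one protein would each pass through `smooth_scores` before `ensemble_average`, and residues whose final score exceeds a chosen threshold would be labelled as MoRF residues.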


2018 ◽  
Vol 201 (2) ◽  
Author(s):  
Tamiko Oguri ◽  
Youjeong Kwon ◽  
Jerry K. K. Woo ◽  
Gerd Prehna ◽  
Hyun Lee ◽  
...  

ABSTRACT By screening a collection of Salmonella mutants deleted for genes encoding small proteins of ≤60 amino acids, we identified three paralogous small genes (ymdF, STM14_1829, and yciG) required for wild-type flagellum-dependent swimming and swarming motility. The ymdF, STM14_1829, and yciG genes encode small proteins of 55, 60, and 60 amino acid residues, respectively. A bioinformatics analysis predicted that these small proteins are intrinsically disordered proteins, and circular dichroism analysis of purified recombinant proteins confirmed that all three proteins are unstructured in solution. A mutant deleted for STM14_1829 showed the most severe motility defect, indicating that among the three paralogs, STM14_1829 is a key protein required for wild-type motility. We determined that relative to the wild type, the expression of the flagellin protein FliC is lower in the ΔSTM14_1829 mutant due to the downregulation of the flhDC operon encoding the FlhDC master regulator. By comparing the gene expression profiles between the wild-type and ΔSTM14_1829 strains via RNA sequencing, we found that the gene encoding the response regulator PhoP is upregulated in the ΔSTM14_1829 mutant, suggesting the indirect repression of the flhDC operon by the activated PhoP. Homologs of STM14_1829 are conserved in a wide range of bacteria, including Escherichia coli and Pseudomonas aeruginosa. We showed that the inactivation of STM14_1829 homologs in E. coli and P. aeruginosa also alters motility, suggesting that this family of small intrinsically disordered proteins may play a role in the cellular pathway(s) that affects motility.

IMPORTANCE This study reports the identification of a novel family of small intrinsically disordered proteins that are conserved in a wide range of flagellated and nonflagellated bacteria. Although this study identifies the role of these small proteins in the scope of flagellum-dependent motility in Salmonella, they likely play larger roles in a more conserved cellular pathway(s) that indirectly affects flagellum expression in the case of motile bacteria. Small intrinsically disordered proteins have not been well characterized in prokaryotes, and the results of our study provide a basis for their detailed functional characterization.


2011 ◽  
Vol 44 (4) ◽  
pp. 467-518 ◽  
Author(s):  
H. Jane Dyson

Abstract Proteins provide much of the scaffolding for life, as well as undertaking a variety of essential catalytic reactions. These characteristic functions have led us to presuppose that proteins are in general functional only when well structured and correctly folded. As we begin to explore the repertoire of possible protein sequences inherent in the human and other genomes, two stark facts that belie this supposition become clear: firstly, the number of apparent open reading frames in the human genome is significantly smaller than appears to be necessary to code for all of the diverse proteins in higher organisms, and secondly that a significant proportion of the protein sequences that would be coded by the genome would not be expected to form stable three-dimensional (3D) structures. Clearly the genome must include coding for a multitude of alternative forms of proteins, some of which may be partly or fully disordered or incompletely structured in their functional states. At the same time as this likelihood was recognized, experimental studies also began to uncover examples of important protein molecules and domains that were incompletely structured or completely disordered in solution, yet remained perfectly functional. In the ensuing years, we have seen an explosion of experimental and genome-annotation studies that have mapped the extent of the intrinsic disorder phenomenon and explored the possible biological rationales for its widespread occurrence. Answers to the question ‘why would a particular domain need to be unstructured?’ are as varied as the systems where such domains are found. This review provides a survey of recent new directions in this field, and includes an evaluation of the role not only of intrinsically disordered proteins but also of partially structured and highly dynamic members of the disorder–order continuum.


2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Sarah E. Bondos ◽  
A. Keith Dunker ◽  
Vladimir N. Uversky

Abstract For proteins, the sequence → structure → function paradigm applies primarily to enzymes, transmembrane proteins, and signaling domains. This paradigm is not universal, but rather, in addition to structured proteins, intrinsically disordered proteins and regions (IDPs and IDRs) also carry out crucial biological functions. For these proteins, the sequence → IDP/IDR ensemble → function paradigm applies primarily to signaling and regulatory proteins and regions. Often, in order to carry out function, IDPs or IDRs cooperatively interact, either intra- or inter-molecularly, with structured proteins or other IDPs or intermolecularly with nucleic acids. In this IDP/IDR thematic collection published in Cell Communication and Signaling, thirteen articles are presented that describe IDP/IDR signaling molecules from a variety of organisms from humans to fruit flies and tardigrades (“water bears”) and that describe how these proteins and regions contribute to the function and regulation of cell signaling. Collectively, these papers exhibit the diverse roles of disorder in responding to a wide range of signals so as to orchestrate an array of organismal processes. They also show that disorder contributes to signaling in a broad spectrum of species, ranging from micro-organisms to plants and animals.


2020 ◽  
Vol 295 (15) ◽  
pp. 4912-4922 ◽  
Author(s):  
Patrick N. Reardon ◽  
Kayla A. Jara ◽  
Amber D. Rolland ◽  
Delaney A. Smith ◽  
Hanh T. M. Hoang ◽  
...  

Dynein light chain 8 (LC8) interacts with intrinsically disordered proteins (IDPs) and influences a wide range of biological processes. It is becoming apparent that among the numerous IDPs that interact with LC8, many contain multiple LC8-binding sites. Although it is established that LC8 forms parallel IDP duplexes with some partners, such as nucleoporin Nup159 and dynein intermediate chain, the molecular details of these interactions and LC8's interactions with other diverse partners remain largely uncharacterized. LC8 dimers could bind in either a paired “in-register” or a heterogeneous off-register manner to any of the available sites on a multivalent partner. Here, using NMR chemical shift perturbation, analytical ultracentrifugation, and native electrospray ionization MS, we show that LC8 forms a predominantly in-register complex when bound to an IDP domain of the multivalent regulatory protein ASCIZ. Using saturation transfer difference NMR, we demonstrate that at substoichiometric LC8 concentrations, the IDP domain preferentially binds to one of the three LC8 recognition motifs. Further, the differential dynamic behavior for the three sites and the size of the fully bound complex confirmed an in-register complex. Dynamics measurements also revealed that coupling between sites depends on the linker length separating these sites. These results identify linker length and motif specificity as drivers of in-register binding in the multivalent LC8–IDP complex assembly and the degree of compositional and conformational heterogeneity as a promising emerging mechanism for tuning of binding and regulation.


Algorithms ◽  
2019 ◽  
Vol 12 (2) ◽  
pp. 46 ◽  
Author(s):  
Hao He ◽  
Jiaxiang Zhao ◽  
Guiling Sun

Intrinsically disordered proteins perform a variety of important biological functions, which makes their accurate prediction useful for a wide range of applications. We develop a scheme for predicting intrinsically disordered proteins by employing 35 features including eight structural properties, seven physicochemical properties and 20 pieces of evolutionary information. In particular, the scheme includes a preprocessing procedure which greatly reduces the input features. Using two different windows, the preprocessed data, containing not only the properties of the surroundings of the target residue but also the properties related to the specific target residue, are fed into a multi-layer perceptron neural network as its inputs. The Adam algorithm for back propagation and the dropout algorithm for avoiding overfitting are used during the training process. Training and testing of our procedure are performed on the DIS803 dataset from the DisProt database. The simulation results show that the performance of our scheme is competitive in comparison with ESpritz and IsUnstruct.
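The two-window idea can be pictured with a small sketch that builds the input for one residue from a single per-residue feature track. The abstract does not describe the preprocessing itself, so the scheme below (a long window reduced to its mean to summarise the surroundings, plus a short window kept as raw values around the target residue), the window sizes, the zero padding and the function names are all assumptions rather than the authors' procedure.

```python
def window_values(values, center, half_width, pad=0.0):
    """Values in [center - half_width, center + half_width], padded at the ends."""
    return [
        values[k] if 0 <= k < len(values) else pad
        for k in range(center - half_width, center + half_width + 1)
    ]


def two_window_input(values, center, short_half=1, long_half=3):
    """ASSUMED preprocessing: [mean of long window] + raw short-window values."""
    long_window = window_values(values, center, long_half)
    short_window = window_values(values, center, short_half)
    surroundings = sum(long_window) / len(long_window)
    return [surroundings] + short_window
```

Reducing the long window to a single number is one way a preprocessing step could cut the input size: here seven surrounding values become one, so each feature contributes four inputs per residue instead of ten.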


2020 ◽  
Vol 48 (10) ◽  
pp. 5318-5331 ◽  
Author(s):  
Akshay Sridhar ◽  
Modesto Orozco ◽  
Rosana Collepardo-Guevara

Abstract Intrinsically disordered proteins are crucial elements of chromatin heterogenous organization. While disorder in the histone tails enables a large variation of inter-nucleosome arrangements, disorder within the chromatin-binding proteins facilitates promiscuous binding to a wide range of different molecular targets, consistent with structural heterogeneity. Among the partially disordered chromatin-binding proteins, the H1 linker histone influences a myriad of chromatin characteristics including compaction, nucleosome spacing, transcription regulation, and the recruitment of other chromatin regulating proteins. Although it is now established that the long C-terminal domain (CTD) of H1 remains disordered upon nucleosome binding and that such disorder favours chromatin fluidity, the structural behaviour and thereby the role/function of the N-terminal domain (NTD) within chromatin is yet unresolved. On the basis of microsecond-long parallel-tempering metadynamics and temperature-replica exchange atomistic molecular dynamics simulations of different H1 NTD subtypes, we demonstrate that the NTD is completely unstructured in solution but undergoes an important disorder-to-order transition upon nucleosome binding: it forms a helix that enhances its DNA binding ability. Further, we show that the helical propensity of the H1 NTD is subtype-dependent and correlates with the experimentally observed binding affinity of H1 subtypes, suggesting an important functional implication of this disorder-to-order transition.


2017 ◽  
Author(s):  
Sankar Basu ◽  
Parbati Biswas

Abstract Intrinsically Disordered Proteins (IDPs) are enriched in charged and polar residues; and, therefore, electrostatic interactions play a predominant role in their dynamics. In order to remain multi-functional and exhibit their characteristic binding promiscuity, they need to retain considerable dynamic flexibility. At the same time, they also need to accommodate a large number of oppositely charged residues, which eventually lead to the formation of salt-bridges, imparting local rigidity. The formation of salt-bridges therefore opposes the desired dynamic flexibility. Hence, there appears to be a meticulous trade-off between the two mechanisms, which the current study attempts to unravel. With this objective, we identify and analyze salt-bridges, both as isolated and as composite ionic bond motifs, in the molecular dynamics trajectories of a set of appropriately chosen IDPs. Time-evolved structural properties of these salt-bridges, such as persistence, associated secondary structural 'order-disorder' transitions, correlated atomic movements, and contribution to the overall electrostatic balance of the proteins, have been studied in detail. The results suggest that the key to maintaining such a trade-off over time is the continuous formation and dissolution of salt-bridges with a wide range of persistence. Also, the continuous dynamic interchange of charged-atom-pairs (coming from a variety of oppositely charged side-chains) in the transient ionic bonds supports a model of dynamic flexibility concomitant with the well characterized stochastic conformational switching in these proteins. The results and conclusions should facilitate the future design of salt-bridges as a means to further explore the disordered-globular interface in proteins.
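Persistence of a salt-bridge along a trajectory can be read as the fraction of frames in which the ionic pair is formed. The sketch below assumes that the distance between the charged groups has already been extracted for every frame and that a simple distance cutoff defines a formed salt-bridge. The 4.0 Å default is a commonly used criterion but is an assumption here, as is the function name, since the abstract does not state the definition that was used.

```python
def salt_bridge_persistence(distances, cutoff=4.0):
    """Fraction of frames in which the charged-group distance is within cutoff.

    distances: per-frame distances in angstroms between the oppositely
    charged groups of one candidate salt-bridge.
    cutoff: ASSUMED distance criterion in angstroms.
    """
    if not distances:
        raise ValueError("at least one frame is required")
    formed = sum(1 for d in distances if d <= cutoff)
    return formed / len(distances)
```

A value near 1 indicates a bridge that is present almost throughout the simulation, while low values correspond to the transient ionic bonds that form and dissolve repeatedly.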

