Critical Assessment of Protein Intrinsic Disorder Prediction

Abstract: Intrinsically disordered proteins, which defy the traditional protein structure-function paradigm, are challenging to study experimentally. As a large part of our knowledge rests on computational predictions, it is crucial for their accuracy to be high. The Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment was established as a community-based blind test to determine the state of the art in predicting intrinsically disordered regions in proteins and the subset of disordered residues involved in binding other molecules. A total of 43 methods, 32 for disorder and 11 for binding regions, were evaluated on a dataset of 646 novel manually curated proteins from DisProt. The best methods use deep learning techniques and significantly outperform widely used earlier physicochemical methods across different types of targets. Disordered binding regions remain hard to predict correctly. Depending on the definition used, the top disorder predictor has an FMax of 0.483 (DisProt) or 0.792 (DisProt-PDB). As the top binding predictor only attains an FMax of 0.231, this suggests significant potential for improvement. Intriguingly, computing times among the top performing methods vary by up to four orders of magnitude.

Critical assessment of protein intrinsic disorder prediction

Nature Methods ◽

10.1038/s41592-021-01117-3 ◽

2021 ◽

Cited By ~ 2

Author(s):

Marco Necci ◽

Damiano Piovesan ◽

Silvio C. E. Tosatto ◽

Keyword(s):

Intrinsically Disordered Proteins ◽

Intrinsic Disorder ◽

Critical Assessment ◽

Disordered Proteins ◽

Blind Test ◽

Disorder Prediction ◽

Intrinsically Disordered ◽

Intrinsically Disordered Regions ◽

Full Dataset ◽

Protein Intrinsic Disorder

Abstract: Intrinsically disordered proteins, defying the traditional protein structure–function paradigm, are a challenge to study experimentally. Because a large part of our knowledge rests on computational predictions, it is crucial that their accuracy is high. The Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment was established as a community-based blind test to determine the state of the art in prediction of intrinsically disordered regions and the subset of residues involved in binding. A total of 43 methods were evaluated on a dataset of 646 proteins from DisProt. The best methods use deep learning techniques and notably outperform physicochemical methods. The top disorder predictor has Fmax = 0.483 on the full dataset and Fmax = 0.792 following filtering out of bona fide structured regions. Disordered binding regions remain hard to predict, with Fmax = 0.231. Interestingly, computing times among methods can vary by up to four orders of magnitude.
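
The Fmax values quoted above can be made concrete with a minimal sketch. It assumes the usual definition of Fmax as the largest F1 score (harmonic mean of precision and recall) reached over all decision thresholds, with residues pooled into a single list. The abstract does not describe CAID's exact evaluation protocol, so the function name `f_max` and the pooling choice are illustrative assumptions, not the assessors' implementation.

```python
from typing import Sequence, Tuple


def f_max(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (best F1, threshold) over all thresholds drawn from the scores.

    scores: per-residue disorder scores (higher means more disordered).
    labels: per-residue reference annotation, 1 = disordered, 0 = ordered.
    Assumption: a residue is predicted disordered when score >= threshold.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    best_f1, best_threshold = 0.0, float("inf")
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
        if tp == 0:
            continue  # precision and recall are both zero; F1 cannot improve
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_f1, best_threshold = f1, threshold
    return best_f1, best_threshold
```

Because the threshold is chosen after the fact to maximize F1, Fmax is an optimistic summary of a predictor: it reports the best operating point rather than the one a user would get from a default cutoff.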

Assemblages: Functional units formed by cellular phase separation

The Journal of Cell Biology ◽

10.1083/jcb.201404124 ◽

2014 ◽

Vol 206 (5) ◽

pp. 579-588 ◽

Cited By ~ 175

Author(s):

Jeffrey A. Toretsky ◽

Peter E. Wright

Keyword(s):

Phase Separation ◽

Intrinsically Disordered Proteins ◽

Intrinsic Disorder ◽

Low Complexity ◽

Disordered Proteins ◽

Intracellular Space ◽

Intrinsically Disordered ◽

Intrinsically Disordered Regions ◽

Disease States ◽

Membrane Bound

The partitioning of intracellular space beyond membrane-bound organelles can be achieved with collections of proteins that are multivalent or contain low-complexity, intrinsically disordered regions. These proteins can undergo a physical phase change to form functional granules or other entities within the cytoplasm or nucleoplasm that collectively we term “assemblage.” Intrinsically disordered proteins (IDPs) play an important role in forming a subset of cellular assemblages by promoting phase separation. Recent work points to an involvement of assemblages in disease states, indicating that intrinsic disorder and phase transitions should be considered in the development of therapeutics.

Quantitative proteome-based guidelines for intrinsic disorder characterization

10.1101/032847 ◽

2015 ◽

Author(s):

Michael Vincent ◽

Mark Whidden ◽

Santiago Schnell

Keyword(s):

Intrinsically Disordered Proteins ◽

Objective Assessment ◽

Intrinsic Disorder ◽

Disordered Proteins ◽

Dimensional Structure ◽

Sequence Length ◽

Valuable Insight ◽

Disorder Prediction ◽

Intrinsically Disordered ◽

Prediction Algorithms

Abstract: Intrinsically disordered proteins fail to adopt a stable three-dimensional structure under physiological conditions. It is now understood that many disordered proteins are not dysfunctional, but instead engage in numerous cellular processes, including signaling and regulation. Disorder characterization from amino acid sequence relies on computational disorder prediction algorithms. While numerous large-scale investigations of disorder have been performed using these algorithms, and have offered valuable insight regarding the prevalence of protein disorder in many organisms, critical proteome-based descriptive statistical guidelines that would enable the objective assessment of intrinsic disorder in a protein of interest remain to be established. Here we present a quantitative characterization of numerous disorder features using a rigorous non-parametric statistical approach, providing expected values and percentile cutoffs for each feature in ten eukaryotic proteomes. Our estimates utilize multiple ab initio disorder prediction algorithms grounded on physicochemical principles. Furthermore, we present novel threshold values, specific to both the prediction algorithms and the proteomes, defining the longest primary sequence length in which the significance of a continuous disordered region can be evaluated on the basis of length alone. The guidelines presented here are intended to improve the interpretation of disorder content and continuous disorder predictions from the proteomic point of view.

flDPnn: Accurate intrinsic disorder prediction with putative propensities of disorder functions

Nature Communications ◽

10.1038/s41467-021-24773-7 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Gang Hu ◽

Akila Katuwawala ◽

Kui Wang ◽

Zhonghua Wu ◽

Sina Ghadermarzi ◽

...

Keyword(s):

Predictive Performance ◽

Intrinsic Disorder ◽

Critical Assessment ◽

Disordered Proteins ◽

Computational Tool ◽

Cellular Functions ◽

Disorder Prediction ◽

Protein Intrinsic Disorder ◽

Sequence Profiles ◽

Better Than

Abstract: Identification of intrinsic disorder in proteins relies in large part on computational predictors, which demands that their accuracy should be high. Since intrinsic disorder carries out a broad range of cellular functions, it is desirable to couple the disorder and disorder function predictions. We report a computational tool, flDPnn, that provides accurate, fast and comprehensive disorder and disorder function predictions from protein sequences. The recent Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment and results on other test datasets demonstrate that flDPnn offers accurate predictions of disorder, fully disordered proteins and four common disorder functions. These predictions are substantially better than the results of the existing disorder predictors and methods that predict functions of disorder. Ablation tests reveal that the high predictive performance stems from innovative ways used in flDPnn to derive sequence profiles and encode inputs. flDPnn’s webserver is available at http://biomine.cs.vcu.edu/servers/flDPnn/

A Suggestion of Converting Protein Intrinsic Disorder to Structural Entropy Using Shannon’s Information Theory

Entropy ◽

10.3390/e21060591 ◽

2019 ◽

Vol 21 (6) ◽

pp. 591

Author(s):

Hao-Bo Guo ◽

Yue Ma ◽

Gerald Tuskan ◽

Hong Qin ◽

Xiaohan Yang ◽

...

Keyword(s):

Information Theory ◽

Intrinsically Disordered Proteins ◽

Structural Information ◽

Intrinsic Disorder ◽

High Energy ◽

Disordered Proteins ◽

Sequence Length ◽

Intrinsically Disordered ◽

Structural Entropy ◽

Protein Intrinsic Disorder

We propose a framework to convert the protein intrinsic disorder content to structural entropy (H) using Shannon’s information theory (IT). The structural capacity (C), which is the sum of H and structural information (I), is equal to the amino acid sequence length of the protein. The structural entropy of the residues spans a continuous spectrum, ranging from 0 (fully ordered) to 1 (fully disordered), consistent with Shannon’s IT, which scores the fully determined state 0 and the fully uncertain state 1. The intrinsically disordered proteins (IDPs) in a living cell may participate in maintaining the high-energy, low-entropy state. In addition, under this framework, the biological functions performed by proteins and associated with the order or disorder of their 3D structures could be explained in terms of information gains or entropy losses, or the reverse processes.
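
The bookkeeping in this framework (C = H + I, with C equal to the sequence length) can be sketched as below. The sketch assumes each residue already carries a structural entropy value between 0 and 1 and that H is the sum of those values. The abstract does not say how the per-residue values are derived from disorder predictions, so the input and the function name `structural_budget` are hypothetical.

```python
from typing import Sequence, Tuple


def structural_budget(residue_entropy: Sequence[float]) -> Tuple[float, float, int]:
    """Return (H, I, C) for one protein.

    residue_entropy: assumed per-residue structural entropy,
                     0 = fully ordered, 1 = fully disordered.
    C is the sequence length, H the summed entropy, and I = C - H.
    """
    if any(h < 0.0 or h > 1.0 for h in residue_entropy):
        raise ValueError("per-residue entropy must lie in [0, 1]")
    capacity = len(residue_entropy)          # C: one unit per residue
    entropy = float(sum(residue_entropy))    # H: total structural entropy
    information = capacity - entropy         # I: what remains of the capacity
    return entropy, information, capacity
```

Under these assumptions a fully ordered protein has H = 0 and I = C, while a fully disordered one has H = C and I = 0.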

New classification of intrinsic disorder in the Human proteome

10.1101/446351 ◽

2018 ◽

Author(s):

Antonio Deiana ◽

Sergio Forcelloni ◽

Alessandro Porrello ◽

Andrea Giansanti

Keyword(s):

Intrinsically Disordered Proteins ◽

Chemical Properties ◽

Intrinsic Disorder ◽

Disordered Proteins ◽

General Category ◽

Nucleic Acid Binding ◽

Functional Protein ◽

Intrinsically Disordered ◽

Intrinsically Disordered Regions

Abstract: We propose a new, sequence-only classification of intrinsically disordered human proteins, based on two parameters: dr, the percentage of disordered residues, and Ld, the length of the longest disordered segment in the sequence. Depending on dr and Ld, we distinguish five variants: (i) ordered proteins (ORDs); (ii) not disordered proteins (NDPs); (iii) proteins with intrinsically disordered regions (PDRs); (iv) intrinsically disordered proteins (IDPs); and (v) proteins with fragmented disorder (FRAGs). PDRs have been considered in the general category of intrinsically disordered proteins for a long time. We show that PDRs are closer to globular, ordered proteins (ORDs and NDPs) than to disordered ones (IDPs), both in amino acid composition and functionally. Moreover, NDPs and PDRs are uniformly spread over several functional protein classes, whereas IDPs are concentrated in only two, namely nucleic acid binding proteins and transcription factors, which are just a subset of the functions that are commonly associated with protein intrinsic disorder. In conclusion, PDRs and IDPs should be considered, in future classifications, as distinct variants of disordered proteins, with different physical-chemical properties and functional spectra.
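
The two parameters behind this classification are simple to compute once a per-residue disorder call is available. The sketch below assumes a boolean mask (True = residue predicted disordered) as input; the function name `disorder_parameters` is illustrative. The cutoffs on dr and Ld that separate the five variants are not given in the abstract, so the sketch stops at the two parameters.

```python
from itertools import groupby
from typing import Sequence, Tuple


def disorder_parameters(mask: Sequence[bool]) -> Tuple[float, int]:
    """Return (dr, Ld) for one protein.

    mask: per-residue disorder calls, True = disordered.
    dr:   percentage of disordered residues.
    Ld:   length of the longest run of consecutive disordered residues.
    """
    if len(mask) == 0:
        raise ValueError("mask must contain at least one residue")
    dr = 100.0 * sum(1 for m in mask if m) / len(mask)
    # groupby splits the mask into runs of equal values; keep disordered runs only.
    ld = max((sum(1 for _ in run) for flag, run in groupby(mask) if flag), default=0)
    return dr, ld


# Usage with a toy mask written as a string, D = disordered, O = ordered.
toy_mask = [c == "D" for c in "DDOODDDDOO"]
print(disorder_parameters(toy_mask))
```

Two proteins with the same dr can differ sharply in Ld, which is presumably why the classification needs both: one long segment and many short fragments give the same percentage.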

Intrinsically disordered proteins in overcrowded milieu: Membrane-less organelles, phase separation, and intrinsic disorder

Current Opinion in Structural Biology ◽

10.1016/j.sbi.2016.10.015 ◽

2017 ◽

Vol 44 ◽

pp. 18-30 ◽

Cited By ~ 222

Author(s):

Vladimir N Uversky

Keyword(s):

Phase Separation ◽

Intrinsically Disordered Proteins ◽

Intrinsic Disorder ◽

Disordered Proteins ◽

Intrinsically Disordered

Intrinsically disordered proteins identified in the aggregate proteome serve as biomarkers of neurodegeneration

Metabolic Brain Disease ◽

10.1007/s11011-021-00791-8 ◽

2021 ◽

Author(s):

Srinivas Ayyadevara ◽

Akshatha Ganne ◽

Meenakshisundaram Balasubramaniam ◽

Robert J. Shmookler Reis

Keyword(s):

Intrinsically Disordered Proteins ◽

Protein Complexes ◽

Disordered Proteins ◽

Intrinsically Disordered ◽

Intrinsically Disordered Regions ◽

Physiological Processes ◽

Protein Partners ◽

And Function ◽

Computational Analyses ◽

Disordered Regions

Abstract: A protein’s structure is determined by its amino acid sequence and post-translational modifications, and provides the basis for its physiological functions. Across all organisms, roughly a third of the proteome comprises proteins that contain highly unstructured or intrinsically disordered regions. Proteins comprising or containing extensive unstructured regions are referred to as intrinsically disordered proteins (IDPs). IDPs are believed to participate in complex physiological processes through refolding of IDP regions, dependent on their binding to a diverse array of potential protein partners. They thus play critical roles in the assembly and function of protein complexes. Recent advances in experimental and computational analyses predicted multiple interacting partners for the disordered regions of proteins, implying critical roles in signal transduction and regulation of biological processes. Numerous disordered proteins are sequestered into aggregates in neurodegenerative diseases such as Alzheimer’s disease (AD) where they are enriched even in serum, making them good candidates for serum biomarkers to enable early detection of AD.

Random coil chemical shifts for serine, threonine and tyrosine phosphorylation over a broad pH range

Journal of Biomolecular NMR ◽

10.1007/s10858-019-00283-z ◽

2019 ◽

Vol 73 (12) ◽

pp. 713-725 ◽

Cited By ~ 4

Author(s):

Ruth Hendus-Altenburger ◽

Catarina B. Fernandes ◽

Katrine Bugge ◽

Micha B. A. Kunze ◽

Wouter Boomsma ◽

...

Keyword(s):

Chemical Shift ◽

Secondary Structure ◽

Intrinsically Disordered Proteins ◽

Chemical Shifts ◽

Random Coil ◽

Disordered Proteins ◽

Phosphoryl Group ◽

Intrinsically Disordered ◽

Intrinsically Disordered Regions ◽

Secondary Chemical

Abstract: Phosphorylation is one of the main regulators of cellular signaling, typically occurring in flexible parts of folded proteins and in intrinsically disordered regions. It can have distinct effects on the chemical environment as well as on the structural properties near the modification site. Secondary chemical shift analysis is the main NMR method for detection of transiently formed secondary structure in intrinsically disordered proteins (IDPs), and the reliability of the analysis depends on an appropriate choice of random coil model. Random coil chemical shifts and sequence correction factors were previously determined for an Ac-QQXQQ-NH2 peptide series with X being any of the 20 common amino acids. However, a matching dataset on the phosphorylated states has so far been determined only incompletely or only at a single pH value. Here we extend the database by the addition of the random coil chemical shifts of the phosphorylated states of serine, threonine and tyrosine measured over a range of pH values covering the pKas of the phosphates and at several temperatures (www.bio.ku.dk/sbinlab/randomcoil). The combined results allow for accurate random coil chemical shift determination of phosphorylated regions at any pH and temperature, minimizing systematic biases of the secondary chemical shifts. Comparison of chemical shifts using random coil sets with and without inclusion of the phosphoryl group revealed under- or overestimations of helicity of up to 33%. The expanded set of random coil values will improve the reliability in detection and quantification of transient secondary structure in phosphorylation-modified IDPs.
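
As a minimal illustration of why the random coil reference matters, a secondary chemical shift is the observed shift minus the random coil shift for the same nucleus and residue. The sketch below assumes shifts in ppm keyed by residue number. The function name is illustrative, and any numbers passed to it are the caller's own data; none are taken from the database mentioned above.

```python
from typing import Dict


def secondary_shifts(
    observed: Dict[int, float], random_coil: Dict[int, float]
) -> Dict[int, float]:
    """Return observed minus random coil shift (ppm) per residue number.

    Residues lacking a random coil reference are skipped rather than guessed.
    """
    return {
        residue: observed[residue] - random_coil[residue]
        for residue in observed
        if residue in random_coil
    }
```

Any systematic error in the reference carries over one-to-one into the secondary shift, so an unphosphorylated reference applied to a phosphorylated residue shifts the apparent secondary structure content, which is the bias the abstract quantifies.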

Optimization of Molecular Dynamics Simulations of c-MYC1-88—An Intrinsically Disordered System

Life ◽

10.3390/life10070109 ◽

2020 ◽

Vol 10 (7) ◽

pp. 109 ◽

Cited By ~ 2

Author(s):

Sandra S. Sullivan ◽

Robert O.J. Weinzierl

Keyword(s):

Molecular Dynamics ◽

Molecular Dynamics Simulations ◽

Intrinsically Disordered Proteins ◽

Disordered Proteins ◽

Intrinsically Disordered ◽

Intrinsically Disordered Regions ◽

Proliferation And Apoptosis ◽

Conventional Structure ◽

Experimental Approaches ◽

Dynamics Simulations

Many of the proteins involved in key cellular regulatory events contain extensive intrinsically disordered regions that are not readily amenable to conventional structure/function dissection. The oncoprotein c-MYC plays a key role in controlling cell proliferation and apoptosis and more than 70% of the primary sequence is disordered. Computational approaches that shed light on the range of secondary and tertiary structural conformations therefore provide the only realistic chance to study such proteins. Here, we describe the results of several tests of force fields and water models employed in molecular dynamics simulations for the N-terminal 88 amino acids of c-MYC. Comparisons of the simulation data with experimental secondary structure assignments obtained by NMR establish a particular implicit solvation approach as highly congruent. The results provide insights into the structural dynamics of c-MYC1-88, which will be useful for guiding future experimental approaches. The protocols for trajectory analysis described here will be applicable for the analysis of a variety of computational simulations of intrinsically disordered proteins.
