Identify Disordered Regions of Intrinsically Disordered Proteins by Multi-Features Fusion

2021 ◽  
Vol 16 ◽  
Author(s):  
Sun Can Zhuang ◽  
Feng Yonge

Background: Intrinsically disordered proteins lack a well-defined three-dimensional structure under physiological conditions. They perform multiple functions in life activities and are closely related to many human diseases. Identifying the disordered regions of intrinsically disordered proteins is important for protein function annotation. Objective: To accurately identify the disordered regions in intrinsically disordered proteins. Method: In this article, we constructed a multi-feature fusion model based on a support vector machine to predict disordered regions of intrinsically disordered proteins from the DisProt database. We extracted codon usage frequencies, GC content, protein secondary structure components, hydrophilic-hydrophobic amino acid components, and chemical shifts as features. Results: In single-feature prediction, the best accuracy was 82.098%, obtained with codon frequencies. To improve performance, we fused these features and obtained the best result, 83.173%, by combining codon frequencies with chemical shifts. Conclusion: The results show that our model predicts the disordered regions of intrinsically disordered proteins well. Moreover, the performance of our model is better than that of existing methods.
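The feature extraction and fusion step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names (`codon_frequencies`, `gc_content`, `fuse`) and the example sequence are hypothetical, and fusion is assumed here to mean simple concatenation of feature vectors before they are passed to a classifier such as a support vector machine.

```python
from itertools import product

BASES = "ACGT"
# All 64 codons in a fixed order, so every sequence maps to the same vector layout.
CODONS = ["".join(c) for c in product(BASES, repeat=3)]


def codon_frequencies(cds):
    """Relative frequency of each of the 64 codons in a coding sequence."""
    cds = cds.upper()
    counts = dict.fromkeys(CODONS, 0)
    total = 0
    # Read the sequence in frame, three bases at a time.
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if codon in counts:  # skip codons with ambiguous bases such as N
            counts[codon] += 1
            total += 1
    return [counts[c] / total if total else 0.0 for c in CODONS]


def gc_content(cds):
    """Fraction of unambiguous bases that are G or C."""
    valid = [b for b in cds.upper() if b in BASES]
    return sum(b in "GC" for b in valid) / len(valid) if valid else 0.0


def fuse(*feature_vectors):
    """Fuse features by concatenation (an assumption of this sketch)."""
    fused = []
    for vector in feature_vectors:
        fused.extend(vector)
    return fused


# Hypothetical coding sequence, used only to show the shape of the output.
example = "ATGGCCGGGTAA"
features = fuse(codon_frequencies(example), [gc_content(example)])
print(len(features))  # 64 codon frequencies plus 1 GC content value
```

Each region would yield one such fused vector, and the vectors for labelled ordered and disordered regions would then form the training matrix for the classifier.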

2020 ◽  
Vol 27 (4) ◽  
pp. 279-286 ◽  
Author(s):  
WeiXia Xie ◽  
Yong E. Feng

Background: Intrinsically disordered proteins lack a well-defined three-dimensional structure under physiological conditions while possessing essential biological functions. They take part in various physiological processes such as signal transduction, transcription, and post-translational modification. The disordered regions are the main functional sites of intrinsically disordered proteins, so research on these regions has become a hot topic. Objective: In this paper, we aim to analyze the features of disordered regions with different molecular functions and to predict the different disordered regions using valid features. Methods: We first divided the intrinsically disordered proteins in the DisProt database into six classes according to their molecular functions. Then, using bioinformatics methods, we extracted four features, namely Amino Acid Index (AAIndex), codon frequency (Codon), three kinds of protein secondary structure compositions (3PSS), and Chemical Shifts (CSs), and used these features to predict the disordered regions of the different functions with a Support Vector Machine (SVM). Results: The best overall accuracy was 99.29% using the chemical shift (CSs) as the feature. In feature fusion, the overall accuracy reached 88.70% using CSs+AAIndex as features and 86.09% using CSs+AAIndex+Codon+3PSS as features. Conclusion: We predicted and analyzed the disordered regions based on their molecular functions. The results showed that the prediction performance can be improved by adding chemical shifts and AAIndex as features, especially chemical shifts. Moreover, the chemical shift was the most effective feature in the prediction. We hope that our results will be constructive for the study of intrinsically disordered proteins.


F1000Research ◽  
2019 ◽  
Vol 8 ◽  
pp. 230
Author(s):  
Mauricio Oberti ◽  
Iosif Vaisman

Intrinsically disordered proteins or intrinsically disordered regions (IDRs) are segments within a protein chain lacking a stable three-dimensional structure under normal physiological conditions. Accurate prediction of IDRs is challenging due to their genome-wide occurrence and low ratio of disordered residues, making them a difficult target for traditional classification techniques. Existing computational methods mostly rely on sequence profiles to improve accuracy, which is time-consuming and computationally expensive. The shiny-pred application is an ab initio sequence-only disorder predictor implemented in R/Shiny. To make predictions, it uses convolutional neural network models trained on PDB sequence data. It can be installed and run locally on any operating system on which R can be installed. A public version of the web application can be accessed at https://gmu-binf.shinyapps.io/shiny-pred
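To make the idea of a sequence-only convolutional predictor concrete, the sketch below shows one common way to feed a protein sequence to a convolutional layer: one-hot encoding followed by a filter that slides along the chain. It is only an illustration in Python, not shiny-pred's R code or architecture; the helper names and the proline-detecting filter are hypothetical, and a trained network would learn its filter weights rather than use hand-set ones.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(sequence):
    """Encode each residue as a 20-dimensional indicator vector."""
    rows = []
    for aa in sequence.upper():
        row = [0.0] * len(AMINO_ACIDS)
        if aa in INDEX:  # unknown residues stay all-zero
            row[INDEX[aa]] = 1.0
        rows.append(row)
    return rows


def conv1d(encoded, kernel):
    """Slide one filter along the sequence (no padding, stride 1)."""
    width = len(kernel)
    out = []
    for start in range(len(encoded) - width + 1):
        total = 0.0
        for k in range(width):
            total += sum(x * w for x, w in zip(encoded[start + k], kernel[k]))
        out.append(total)
    return out


# Hypothetical 3-residue filter that counts prolines in each window.
kernel = [[1.0 if aa == "P" else 0.0 for aa in AMINO_ACIDS] for _ in range(3)]
scores = conv1d(one_hot("MPPSPA"), kernel)
print(scores)  # [2.0, 2.0, 2.0, 1.0]
```

The windows MPP, PPS, PSP and SPA contain 2, 2, 2 and 1 prolines, which is what the filter reports. A real model would stack many learned filters and end with a per-residue disorder score.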


2019 ◽  
Vol 116 (41) ◽  
pp. 20446-20452 ◽  
Author(s):  
Utsab R. Shrestha ◽  
Puneet Juneja ◽  
Qiu Zhang ◽  
Viswanathan Gurumoorthy ◽  
Jose M. Borreguero ◽  
...  

Intrinsically disordered proteins (IDPs) are abundant in eukaryotic proteomes, play a major role in cell signaling, and are associated with human diseases. To understand IDP function it is critical to determine their configurational ensemble, i.e., the collection of 3-dimensional structures they adopt, and this remains an immense challenge in structural biology. Attempts to determine this ensemble computationally have been hitherto hampered by the necessity of reweighting molecular dynamics (MD) results or biasing simulation in order to match ensemble-averaged experimental observables, operations that reduce the precision of the generated model because different structural ensembles may yield the same experimental observable. Here, by employing enhanced sampling MD we reproduce the experimental small-angle neutron and X-ray scattering profiles and the NMR chemical shifts of the disordered N terminal (SH4UD) of c-Src kinase without reweighting or constraining the simulations. The unbiased simulation results reveal a weakly funneled and rugged free energy landscape of SH4UD, which gives rise to a heterogeneous ensemble of structures that cannot be described by simple polymer theory. SH4UD adopts transient helices, which are found away from known phosphorylation sites and could play a key role in the stabilization of structural regions necessary for phosphorylation. Our findings indicate that adequately sampled molecular simulations can be performed to provide accurate physical models of flexible biosystems, thus rationalizing their biological function.


Author(s):  
Srinivas Ayyadevara ◽  
Akshatha Ganne ◽  
Meenakshisundaram Balasubramaniam ◽  
Robert J. Shmookler Reis

A protein’s structure is determined by its amino acid sequence and post-translational modifications, and provides the basis for its physiological functions. Across all organisms, roughly a third of the proteome comprises proteins that contain highly unstructured or intrinsically disordered regions. Proteins comprising or containing extensive unstructured regions are referred to as intrinsically disordered proteins (IDPs). IDPs are believed to participate in complex physiological processes through refolding of IDP regions, dependent on their binding to a diverse array of potential protein partners. They thus play critical roles in the assembly and function of protein complexes. Recent advances in experimental and computational analyses predicted multiple interacting partners for the disordered regions of proteins, implying critical roles in signal transduction and regulation of biological processes. Numerous disordered proteins are sequestered into aggregates in neurodegenerative diseases such as Alzheimer’s disease (AD) where they are enriched even in serum, making them good candidates for serum biomarkers to enable early detection of AD.


2021 ◽  
Vol 22 (19) ◽  
pp. 10677
Author(s):  
Huqiang Wang ◽  
Haolin Zhong ◽  
Chao Gao ◽  
Jiayin Zang ◽  
Dong Yang

Consecutive disordered regions (CDRs) are the basis for the formation of intrinsically disordered proteins, which contribute to various biological functions and to increasing organism complexity. Previous studies have revealed that CDRs may be present inside or outside protein domains, but a comprehensive analysis of the property differences between these two types of CDRs and the proteins containing them is lacking. In this study, we investigated this issue from three viewpoints. Firstly, we found that in-domain CDRs are more hydrophilic and stable but have less stickiness and fewer post-translational modification sites compared with out-domain CDRs. Secondly, at the protein level, we found that proteins with only in-domain CDRs originated late, evolved rapidly, and had weak functional constraints, compared with the other two types of CDR-containing proteins. Proteins with only in-domain CDRs tend to be expressed in a spatiotemporally specific manner, but they tend to have higher abundance and are more stable. Thirdly, we screened the CDR-containing protein domains that have a strong correlation with organism complexity. The CDR-containing domains tend to be evolutionarily young, or they changed from a domain without CDR to a CDR-containing domain during evolution. These results provide valuable new insights about the evolution and function of CDRs and protein domains.


2012 ◽  
Vol 40 (5) ◽  
pp. 955-962 ◽  
Author(s):  
Nathalie Sibille ◽  
Pau Bernadó

In recent years, IDPs (intrinsically disordered proteins) have emerged as pivotal actors in biology. Despite IDPs being present in all kingdoms of life, they are more abundant in eukaryotes where they are involved in the vast majority of regulation and signalling processes. The realization that, in some cases, functional states of proteins were partly or fully disordered was in contradiction to the traditional view where a well-defined three-dimensional structure was required for activity. Several lines of experimental evidence indicate, however, that structural features in IDPs such as transient secondary-structural elements and overall dimensions are crucial to their function. NMR has been the main tool to study IDP structure by probing conformational preferences at residue level. Additionally, SAXS (small-angle X-ray scattering) has the capacity to report on the three-dimensional space sampled by disordered states and therefore complements the local information provided by NMR. The present review describes how the synergy between NMR and SAXS can be exploited to obtain more detailed structural and dynamic models of IDPs in solution. These combined strategies, embedded into computational approaches, promise the elucidation of the structure–function properties of this important, but elusive, family of biomolecules.


2012 ◽  
Vol 40 (5) ◽  
pp. 995-999 ◽  
Author(s):  
Brigitte Gontero ◽  
Stephen C. Maberly

Many proteins contain disordered regions under physiological conditions and lack specific three-dimensional structure. These are referred to as IDPs (intrinsically disordered proteins). CP12 is a chloroplast protein of approximately 80 amino acids and has a molecular mass of approximately 8.2–8.5 kDa. It is enriched in charged amino acids and has a small number of hydrophobic residues. It has a high proportion of disorder-promoting residues, but has at least two (often four) cysteine residues forming one (or two) disulfide bridge(s) under oxidizing conditions that confers some order. However, CP12 behaves like an IDP. It appears to be universally distributed in oxygenic photosynthetic organisms and has recently been detected in a cyanophage. The best studied role of CP12 is its regulation of the Calvin cycle responsible for CO2 assimilation. Oxidized CP12 forms a supramolecular complex with two key Calvin cycle enzymes, GAPDH (glyceraldehyde-3-phosphate dehydrogenase) and PRK (phosphoribulokinase), down-regulating their activity. Association–dissociation of this complex, induced by the redox state of CP12, allows the Calvin cycle to be inactive in the dark and active in the light. CP12 is promiscuous and interacts with other enzymes such as aldolase and malate dehydrogenase. It also plays other roles in plant metabolism such as protecting GAPDH from inactivation and scavenging metal ions such as copper and nickel, and it is also linked to stress responses. Thus CP12 seems to be involved in many functions in photosynthetic cells and behaves like a jack of all trades as well as being a master of the Calvin cycle.


2019 ◽  
Vol 73 (12) ◽  
pp. 713-725 ◽  
Author(s):  
Ruth Hendus-Altenburger ◽  
Catarina B. Fernandes ◽  
Katrine Bugge ◽  
Micha B. A. Kunze ◽  
Wouter Boomsma ◽  
...  

Phosphorylation is one of the main regulators of cellular signaling, typically occurring in flexible parts of folded proteins and in intrinsically disordered regions. It can have distinct effects on the chemical environment as well as on the structural properties near the modification site. Secondary chemical shift analysis is the main NMR method for detection of transiently formed secondary structure in intrinsically disordered proteins (IDPs) and the reliability of the analysis depends on an appropriate choice of random coil model. Random coil chemical shifts and sequence correction factors were previously determined for an Ac-QQXQQ-NH2-peptide series with X being any of the 20 common amino acids. However, a matching dataset on the phosphorylated states has so far only been incompletely determined or determined only at a single pH value. Here we extend the database by the addition of the random coil chemical shifts of the phosphorylated states of serine, threonine and tyrosine measured over a range of pH values covering the pKas of the phosphates and at several temperatures (www.bio.ku.dk/sbinlab/randomcoil). The combined results allow for accurate random coil chemical shift determination of phosphorylated regions at any pH and temperature, minimizing systematic biases of the secondary chemical shifts. Comparison of chemical shifts using random coil sets with and without inclusion of the phosphoryl group revealed under/over estimations of helicity of up to 33%. The expanded set of random coil values will improve the reliability in detection and quantification of transient secondary structure in phosphorylation-modified IDPs.
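A secondary chemical shift is the observed shift minus the random coil reference for the same nucleus and residue, so the choice of reference directly changes the result. The sketch below illustrates that dependence. All numbers are hypothetical placeholders, not values from the authors' database, and the function names are introduced here for illustration only.

```python
def secondary_shifts(observed, random_coil):
    """Per-residue secondary shifts: observed minus random coil (ppm)."""
    if len(observed) != len(random_coil):
        raise ValueError("shift lists must have the same length")
    return [obs - rc for obs, rc in zip(observed, random_coil)]


def mean_shift(values):
    """Average secondary shift over a stretch of residues."""
    return sum(values) / len(values) if values else 0.0


# Hypothetical C-alpha shifts (ppm) for a five-residue stretch; not measured data.
observed_ca = [56.9, 58.4, 59.1, 58.8, 56.2]

# Two hypothetical reference sets for the same residues: one that ignores the
# phosphoryl group on the central residue and one that accounts for it.
rc_unmodified = [56.2, 58.3, 58.3, 58.3, 56.2]
rc_phospho = [56.2, 58.3, 58.9, 58.3, 56.2]

delta_unmodified = secondary_shifts(observed_ca, rc_unmodified)
delta_phospho = secondary_shifts(observed_ca, rc_phospho)

print(round(mean_shift(delta_unmodified), 3))
print(round(mean_shift(delta_phospho), 3))
```

In this made-up case the reference that ignores phosphorylation yields a larger average C-alpha secondary shift. Because sustained positive C-alpha secondary shifts are commonly read as helical propensity, that difference would translate into an overestimate of helicity.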


Biomolecules ◽  
2019 ◽  
Vol 9 (2) ◽  
pp. 77 ◽  
Author(s):  
Xingcheng Lin ◽  
Prakash Kulkarni ◽  
Federico Bocci ◽  
Nicholas Schafer ◽  
Susmita Roy ◽  
...  

Folded proteins show a high degree of structural order and undergo (fairly constrained) collective motions related to their functions. On the other hand, intrinsically disordered proteins (IDPs), while lacking a well-defined three-dimensional structure, do exhibit some structural and dynamical ordering, but are less constrained in their motions than folded proteins. The larger structural plasticity of IDPs emphasizes the importance of entropically driven motions. Many IDPs undergo function-related disorder-to-order transitions driven by their interaction with specific binding partners. As experimental techniques become more sensitive and become better integrated with computational simulations, we are beginning to see how the modest structural ordering and large amplitude collective motions of IDPs endow them with an ability to mediate multiple interactions with different partners in the cell. To illustrate these points, here, we use Prostate-associated gene 4 (PAGE4), an IDP implicated in prostate cancer (PCa), as an example. We first review our previous efforts using molecular dynamics simulations based on atomistic AWSEM to study the conformational dynamics of PAGE4 and how its motions change in its different physiologically relevant phosphorylated forms. Our simulations quantitatively reproduced experimental observations and revealed how structural and dynamical ordering are encoded in the sequence of PAGE4 and can be modulated by different extents of phosphorylation by the kinases HIPK1 and CLK2. This ordering is reflected in changing populations of certain secondary structural elements as well as in the regularity of its collective motions. These ordered features are directly correlated with the functional interactions of WT-PAGE4, HIPK1-PAGE4 and CLK2-PAGE4 with the AP-1 signaling axis. These interactions give rise to repeated transitions between (high HIPK1-PAGE4, low CLK2-PAGE4) and (low HIPK1-PAGE4, high CLK2-PAGE4) cell phenotypes, which possess differing sensitivities to the standard PCa therapies, such as androgen deprivation therapy (ADT). We argue that, although the structural plasticity of an IDP is important in promoting promiscuous interactions, the modulation of the structural ordering is important for sculpting its interactions so as to rewire with agility biomolecular interaction networks with significant functional consequences.

