Computational Prediction of Intrinsically Disordered Proteins Based on Protein Sequences and Convolutional Neural Networks

2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Hao He ◽  
Yong Yang

Intrinsically disordered proteins (IDPs) possess at least one region that lacks a single stable structure in vivo, which allows them to play important roles in a variety of biological functions. We propose a prediction method for IDPs based on convolutional neural networks (CNNs) and feature selection. A combination of sequence and evolutionary properties is used to describe the differences between disordered and ordered regions. In particular, to highlight the correlation between the target residue and adjacent residues, multiple windows are selected to preprocess the protein sequence through the selected properties. The shorter windows reflect the characteristics of the central residue, and the longer windows reflect the characteristics of its surroundings. Moreover, to highlight the specificity of the sequence and evolutionary properties, the two kinds of properties are preprocessed separately. The preprocessed properties are then combined into feature matrices that serve as the input of the constructed CNN. Our method is trained and tested on the DisProt database. The simulation results show that the proposed method can predict IDPs effectively, and that its performance is competitive with IsUnstruct and ESpritz.
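The multi-window preprocessing described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the properties or window lengths used, so the Kyte-Doolittle hydropathy scale, the window sizes (1, 5, 11), and the function names `window_average` and `multi_window_features` are all assumptions chosen for illustration. Each residue gets one value per window, so short windows track the central residue and long windows track its neighbourhood.

```python
# Kyte-Doolittle hydropathy values, used here only as an example
# per-residue property (an assumption; the abstract does not list the
# properties that were actually selected).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_average(values, window):
    """Average values over a centered window, truncated at the sequence ends."""
    half = window // 2
    averaged = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        averaged.append(sum(values[lo:hi]) / (hi - lo))
    return averaged


def multi_window_features(sequence, windows=(1, 5, 11)):
    """Return one row per residue and one column per window size.

    The window sizes are hypothetical. Residues outside the 20 standard
    amino acids raise KeyError in this sketch.
    """
    values = [HYDROPATHY[aa] for aa in sequence]
    columns = [window_average(values, w) for w in windows]
    return [list(row) for row in zip(*columns)]


# Hypothetical 10-residue sequence: the result has 10 rows and 3 columns.
features = multi_window_features("MKVLAAGIDE")
```

In a full pipeline, a matrix like `features` (extended with further properties) would be the kind of per-residue input a CNN could consume; how the authors arrange their feature matrices is not stated in the abstract.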

2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Hao He ◽  
Yatong Zhou ◽  
Yue Chi ◽  
Jingfei He

Background: Intrinsically disordered proteins possess flexible 3-D structures, which allows them to play important roles in a variety of biological functions. Molecular recognition features (MoRFs) are an important type of functional region; they are located within longer intrinsically disordered regions and undergo disorder-to-order transitions upon binding their interaction partners. Results: We develop a method, MoRFCNN, to predict MoRFs based on sequence properties and convolutional neural networks (CNNs). The sequence properties comprise structural and physicochemical properties, which are used to describe the differences between MoRFs and non-MoRFs. In particular, to highlight the correlation between the target residue and adjacent residues, three windows are selected to preprocess the selected properties. The calculated properties are then combined into a feature matrix, from which the constructed CNN predicts MoRFs. Compared with other existing methods, MoRFCNN obtains better performance. Conclusions: MoRFCNN is a new individual MoRF prediction method that uses only protein sequence properties, without evolutionary information. The simulation results show that MoRFCNN is effective and competitive.


eLife ◽  
2016 ◽  
Vol 5 ◽  
Author(s):  
Andrei Vovk ◽  
Chad Gu ◽  
Michael G Opferman ◽  
Larisa E Kapinos ◽  
Roderick YH Lim ◽  
...  

Nuclear Pore Complexes (NPCs) are key cellular transporters that control nucleocytoplasmic transport in eukaryotic cells, but their transport mechanism is still not understood. The centerpiece of NPC transport is the assembly of intrinsically disordered polypeptides, known as FG nucleoporins, lining its passageway. Their conformations and collective dynamics during transport are difficult to assess in vivo. In vitro investigations provide partially conflicting results, lending support to different models of transport, which invoke various conformational transitions of the FG nucleoporins induced by the cargo-carrying transport proteins. We show that the spatial organization of FG nucleoporin assemblies with the transport proteins can be understood within a first principles biophysical model with a minimal number of key physical variables, such as the average protein interaction strengths and spatial densities. These results address some of the outstanding controversies and suggest how molecularly divergent NPCs in different species can perform essentially the same function.


2021 ◽  
Vol 22 (19) ◽  
pp. 10677
Author(s):  
Huqiang Wang ◽  
Haolin Zhong ◽  
Chao Gao ◽  
Jiayin Zang ◽  
Dong Yang

Consecutive disordered regions (CDRs) are the basis for the formation of intrinsically disordered proteins, which contribute to various biological functions and to increasing organism complexity. Previous studies have revealed that CDRs may be present inside or outside protein domains, but a comprehensive analysis of the property differences between these two types of CDRs and the proteins containing them is lacking. In this study, we investigated this issue from three viewpoints. Firstly, we found that in-domain CDRs are more hydrophilic and stable but have less stickiness and fewer post-translational modification sites compared with out-domain CDRs. Secondly, at the protein level, we found that proteins with only in-domain CDRs originated late, evolved rapidly, and had weak functional constraints, compared with the other two types of CDR-containing proteins. Proteins with only in-domain CDRs tend to be expressed in a spatiotemporally specific manner, but they tend to have higher abundance and are more stable. Thirdly, we screened the CDR-containing protein domains that have a strong correlation with organism complexity. The CDR-containing domains tend to be evolutionarily young, or they changed from a domain without CDR to a CDR-containing domain during evolution. These results provide valuable new insights about the evolution and function of CDRs and protein domains.


Author(s):  
Evert Njomen ◽  
Theresa A. Lansdell ◽  
Allison Vanecek ◽  
Vanessa Benham ◽  
Matt P. Bernard ◽  
...  

SUMMARY: Enhancing proteasome activity is a potential new therapeutic strategy to prevent the accumulation of aberrantly high levels of protein that drive the pathogenesis of many diseases. Herein, we examine the use of small molecules to activate the 20S proteasome to reduce aberrant signaling by the undruggable oncoprotein c-MYC, to treat c-MYC driven oncogenesis. Overexpression of c-MYC is found in more than 50% of all human cancers, but c-MYC remains undruggable because of its highly dynamic intrinsically disordered 3-D conformation, which renders traditional therapeutic strategies largely ineffective. We demonstrate herein that small molecule activation of the 20S proteasome targets dysregulated intrinsically disordered proteins (IDPs), including c-MYC, and reduces cancer growth in in vitro and in vivo models of multiple myeloma, and is even effective in bortezomib resistant cells and unresponsive patient samples. Genomic analysis of various cancer pathways showed that proteasome activation results in downregulation of many c-MYC target genes. Moreover, proteasome enhancement was well tolerated in mice and dogs. These data support the therapeutic potential of 20S proteasome activation in targeting IDP-driven proteotoxic disorders, including cancer, and demonstrate that this new therapeutic strategy is well tolerated in vivo.


2018 ◽  
Vol 9 (15) ◽  
pp. 3710-3715 ◽  
Author(s):  
Erica T. Prates ◽  
Xiaoyang Guan ◽  
Yaohao Li ◽  
Xinfeng Wang ◽  
Patrick K. Chaffey ◽  
...  

Protein glycosylation is a diverse post-translational modification that serves myriad biological functions.


2020 ◽  
Vol 117 (21) ◽  
pp. 11421-11431 ◽  
Author(s):  
Benjamin S. Schuster ◽  
Gregory L. Dignon ◽  
Wai Shing Tang ◽  
Fleurie M. Kelley ◽  
Aishwarya Kanchi Ranganath ◽  
...  

Phase separation of intrinsically disordered proteins (IDPs) commonly underlies the formation of membraneless organelles, which compartmentalize molecules intracellularly in the absence of a lipid membrane. Identifying the protein sequence features responsible for IDP phase separation is critical for understanding physiological roles and pathological consequences of biomolecular condensation, as well as for harnessing phase separation for applications in bioinspired materials design. To expand our knowledge of sequence determinants of IDP phase separation, we characterized variants of the intrinsically disordered RGG domain from LAF-1, a model protein involved in phase separation and a key component of P granules. Based on a predictive coarse-grained IDP model, we identified a region of the RGG domain that has high contact probability and is highly conserved between species; deletion of this region significantly disrupts phase separation in vitro and in vivo. We determined the effects of charge patterning on phase behavior through sequence shuffling. We designed sequences with significantly increased phase separation propensity by shuffling the wild-type sequence, which contains well-mixed charged residues, to increase charge segregation. This result indicates the natural sequence is under negative selection to moderate this mode of interaction. We measured the contributions of tyrosine and arginine residues to phase separation experimentally through mutagenesis studies and computationally through direct interrogation of different modes of interaction using all-atom simulations. Finally, we show that despite these sequence perturbations, the RGG-derived condensates remain liquid-like. Together, these studies advance our fundamental understanding of key biophysical principles and sequence features important to phase separation.
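The shuffling experiment described in this abstract can be illustrated with a small sketch. It is not the authors' procedure: the abstract does not say how charge segregation was quantified, so the code below swaps in a sequence charge decoration (SCD)-style score, where more negative values indicate stronger segregation of opposite charges. The charge table (which ignores histidine), the random-search strategy, and the names `scd` and `most_segregated_shuffle` are assumptions made for illustration.

```python
import random

# Simplified side-chain charges at neutral pH (an assumption; histidine
# and terminal charges are ignored in this sketch).
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def scd(sequence):
    """SCD-style score: (1/N) * sum over pairs i > j of q_i * q_j * sqrt(i - j).

    Values near zero indicate well-mixed charges; more negative values
    indicate more segregated opposite charges.
    """
    charges = [CHARGE.get(aa, 0) for aa in sequence]
    n = len(charges)
    if n == 0:
        return 0.0
    total = 0.0
    for i in range(1, n):
        if charges[i] == 0:
            continue  # neutral residues contribute nothing
        for j in range(i):
            total += charges[i] * charges[j] * (i - j) ** 0.5
    return total / n


def most_segregated_shuffle(sequence, trials=200, seed=0):
    """Randomly shuffle the sequence and keep the most charge-segregated variant.

    Composition is preserved; only residue order changes. The original
    sequence is the starting candidate, so the returned score is never
    higher than the score of the input.
    """
    rng = random.Random(seed)
    best, best_score = sequence, scd(sequence)
    residues = list(sequence)
    for _ in range(trials):
        rng.shuffle(residues)
        candidate = "".join(residues)
        score = scd(candidate)
        if score < best_score:
            best, best_score = candidate, score
    return best, best_score


# Hypothetical well-mixed toy sequence, not the LAF-1 RGG domain.
variant, variant_score = most_segregated_shuffle("EKEKEKEKGG")
```

A random search like this only hints at the idea; designing real variants would need a validated patterning metric and experimental testing, as the abstract describes.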


Cells ◽  
2020 ◽  
Vol 9 (8) ◽  
pp. 1856
Author(s):  
Nikoletta Murvai ◽  
Lajos Kalmar ◽  
Bianka Szalaine Agoston ◽  
Beata Szabo ◽  
Agnes Tantos ◽  
...  

The functional mechanisms of intrinsically disordered proteins (IDPs) in living cells are not frequently investigated in detail. Here, we dissect the molecular mechanism of action of an IDP in cells by detailed structural analyses based on an in-cell nuclear magnetic resonance experiment. We show that the ID stress protein (IDSP) A. thaliana Early Response to Dehydration (ERD14) is capable of protecting E. coli cells under heat stress. The overexpression of ERD14 increases the viability of E. coli cells from 38.9% to 73.9% following heat stress (50 °C × 15 min). We also provide evidence that the protection is mainly achieved by protecting the proteome of the cells. In-cell NMR experiments performed in E. coli cells show that the protective activity is associated with a largely disordered structural state with conserved, short sequence motifs (K- and H-segments), which transiently sample helical conformations in vitro and engage in partner binding in vivo. Other regions of the protein, such as its S segment and its regions linking and flanking the binding motifs, remain unbound and disordered in the cell. Our data suggest that the cellular function of ERD14 is compatible with its residual structural disorder in vivo.


eLife ◽  
2017 ◽  
Vol 6 ◽  
Author(s):  
Jing Li ◽  
Jordan T White ◽  
Harry Saavedra ◽  
James O Wrabl ◽  
Hesam N Motlagh ◽  
...  

Intrinsically disordered proteins (IDPs) present a functional paradox because they lack stable tertiary structure, but nonetheless play a central role in signaling, utilizing a process known as allostery. Historically, allostery in structured proteins has been interpreted in terms of propagated structural changes that are induced by effector binding. Thus, it is not clear how IDPs, lacking such well-defined structures, can allosterically affect function. Here, we show a mechanism by which an IDP can allosterically control function by simultaneously tuning transcriptional activation and repression, using a novel strategy that relies on the principle of ‘energetic frustration’. We demonstrate that human glucocorticoid receptor tunes this signaling in vivo by producing translational isoforms differing only in the length of the disordered region, which modulates the degree of frustration. We expect this frustration-based model of allostery will prove to be generally important in explaining signaling in other IDPs.


2012 ◽  
Vol 20 (04) ◽  
pp. 471-511 ◽  
Author(s):  
MARK HOWELL ◽  
RYAN GREEN ◽  
ALEXIS KILLEEN ◽  
LAMAR WEDDERBURN ◽  
VINCENT PICASCIO ◽  
...  

Intrinsically disordered proteins or proteins with disordered regions are very common in nature. These proteins have numerous biological functions which are complementary to the biological activities of traditional ordered proteins. A noticeable difference in the amino acid sequences encoding long and short disordered regions was found and this difference was used in the development of length-dependent predictors of intrinsic disorder. In this study, we analyze the scaling of intrinsic disorder in eukaryotic proteins and investigate the presence of length-dependent functions attributed to proteins containing long disordered regions.

