Fold and flexibility: what can proteins' mechanical properties tell us about their folding nucleus?

2015 ◽  
Vol 12 (112) ◽  
pp. 20150876 ◽  
Author(s):  
Sophie Sacquin-Mora

The determination of a protein's folding nucleus, i.e. a set of native contacts playing an important role during its folding process, remains an elusive yet essential problem in biochemistry. In this work, we investigate the mechanical properties of 70 protein structures belonging to 14 protein families presenting various folds using coarse-grain Brownian dynamics simulations. The resulting rigidity profiles combined with multiple sequence alignments show that a limited set of rigid residues, which we call the consensus nucleus, occupy conserved positions along the protein sequence. These residues' side chains form a tight interaction network within the protein's core, thus making our consensus nuclei potential folding nuclei. A review of the experimental and theoretical literature shows that most (above 80%) of these residues were indeed identified as folding nucleus members in earlier studies.
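The idea of combining rigidity profiles with a multiple sequence alignment can be illustrated with a minimal sketch. It is not the author's procedure: the function name `consensus_columns`, the rigidity threshold, the 80% agreement cutoff and the toy profile values (arbitrary units) are all assumptions made for illustration. Each profile is assumed to be already mapped onto alignment columns, with `None` marking a gap.

```python
def consensus_columns(profiles, threshold, min_fraction=0.8):
    """Return alignment columns where at least `min_fraction` of the aligned
    (non-gap) residues have a rigidity value >= `threshold`.

    Hypothetical helper: the threshold and fraction are illustrative choices,
    not values taken from the abstract.
    """
    n_cols = len(profiles[0])
    selected = []
    for col in range(n_cols):
        # Collect rigidity values of family members that have a residue here.
        values = [p[col] for p in profiles if p[col] is not None]
        if not values:
            continue  # column contains only gaps
        rigid = sum(1 for v in values if v >= threshold)
        if rigid / len(values) >= min_fraction:
            selected.append(col)
    return selected


# Toy rigidity profiles for three aligned family members (made-up numbers).
profiles = [
    [10.0, 80.0, None, 95.0],
    [12.0, 75.0, 20.0, 90.0],
    [9.0, 30.0, 25.0, 88.0],
]
columns = consensus_columns(profiles, threshold=70.0, min_fraction=0.8)
print(columns)
```

In this toy input only the last column is rigid in every member, so it is the only one selected; the second column is rigid in two of three members and falls below the assumed cutoff.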

2021 ◽  
Author(s):  
Jaspreet Singh ◽  
Kuldip Paliwal ◽  
Jaswinder Singh ◽  
Yaoqi Zhou

Protein language models have emerged as an alternative to multiple sequence alignment for enriching sequence information and improving downstream prediction tasks such as biophysical, structural, and functional properties. Here we show that a combination of traditional one-hot encoding with the embeddings from two different language models (ProtTrans and ESM-1b) allows a leap in accuracy over single-sequence-based techniques in predicting protein 1D secondary and tertiary structural properties, including backbone torsion angles, solvent accessibility and contact numbers. This large improvement leads to an accuracy comparable to or better than the current state-of-the-art techniques for predicting these 1D structural properties based on sequence profiles generated from multiple sequence alignments. The high-accuracy prediction of both secondary and tertiary structural properties indicates that it is possible to make highly accurate predictions of protein structures without homologous sequences, the remaining obstacle in the post-AlphaFold2 era.
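The feature combination described above can be sketched as a per-residue concatenation. This is only an illustration under stated assumptions: the helper `combine_features` is hypothetical, the embeddings are tiny toy stand-ins rather than real ProtTrans or ESM-1b outputs, and the abstract does not say how the authors actually merge the inputs.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(residue):
    """Return a 20-dimensional one-hot vector; unknown residues map to all zeros."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def combine_features(sequence, *embeddings):
    """Concatenate the one-hot encoding with each per-residue embedding.

    Hypothetical helper: each embedding is a list with one row (list of floats)
    per residue, as a language model would produce for the same sequence.
    """
    for emb in embeddings:
        if len(emb) != len(sequence):
            raise ValueError("each embedding needs one row per residue")
    features = []
    for i, residue in enumerate(sequence):
        row = one_hot(residue)
        for emb in embeddings:
            row.extend(emb[i])
        features.append(row)
    return features


# Toy stand-ins for language-model outputs (dimensions chosen for illustration).
sequence = "MKV"
emb_a = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
emb_b = [[1.0], [2.0], [3.0]]
features = combine_features(sequence, emb_a, emb_b)
print(len(features), len(features[0]))
```

Each residue ends up with one row of length 20 plus the widths of the two embeddings, which a downstream predictor could consume in place of an alignment-derived profile.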


Soft Matter ◽  
2015 ◽  
Vol 11 (40) ◽  
pp. 7877-7887 ◽  
Author(s):  
Navid Kazem ◽  
Carmel Majidi ◽  
Craig E. Maloney

We perform Brownian dynamics simulations to study the gelation of suspensions of attractive, rod-like particles. We show that if the attraction is sufficiently corrugated or patchy, over time, a rigid space-spanning network will form. Surprisingly, the structural and mechanical properties are non-monotonic in the fraction of the surface.


2021 ◽  
Author(s):  
Allan Costa ◽  
Manvitha Ponnapati ◽  
Joseph M Jacobson ◽  
Pranam Chatterjee

Determining the structure of proteins has been a long-standing goal in biology. Language models have been recently deployed to capture the evolutionary semantics of protein sequences. Enriched with multiple sequence alignments (MSA), these models can encode protein tertiary structure. In this work, we introduce an attention-based graph architecture that exploits MSA Transformer embeddings to directly produce three-dimensional folded structures from protein sequences. We envision that this pipeline will provide a basis for efficient, end-to-end protein structure prediction.


2019 ◽  
Vol 17 (02) ◽  
pp. 1950006 ◽  
Author(s):  
Ashish Runthala ◽  
Shibasish Chowdhury

In contrast to ab-initio protein modeling methodologies, comparative modeling is considered the most popular and reliable approach to modeling protein structure. However, the selection of the best set of templates is still a major challenge. An effective template-ranking algorithm is developed to efficiently select only the reliable hits for predicting protein structures. The algorithm employs pairwise as well as multiple sequence alignments of template hits to rank and select the best possible set of templates. It captures key sequence and structural information from the template hits and converts it into scores to rank them effectively. This selected set of templates is used to model a target. The modeling accuracy of the algorithm is tested and evaluated on TBM-HA domain-containing CASP8, CASP9 and CASP10 targets. On average, this template ranking and selection algorithm improves GDT-TS, GDT-HA and TM_Score by 3.531, 4.814 and 0.022, respectively. Further, it has been shown that the inclusion of structurally similar templates with ample conformational diversity is crucial for the modeling algorithm to maximally as well as reliably span the target sequence and construct its near-native model. Optimal model sampling also holds the key to predicting the best possible target structure.
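Converting template features into scores and ranking on them can be sketched as follows. The abstract does not specify which features or weights the algorithm uses, so everything here is assumed for illustration: the `TemplateHit` fields, the equal weights, the template names and the toy values.

```python
from dataclasses import dataclass


@dataclass
class TemplateHit:
    name: str
    identity: float  # fractional sequence identity to the target, 0..1
    coverage: float  # fraction of target residues aligned, 0..1


def score(hit, w_identity=0.5, w_coverage=0.5):
    """Weighted sum of features; the weights are illustrative assumptions."""
    return w_identity * hit.identity + w_coverage * hit.coverage


def select_templates(hits, top_n=2):
    """Rank hits by score (highest first) and keep the best `top_n`."""
    return sorted(hits, key=score, reverse=True)[:top_n]


# Toy hits with made-up names and values, for illustration only.
hits = [
    TemplateHit("tmplA", identity=0.40, coverage=0.90),
    TemplateHit("tmplB", identity=0.55, coverage=0.60),
    TemplateHit("tmplC", identity=0.30, coverage=0.95),
]
selected = select_templates(hits)
print([h.name for h in selected])
```

With these toy numbers the highest-identity hit is not selected because its coverage is low, which mirrors the abstract's point that a useful template set has to span the target sequence as well as resemble it.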


2006 ◽  
Vol 46 (1) ◽  
pp. 95-109 ◽  
Author(s):  
Julian Oberdisse ◽  
Giovanni Ianniruberto ◽  
Francesco Greco ◽  
Giuseppe Marrucci

2008 ◽  
Vol 2 ◽  
pp. BBI.S426 ◽  
Author(s):  
Guillaume Fourty ◽  
Isabelle Callebaut ◽  
Jean-Paul Mornon

Prediction of key features of protein structures, such as secondary structure, solvent accessibility and number of contacts between residues, provides useful structural constraints for comparative modeling, fold recognition, ab-initio fold prediction and detection of remote relationships. In this study, we aim at characterizing the number of non-trivial close neighbors, or long-range contacts of a residue, as a function of its “topohydrophobic” index deduced from multiple sequence alignments and of the secondary structure in which it is embedded. The “topohydrophobic” index is calculated using a two-class distribution of amino acids, based on their mean atom depths. From a large set of structural alignments processed from the FSSP database, we selected 1485 structural sub-families including at least 8 members, with accurate alignments and limited redundancy. We show that residues within helices, even when deeply buried, have few non-trivial neighbors (0–2), whereas β-strand residues clearly exhibit a multimodal behavior, dominated by the local geometry of the tetrahedron (3 non-trivial close neighbors associated with one tetrahedron; 6 with two tetrahedra). This observed behavior allows the distinction, from sequence profiles, between edge and central β-strands within β-sheets. Useful topological constraints on the immediate neighborhood of an amino acid, but also on its correlated solvent accessibility, can thus be derived using this approach, from the simple knowledge of multiple sequence alignments.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Elena N. Judd ◽  
Alison R. Gilchrist ◽  
Nicholas R. Meyerson ◽  
Sara L. Sawyer

Background: The Type I interferon response is an important first-line defense against viruses. In turn, viruses antagonize (i.e., degrade, mis-localize, etc.) many proteins in interferon pathways. Thus, hosts and viruses are locked in an evolutionary arms race for dominance of the Type I interferon pathway. As a result, many genes in interferon pathways have experienced positive natural selection in favor of new allelic forms that can better recognize viruses or escape viral antagonists. Here, we performed a holistic analysis of selective pressures acting on genes in the Type I interferon family. We initially hypothesized that the genes responsible for inducing the production of interferon would be antagonized more heavily by viruses than genes that are turned on as a result of interferon. Our logic was that viruses would have greater effect if they worked upstream of the production of interferon molecules because, once interferon is produced, hundreds of interferon-stimulated proteins would activate and the virus would need to counteract them one-by-one. Results: We curated multiple sequence alignments of primate orthologs for 131 genes active in interferon production and signaling (herein, “induction” genes), 100 interferon-stimulated genes, and 100 randomly chosen genes. We analyzed each multiple sequence alignment for the signatures of recurrent positive selection. Counter to our hypothesis, we found the interferon-stimulated genes, and not interferon induction genes, are evolving significantly more rapidly than a random set of genes. Interferon induction genes evolve in a way that is indistinguishable from a matched set of random genes (22% and 18% of genes bear signatures of positive selection, respectively). In contrast, interferon-stimulated genes evolve differently, with 33% of genes evolving under positive selection and containing a significantly higher fraction of codons that have experienced selection for recurrent replacement of the encoded amino acid. Conclusion: Viruses may antagonize individual products of the interferon response more often than trying to neutralize the system altogether.

