MotSASi: Functional Short Linear Motifs (SLiMs) prediction based on genomic single nucleotide variants and structural data

2021 ◽  
Author(s):  
Mariano Martin ◽  
Carlos Pablo Modenutti ◽  
Juan Pablo Nicola ◽  
Marcelo Adrian Marti

Short linear motifs (SLiMs) are key to cell physiology, mediating reversible protein-protein interactions. Precise identification of SLiMs remains a challenge, the main drawback of most bioinformatic prediction tools being their low specificity (high number of false positives). An important, usually overlooked, aspect is the relation between SLiM mutations and disease. The presence of variants at each residue position can be used to assess the relevance of the corresponding residue(s) for protein function, and its (in)tolerance to change. In the present work, we combined sequence variant information and structural analysis of the energetic impact of single amino acid substitutions (SAS) in SLiM-receptor complex structures, and showed that this significantly improves prediction of true functional SLiMs. Our strategy is based on building a SAS tolerance matrix that shows, for each position, whether each of the 19 possible SAS is tolerated or not. Herein we present the MotSASi strategy and analyze in detail 4 SLiMs involved in intracellular protein trafficking. Our results show that inclusion of variant and sequence information significantly improves both prediction of true SLiMs and rejection of false positives, while also allowing better classification of variants inside SLiMs, a result with a direct impact on clinical genomics.
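The SAS tolerance matrix described above can be pictured as a per-position lookup of the 19 possible substitutions. The following is a minimal sketch, not the MotSASi implementation: the function names, the energy dictionary `ddg`, the 1.0 threshold and the rule that missing energies count as not tolerated are all illustrative assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_tolerance_matrix(motif, ddg, threshold=1.0):
    """Return {position: {substitute: tolerated}} for the 19 SAS per position.

    motif     -- reference motif sequence (one-letter codes)
    ddg       -- hypothetical dict mapping (position, substitute) to an
                 energetic impact; lower means less disruptive
    threshold -- assumed cutoff below which a substitution counts as tolerated
    """
    matrix = {}
    for pos, wild_type in enumerate(motif):
        row = {}
        for aa in AMINO_ACIDS:
            if aa == wild_type:
                continue  # only the 19 substitutions, not the wild type
            value = ddg.get((pos, aa))
            # Assumption: substitutions without an energy are not tolerated.
            row[aa] = value is not None and value < threshold
        matrix[pos] = row
    return matrix


def is_compatible(candidate, motif, matrix):
    """True if every difference from the reference motif is a tolerated SAS."""
    if len(candidate) != len(motif):
        return False
    for pos, (cand_aa, ref_aa) in enumerate(zip(candidate, motif)):
        if cand_aa != ref_aa and not matrix[pos].get(cand_aa, False):
            return False
    return True
```

With a made-up four-residue motif and three made-up energies, a candidate is accepted only when each of its differences from the reference is marked as tolerated at that position.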

2018 ◽  
Author(s):  
Leandro Radusky ◽  
Carlos Modenutti ◽  
Javier Delgado ◽  
Juan P. Bustamante ◽  
Sebastian Vishnopolska ◽  
...  

Abstract: Understanding the functional effect of Single Amino acid Substitutions (SAS), derived from the occurrence of single nucleotide variants (SNVs), and their relation to disease development is a major issue in clinical genomics. Even though there are several bioinformatic algorithms and servers that predict whether a SAS can be pathogenic or not, they give little or no information on the actual effect on protein function. Moreover, many of these algorithms predict an effect that does not necessarily translate directly into pathogenicity. VarQ Web Server is an online tool that, given a UniProt ID, automatically analyzes known and user-provided SAS for their effect on protein activity, folding, aggregation and protein interactions, among others. VarQ assessment was performed over a set of previously manually curated variants, showing its ability to correctly predict the phenotypic outcome and its underlying cause. This resource is available online at http://varq.qb.fcen.uba.ar/. Contact: [email protected]. Information and tutorials may be found on the webpage of the tool.


2018 ◽  
Vol 87 (1) ◽  
pp. 921-964 ◽  
Author(s):  
David L. Brautigan ◽  
Shirish Shenolikar

Protein serine/threonine phosphatases (PPPs) are ancient enzymes, with distinct types conserved across eukaryotic evolution. PPPs are segregated into types primarily on the basis of the unique interactions of PPP catalytic subunits with regulatory proteins. The resulting holoenzymes dock substrates distal to the active site to enhance specificity. This review focuses on the subunit and substrate interactions for PPP that depend on short linear motifs. Insights about these motifs from structures of holoenzymes open new opportunities for computational biology approaches to elucidate PPP networks. There is an expanding knowledge base of posttranslational modifications of PPP catalytic and regulatory subunits, as well as of their substrates, including phosphorylation, acetylation, and ubiquitination. Cross talk between these posttranslational modifications creates PPP-based signaling. Knowledge of PPP complexes, signaling clusters, as well as how PPPs communicate with each other in response to cellular signals should unlock the doors to PPP networks and signaling “clouds” that orchestrate and coordinate different aspects of cell physiology.


Toxins ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 290
Author(s):  
Caterina Peggion ◽  
Fiorella Tonello

Snake venom phospholipases A2 (PLA2s) have sequences and structures very similar to those of mammalian group I and II secretory PLA2s, but they possess many toxic properties, ranging from the inhibition of coagulation to the blockage of nerve transmission and the induction of muscle necrosis. The biological properties of these proteins are due not only to their enzymatic activity, but also to protein–protein interactions which are still unidentified. Here, we compare sequence alignments of snake venom and mammalian PLA2s, grouped according to their structure and biological activity, looking for differences that can explain their different behavior. This bioinformatics analysis revealed three distinct regions, two central and one C-terminal, having amino acid compositions that distinguish the different categories of PLA2s. In these regions, we identified short linear motifs (SLiMs), peptide modules involved in protein–protein interactions, conserved in mammalian and not in snake venom PLA2s, or vice versa. The different SLiM content of snake venom PLA2s with respect to mammalian PLA2s may result in the formation of protein membrane complexes having a toxic activity, or in the formation of complexes whose activity cannot be blocked due to the lack of switches in the toxic PLA2s, such as the motif recognized by the prolyl isomerase Pin1.


2015 ◽  
Vol 24 (4) ◽  
pp. 197-205
Author(s):  
Dwi Wulandari ◽  
Lisnawati Rachmadi ◽  
Tjahjani M. Sudiro

Background: E6 and E7 are oncoproteins of HPV16. Natural amino acid variation in HPV16 E6 can alter its carcinogenic potential. The aim of this study was to phylogenetically analyze the E6 and E7 genes and proteins of HPV16 from Indonesia and to predict the effects of single amino acid substitutions on protein function. This analysis could be used as an initial screening step to reduce the time, effort, and research cost of selecting proteins or isolates to be tested in vitro or in vivo. Methods: In this study, E6 and E7 gene sequences were obtained from 12 samples of Indonesian isolates, which were compared with HPV16R (prototype) and 6 standard isolates from the European (E), Asian (As), Asian-American (AA), African-1 (Af-1), African-2 (Af-2), and North American (NA) branches from GenBank. Bioedit v.7.0.0 was used to analyze the composition and substitution of single amino acids. Phylogenetic analysis of E6 and E7 genes and proteins was performed using Clustal X (1.81) and NJPLOT software. Effects of single amino acid substitutions on the protein function of E6 and E7 were analysed by SNAP. Results: Java variants and isolate ui66* belonged to the European branch, while the others belonged to the Asian and African branches. Twelve amino acid changes were found in E6 and one in E7 proteins. SNAP analysis showed two non-neutral mutations, i.e. R10I and C63G, in E6 proteins. The R10I mutation was found in the Af-2 genotype (AF472509) and Indonesian isolates (Af2*), while the C63G mutation was found only in Af2*. Conclusion: E6 proteins of HPV16 variants were more variable than E7. SNAP analysis showed that only the E6 protein of the African-2 branch had functional differences compared to HPV16R.


2020 ◽  
Vol 38 (1) ◽  
pp. 113-127 ◽  
Author(s):  
Peter Hraber ◽  
Paul E. O’Maille ◽  
Andrew Silberfarb ◽  
Katie Davis-Anderson ◽  
Nicholas Generous ◽  
...  

2015 ◽  
Vol 87 (2 suppl) ◽  
pp. 1273-1292 ◽  
Author(s):  
David Z. Mokry ◽  
Josielle Abrahão ◽  
Carlos H.I. Ramos

The process of folding is a seminal event in the life of a protein, as it is essential for proper protein function and therefore cell physiology. Inappropriate folding, or misfolding, can not only lead to loss of function, but also to the formation of protein aggregates, an insoluble association of polypeptides that harm cell physiology, either by themselves or in the process of formation. Several biological processes have evolved to prevent and eliminate the existence of non-functional and amyloidogenic aggregates, as they are associated with several human pathologies. Molecular chaperones and heat shock proteins are specialized in controlling the quality of the proteins in the cell, specifically by aiding proper folding, and dissolution and clearance of already formed protein aggregates. The latter is a function of disaggregases, mainly represented by the ClpB/Hsp104 subfamily of molecular chaperones, that are ubiquitous in all organisms but, surprisingly, have no orthologs in the cytosol of metazoan cells. This review aims to describe the characteristics of disaggregases and to discuss the function of yeast Hsp104, a disaggregase that is also involved in prion propagation and inheritance.


2021 ◽  
Author(s):  
Céline Marquet ◽  
Michael Heinzinger ◽  
Tobias Olenyi ◽  
Christian Dallago ◽  
Michael Bernhofer ◽  
...  

Abstract: The emergence of SARS-CoV-2 variants stressed the demand for tools that allow interpreting the effect of single amino acid variants (SAVs) on protein function. While Deep Mutational Scanning (DMS) sets continue to expand our understanding of the mutational landscape of single proteins, the results continue to challenge analyses. Protein Language Models (LMs) use the latest deep learning (DL) algorithms to leverage growing databases of protein sequences. These methods learn to predict missing or masked amino acids from the context of entire sequence regions. Here, we explored how to benefit from learned protein LM representations (embeddings) to predict SAV effects. Although we have failed so far to predict SAV effects directly from embeddings, this input alone predicted residue conservation almost as accurately from single sequences as using multiple sequence alignments (MSAs), with a two-state per-residue accuracy (conserved/not) of Q2=80% (embeddings) vs. 81% (ConSeq). Considering all SAVs at all residue positions predicted as conserved to affect function reached 68.6% (Q2: effect/neutral; for PMD) without optimization, compared to an expert solution such as SNAP2 (Q2=69.8%). Combining predicted conservation with BLOSUM62 to obtain variant-specific binary predictions, DMS experiments of four human proteins were predicted better than by SNAP2, and better than by applying the same simplistic approach to conservation taken from ConSeq. Thus, embedding methods have become competitive with methods relying on MSAs for SAV effect prediction at a fraction of the costs in computing/energy. This allowed prediction of SAV effects for the entire human proteome (~20k proteins) within 17 minutes on a single GPU.
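The combination of predicted conservation with BLOSUM62 and the two-state accuracy Q2 mentioned above can be illustrated with a short sketch. The decision rule used here (effect only when the residue is predicted conserved and the substitution score is negative) is an assumption made for illustration, not a rule stated in the abstract. The function names are hypothetical, and only a five-entry excerpt of BLOSUM62 is included.

```python
# Small excerpt of BLOSUM62 (symmetric, so unordered pairs are used as keys).
BLOSUM62_EXCERPT = {
    frozenset("LI"): 2,
    frozenset("DE"): 2,
    frozenset("KR"): 2,
    frozenset("LD"): -4,
    frozenset("GW"): -2,
}


def predict_effect(wild_type, mutant, conserved):
    """Binary call for one variant.

    Assumed rule: a variant is called 'effect' only if the position is
    predicted as conserved AND the BLOSUM62 score of the swap is negative.
    """
    score = BLOSUM62_EXCERPT[frozenset((wild_type, mutant))]
    return conserved and score < 0


def q2(predicted, observed):
    """Two-state accuracy: fraction of positions where both labels agree."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(predicted)
```

Under this assumed rule, L to D at a conserved position is called an effect, while L to I at the same position is called neutral because the substitution score is positive.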


eLife ◽  
2015 ◽  
Vol 4 ◽  
Author(s):  
Manon Baëza ◽  
Séverine Viala ◽  
Marjorie Heim ◽  
Amélie Dard ◽  
Bruno Hudry ◽  
...  

Hox proteins are well-established developmental regulators that coordinate cell fate and morphogenesis throughout embryogenesis. In contrast, our knowledge of their specific molecular modes of action is limited to the interaction with few cofactors. Here, we show that Hox proteins are able to interact with a wide range of transcription factors in the live Drosophila embryo. In this context, specificity relies on a versatile usage of conserved short linear motifs (SLiMs), which, surprisingly, often restrains the interaction potential of Hox proteins. This novel buffering activity of SLiMs was observed in different tissues and found in Hox proteins from cnidarian to mouse species. Although these interactions remain to be analysed in the context of endogenous Hox regulatory activities, our observations challenge the traditional role assigned to SLiMs and provide an alternative concept to explain how Hox interactome specificity could be achieved during the embryonic development.


2021 ◽  
Author(s):  
Céline Marquet ◽  
Michael Heinzinger ◽  
Tobias Olenyi ◽  
Christian Dallago ◽  
Kyra Erckert ◽  
...  

Abstract: The emergence of SARS-CoV-2 variants stressed the demand for tools that allow interpreting the effect of single amino acid variants (SAVs) on protein function. While Deep Mutational Scanning (DMS) sets continue to expand our understanding of the mutational landscape of single proteins, the results continue to challenge analyses. Protein Language Models (pLMs) use the latest deep learning (DL) algorithms to leverage growing databases of protein sequences. These methods learn to predict missing or masked amino acids from the context of entire sequence regions. Here, we used pLM representations (embeddings) to predict sequence conservation and SAV effects without multiple sequence alignments (MSAs). Embeddings alone predicted residue conservation almost as accurately from single sequences as ConSeq using MSAs (two-state Matthews Correlation Coefficient, MCC, of 0.596 ± 0.006 for ProtT5 embeddings vs. 0.608 ± 0.006 for ConSeq). Inputting the conservation prediction along with BLOSUM62 substitution scores and pLM mask reconstruction probabilities into a simplistic logistic regression (LR) ensemble for Variant Effect Score Prediction without Alignments (VESPA) predicted SAV effect magnitude without any optimization on DMS data. Comparing predictions for a standard set of 39 DMS experiments to other methods (incl. ESM-1v, DeepSequence, and GEMME) revealed our approach as competitive with the state-of-the-art (SOTA) methods using MSA input. No method outperformed all others, either consistently or statistically significantly, independently of the performance measure applied (Spearman and Pearson correlation). Finally, we investigated binary effect predictions on DMS experiments for four human proteins. Overall, embedding-based methods have become competitive with methods relying on MSAs for SAV effect prediction at a fraction of the costs in computing/energy. Our method predicted SAV effects for the entire human proteome (~20k proteins) within 40 min on one Nvidia Quadro RTX 8000. All methods and data sets are freely available for local and online execution through bioembeddings.com, https://github.com/Rostlab/VESPA, and PredictProtein.
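The two-state Matthews Correlation Coefficient quoted above is computed from the four confusion-matrix counts. The sketch below uses the standard MCC formula. The function name and the choice to return 0.0 when the denominator is zero are illustrative assumptions, and the counts in the usage note are made up rather than taken from the paper.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient for a two-state prediction.

    tp, tn, fp, fn -- true/false positive/negative counts,
                      e.g. for conserved vs. not conserved residues
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: undefined MCC is reported as 0.0.
        return 0.0
    return numerator / denominator
```

For example, 40 true positives, 40 true negatives, 10 false positives and 10 false negatives give an MCC of 0.6, while a perfect prediction gives 1.0.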

