SITEX 2.0: Projections of protein functional sites on eukaryotic genes. Extension with orthologous genes

2017 ◽  
Vol 15 (02) ◽  
pp. 1650044 ◽  
Author(s):  
Irina V. Medvedeva ◽  
Pavel S. Demenkov ◽  
Vladimir A. Ivanisenko

Functional sites define the diversity of protein functions and are central to research on the structural and functional organization of proteins. The mechanisms underlying the emergence of protein functional sites and their variability during evolution include duplication, shuffling, insertion and deletion of exons in genes. Studying the correlation between site structure and exon structure serves as a basis for an in-depth understanding of site organization. In this regard, developing software resources that support mutual projection between the exon structure of genes and the primary and tertiary structures of the encoded proteins remains a relevant problem. Previously, we developed the SitEx system, which provides information about protein and gene sequences with mapped exon borders and the amino acid positions of protein functional sites. The database included information on proteins with known 3D structure. However, data on orthologs were not available. Therefore, in SitEx 2.0 we added the projection of site positions onto the exon structures of orthologs. We implemented database search by site conservation variability and by site discontinuity across the exon structure. Including information on orthologs expands the use of SitEx for analyzing the structural and functional organization of proteins. Database URL: http://www-bionet.sscc.ru/sitex/
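The projection of site positions onto exon structure described in this abstract can be illustrated with a minimal sketch. It is not the SitEx implementation: the function names and the input format (a list of per-exon coding lengths in nucleotides, and 1-based residue positions) are assumptions made for illustration only.

```python
from bisect import bisect_right
from itertools import accumulate


def residue_to_exons(residue_pos, exon_cds_lengths):
    """Return 0-based indices of the exons encoding a 1-based residue position.

    A codon split by an intron maps to more than one exon.
    """
    # Cumulative (exclusive) end coordinate of each exon within the CDS.
    ends = list(accumulate(exon_cds_lengths))
    first_nt = 3 * (residue_pos - 1)  # 0-based index of the codon's first nucleotide
    if residue_pos < 1 or first_nt + 3 > ends[-1]:
        raise ValueError("residue position outside the coding sequence")
    # bisect_right gives the first exon whose end lies beyond the nucleotide.
    return sorted({bisect_right(ends, nt) for nt in range(first_nt, first_nt + 3)})


def site_exons(site_positions, exon_cds_lengths):
    """Return the sorted set of exons that encode any residue of a site."""
    exons = set()
    for pos in site_positions:
        exons.update(residue_to_exons(pos, exon_cds_lengths))
    return sorted(exons)


def is_discontinuous(site_positions, exon_cds_lengths):
    """A site is treated here as discontinuous if it spans more than one exon."""
    return len(site_exons(site_positions, exon_cds_lengths)) > 1
```

For a hypothetical gene with coding exon lengths of 10, 11 and 9 nucleotides, residue 4 is encoded by a codon that straddles the first intron, so `residue_to_exons(4, [10, 11, 9])` reports both flanking exons.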

2021 ◽  
Vol 1 ◽  
Author(s):  
Daniel Corcoran ◽  
Nicholas Maltbie ◽  
Shivchander Sudalairaj ◽  
Frazier N. Baker ◽  
Joseph Hirschfeld ◽  
...  

Proteins by and large carry out their molecular functions in a folded state when residues, distant in sequence, assemble together in 3D space to bind a ligand, catalyze a reaction, form a channel, or exert another concerted macromolecular interaction. It has been long recognized that covariance of amino acids between distant positions within a protein sequence allows for the inference of long range contacts to facilitate 3D structure modeling. In this work, we investigated whether covariance analysis may reveal residues involved in the same molecular function. Building upon our previous work, CoeViz, we have conducted a large scale covariance analysis among 7,595 non-redundant proteins with resolved 3D structures to assess 1) whether the residues with the same function coevolve, 2) which covariance metric captures such couplings better, and 3) how different molecular functions compare in this context. We found that the chi-squared metric is the most informative for the identification of coevolving functional sites, followed by the Pearson correlation-based, whereas mutual information is the least informative. Of the seven categories of the most common natural ligands, including coenzyme A, dinucleotide, DNA/RNA, heme, metal, nucleoside, and sugar, the trace metal binding residues display the most prominent coupling, followed by the sugar binding sites. We also developed a web-based tool, CoeViz 2, that enables the interactive visualization of covarying residues as cliques from a larger protein graph. CoeViz 2 is publicly available at https://research.cchmc.org/CoevLab/.
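To make the compared covariance metrics concrete, the sketch below computes textbook mutual information and the chi-squared statistic for two columns of a multiple sequence alignment. It is an illustration under simplifying assumptions (no sequence weighting, no gap handling, hypothetical function names) and is not the CoeViz implementation.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equally long alignment columns."""
    n = len(col_a)
    count_a, count_b = Counter(col_a), Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    # Sum over observed residue pairs only; unobserved pairs contribute zero.
    return sum(
        (c / n) * log2(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in count_ab.items()
    )


def chi_squared(col_a, col_b):
    """Chi-squared statistic comparing observed pair counts with independence."""
    n = len(col_a)
    count_a, count_b = Counter(col_a), Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    total = 0.0
    for a in count_a:
        for b in count_b:
            expected = count_a[a] * count_b[b] / n
            total += (count_ab[(a, b)] - expected) ** 2 / expected
    return total
```

Two perfectly coupled columns such as `"AAGG"` and `"CCTT"` score high on both metrics, whereas independent columns such as `"AAGG"` and `"CTCT"` score zero.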


2011 ◽  
Vol 40 (D1) ◽  
pp. D278-D283 ◽  
Author(s):  
I. Medvedeva ◽  
P. Demenkov ◽  
N. Kolchanov ◽  
V. Ivanisenko

1991 ◽  
Vol 11 (3) ◽  
pp. 1306-1312 ◽  
Author(s):  
G A Gonzalez ◽  
P Menzel ◽  
J Leonard ◽  
W H Fischer ◽  
M R Montminy

Cyclic AMP mediates the hormonal stimulation of a number of eukaryotic genes by directing the protein kinase A (PK-A)-dependent phosphorylation of transcription factor CREB. We have previously determined that although phosphorylation at Ser-133 is critical for induction, this site does not appear to participate directly in transactivation. To test the hypothesis that CREB ultimately activates transcription through domains that are distinct from the PK-A site, we constructed a series of CREB mutants and evaluated them by transient assays in F9 teratocarcinoma cells. Remarkably, a glutamine-rich region near the N terminus appeared to be important for PK-A-mediated induction of CREB since removal of this domain caused a marked reduction in CREB activity. A second region consisting of a short acidic motif (DLSSD) C terminal to the PK-A site also appeared to synergize with the phosphorylation motif to permit transcriptional activation. Biochemical experiments with purified recombinant CREB protein further demonstrate that the transactivation domain is more sensitive to trypsin digestion than are the DNA-binding and dimerization domains, suggesting that the activator region may be structured to permit interactions with other proteins in the RNA polymerase II complex.


2014 ◽  
Vol 13 ◽  
pp. S43
Author(s):  
B. Hoffmann ◽  
J.-P. Mornon ◽  
B. Boucherie ◽  
A. Fortuné ◽  
R. Haudecoeur ◽  
...  

2004 ◽  
Vol 1 (1) ◽  
pp. 80-89
Author(s):  
Guido Dieterich ◽  
Dirk W. Heinz ◽  
Joachim Reichelt

Abstract The 3D structures of biomacromolecules stored in the Protein Data Bank [1] were correlated with different external, biological information from public databases. We have matched the feature table of SWISS-PROT [2] entries as well as InterPro [3] domains and function sites with the corresponding 3D structures. OMIM [4] (Online Mendelian Inheritance in Man) records, containing information on genetic disorders, were extracted and linked to the structures. The exhaustive all-against-all 3D structure comparison of protein structures stored in DALI [5] was condensed into single files for each PDB entry. Results are stored in XML format, facilitating their incorporation into related software. The resulting annotation of the protein structures allows functional sites to be identified upon visualization.


2017 ◽  
Author(s):  
Mohammad Nauman ◽  
Hafeez Ur Rehman ◽  
Gianfranco Politano ◽  
Alfredo Benso

ABSTRACT Accurate annotation of protein functions is important for a profound understanding of molecular biology. A large number of proteins remain uncharacterized because of the sparsity of available supporting information. For a large set of uncharacterized proteins, the only type of information available is their amino acid sequence. In this paper, we propose DeepSeq, a deep learning architecture that utilizes only the protein sequence information to predict its associated functions. The prediction process does not require handcrafted features; rather, the architecture automatically extracts representations from the input sequence data. Results of our experiments with DeepSeq indicate significant improvements in terms of prediction accuracy when compared with other sequence-based methods. Our deep learning model achieves an overall validation accuracy of 86.72%, with an F1 score of 71.13%. Moreover, using the automatically learned features and without any changes to DeepSeq, we successfully solved a different problem, i.e., protein function localization, with no human intervention. Finally, we discuss how this same architecture can be used to solve even more complicated problems such as prediction of 2D and 3D structure as well as protein-protein interactions.


2021 ◽  
Author(s):  
Mitzi Díaz-Hernández ◽  
Rosario Javier Reyna ◽  
Izaid Sotto-Ortega ◽  
Guillermina García-Rivera ◽  
Maricela Sarita Montaño ◽  
...  

Abstract During phagocytosis, a key event in the virulence of the protozoan Entamoeba histolytica, several molecules in concert contact the target, generate pseudopodia, and internalize and digest the ingested prey. Posttranslational modifications provide proteins the timing and signaling to intervene in these processes. SUMOylation is a posttranslational modification that in several systems grants a fine tuning for protein functions, protein interactions and cellular location, but it has not been studied in E. histolytica. In this paper, we characterized the E. histolytica SUMO gene and its product (EhSUMO) and elucidated the EhSUMO 3D structure. Furthermore, we studied the relevance of SUMOylation in phagocytosis, particularly in its association with EhADH (an ALIX family protein) and EhVps32 (a protein of the ESCRT-III complex), both involved in phagocytosis. Our results indicated that EhSUMO has an extended N-terminus that differentiates other SUMO from ubiquitin. It also presents the GG residues at the C-terminus and the ΨKXE/D binding motif, both involved in target protein contact. Additionally, the E. histolytica genome possesses the enzymes belonging to the SUMOylation-deSUMOylation machineries. Confocal microscopy assays using α-EhSUMO antibodies disclosed a remarkable membrane activity with convoluted and changing structures in trophozoites during erythrophagocytosis. SUMOylated proteins appeared in pseudopodia, phagocytic channels, and around the adhered and ingested erythrocytes. Docking analysis predicted interaction of EhSUMO with EhADH, and immunoprecipitation and immunofluorescence assays revealed that the EhADH-EhSUMO association increased during phagocytosis, whereas the EhVps32-EhSUMO interaction appeared stronger since basal conditions. In EhSUMO knocked down trophozoites, the bizarre membranous structures disappeared, and EhSUMO interaction with EhADH and EhVps32 diminished. Our results provide evidence for the presence of a SUMO gene in E. histolytica and for the relevance of SUMOylation during phagocytosis.

Author's Abstract Phagocytosis is one of the main functions that Entamoeba histolytica trophozoites carry out during invasion of the host. Many proteins are involved in this fascinating event, in which the plasma membrane undergoes multiple rapid changes. Posttranslational modifications activate proteins at the precise time they must become involved. SUMOylation, which consists of the non-covalent binding of SUMO protein with target molecules, is one of the main changes that proteins undergo to enable them to participate in cellular functions. SUMOylation had not been studied in E. histolytica nor in phagocytosis, and our working hypothesis is that this event is deeply engaged in the ingestion of target molecules and cells. The results of this paper prove the presence of an intronless bona fide EhSUMO gene encoding a predicted 12.6 kDa protein that is actively involved in phagocytosis. Silencing of the EhSUMO gene affected the rate of phagocytosis and interfered with the function of EhADH and EhVps32, two proteins involved in phagocytosis, strongly supporting the importance of SUMOylation in this event.


2019 ◽  
Vol 400 (3) ◽  
pp. 275-288 ◽  
Author(s):  
Kale Kundert ◽  
Tanja Kortemme

Abstract The ability to engineer the precise geometries, fine-tuned energetics and subtle dynamics that are characteristic of functional proteins is a major unsolved challenge in the field of computational protein design. In natural proteins, functional sites exhibiting these properties often feature structured loops. However, unlike the elements of secondary structures that comprise idealized protein folds, structured loops have been difficult to design computationally. Addressing this shortcoming in a general way is a necessary first step towards the routine design of protein function. In this perspective, we will describe the progress that has been made on this problem and discuss how recent advances in the field of loop structure prediction can be harnessed and applied to the inverse problem of computational loop design.


Author(s):  
Grey W. Wilburn ◽  
Sean R. Eddy

Abstract Most methods for biological sequence homology search and alignment work with primary sequence alone, neglecting higher-order correlations. Recently, statistical physics models called Potts models have been used to infer all-by-all pairwise correlations between sites in deep multiple sequence alignments, and these pairwise couplings have improved 3D structure predictions. Here we extend the use of Potts models from structure prediction to sequence alignment and homology search by developing what we call a hidden Potts model (HPM) that merges a Potts emission process to a generative probability model of insertion and deletion. Because an HPM is incompatible with efficient dynamic programming alignment algorithms, we develop an approximate algorithm based on importance sampling, using simpler probabilistic models as proposal distributions. We test an HPM implementation on RNA structure homology search benchmarks, where we can compare directly to exact alignment methods that capture nested RNA base-pairing correlations (stochastic context-free grammars). HPMs perform promisingly in these proof of principle experiments.

Author summary Computational homology search and alignment tools are used to infer the functions and evolutionary histories of biological sequences. Most widely used tools for sequence homology searches, such as BLAST and HMMER, rely on primary sequence conservation alone. It should be possible to make more powerful search tools by also considering higher-order covariation patterns induced by 3D structure conservation. Recent advances in 3D protein structure prediction have used a class of statistical physics models called Potts models to infer pairwise correlation structure in multiple sequence alignments. However, Potts models assume alignments are given and cannot build new alignments, limiting their use in homology search. We have extended Potts models to include a probability model of insertion and deletion so they can be applied to sequence alignment and remote homology search using a new model we call a hidden Potts model (HPM). Tests of our prototype HPM software show promising results in initial benchmarking experiments, though more work will be needed to use HPMs in practical tools.
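As a rough illustration of the Potts component mentioned in this abstract, the sketch below scores an already aligned sequence as a sum of per-site fields and pairwise couplings, which is the standard unnormalized log-probability of a Potts model. It does not cover the insertion and deletion model or the importance-sampling alignment of an HPM; the function name, the parameter layout and the toy values are assumptions for illustration.

```python
def potts_score(seq, fields, couplings):
    """Unnormalized log-probability of an aligned sequence under a Potts model.

    fields:    list of dicts, fields[i][x] is the single-site term for symbol x at site i
    couplings: dict keyed by (i, j) with i < j, each value a dict keyed by (x_i, x_j)
    Missing entries are treated as zero (an assumption of this sketch).
    """
    # Single-site terms.
    score = sum(fields[i].get(x, 0.0) for i, x in enumerate(seq))
    # Pairwise terms over all site pairs i < j.
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            score += couplings.get((i, j), {}).get((seq[i], seq[j]), 0.0)
    return score


# Toy two-site model in which an A-U pair is rewarded by a coupling term.
toy_fields = [{"A": 1.0, "C": 0.0}, {"G": 0.5, "U": 0.0}]
toy_couplings = {(0, 1): {("A", "U"): 2.0}}
```

With these toy parameters, the sequence `"AU"` collects the coupling bonus on top of its single-site terms, while `"AG"` is scored by the single-site terms alone.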

