Tissue-specific mouse mRNA isoform networks

2019 ◽  
Author(s):  
Gaurav Kandoi ◽  
Julie A. Dickerson

Abstract: Alternative splicing produces multiple mRNA isoforms of genes, which have important and diverse roles in processes such as the regulation of gene expression, human heritable diseases, and the response to environmental stresses. However, little has been done to assign functions at the mRNA isoform level. Functional networks, in which interactions are quantified by the probability that two nodes are involved in the same biological process, are typically generated at the gene level. We use a diverse array of tissue-specific RNA-seq datasets and sequence information to train random forest models that predict the functional networks. Since there is no mRNA isoform-level gold standard, we treat single-isoform genes co-annotated to Gene Ontology biological process annotations, Kyoto Encyclopedia of Genes and Genomes pathways, BioCyc pathways, and protein-protein interactions as functionally related (positive pairs). To generate the non-functional pairs (negative pairs), we use the Gene Ontology annotations tagged with the “NOT” qualifier. We describe 17 Tissue-spEcific mrNa iSoform functIOnal Networks (TENSION), built following a leave-one-tissue-out strategy, in addition to an organism-level reference functional network for mouse. We validate our predictions by comparing their performance with previous methods, randomized positive and negative class labels, and updated Gene Ontology annotations, and by literature evidence. We demonstrate the ability of our networks to reveal tissue-specific functional differences between the isoforms of the same genes.

2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Gaurav Kandoi ◽  
Julie A. Dickerson

Abstract: Alternative splicing produces multiple mRNA isoforms of genes, which have important and diverse roles in processes such as the regulation of gene expression, human heritable diseases, and the response to environmental stresses. However, little has been done to assign functions at the mRNA isoform level. Functional networks, in which interactions are quantified by the probability that two nodes are involved in the same biological process, are typically generated at the gene level. We use a diverse array of tissue-specific RNA-seq datasets and sequence information to train random forest models that predict the functional networks. Since there is no mRNA isoform-level gold standard, we treat single-isoform genes co-annotated to Gene Ontology biological process annotations, Kyoto Encyclopedia of Genes and Genomes pathways, BioCyc pathways, and protein-protein interactions as functionally related (positive pairs). To generate the non-functional pairs (negative pairs), we use the Gene Ontology annotations tagged with the “NOT” qualifier. We describe 17 Tissue-spEcific mrNa iSoform functIOnal Networks (TENSION), built following a leave-one-tissue-out strategy, in addition to an organism-level reference functional network for mouse. We validate our predictions by comparing their performance with previous methods, randomized positive and negative class labels, and updated Gene Ontology annotations, and by literature evidence. We demonstrate the ability of our networks to reveal tissue-specific functional differences between the isoforms of the same genes. All scripts and data from TENSION are available at: 10.25380/iastate.c.4275191.
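
The abstract names a leave-one-tissue-out strategy without spelling out its mechanics. One plausible reading, given here only as a sketch, is a grouped hold-out in which each tissue's samples are set aside in turn while the remaining tissues are used for training. The function name `leave_one_tissue_out`, the sample identifiers, and the tissue labels below are hypothetical and are not taken from the TENSION scripts.

```python
from collections import defaultdict


def leave_one_tissue_out(samples):
    """Yield (held_out_tissue, train_ids, held_out_ids), one fold per tissue.

    `samples` is an iterable of (sample_id, tissue) pairs.
    This is an illustrative assumption, not the authors' implementation.
    """
    # Group sample identifiers by tissue, preserving input order per tissue.
    by_tissue = defaultdict(list)
    for sample_id, tissue in samples:
        by_tissue[tissue].append(sample_id)

    tissues = sorted(by_tissue)
    for tissue in tissues:
        held_out = list(by_tissue[tissue])
        # Training set: every sample from every other tissue.
        train = [s for t in tissues if t != tissue for s in by_tissue[t]]
        yield tissue, train, held_out


# Hypothetical toy data: four RNA-seq samples from three tissues.
samples = [("s1", "brain"), ("s2", "brain"), ("s3", "liver"), ("s4", "heart")]
folds = list(leave_one_tissue_out(samples))
for tissue, train, held_out in folds:
    print(tissue, train, held_out)
```

Each fold produces one training set, so 17 tissues would give 17 folds under this reading; a classifier such as a random forest would then be fitted once per fold.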


Author(s):  
Sambit Kumar Mishra ◽  
Viraj Muthye ◽  
Gaurav Kandoi

Multiple mRNA isoforms of the same gene are produced via alternative splicing, a biological mechanism that regulates protein diversity while maintaining genome size. Alternatively spliced mRNA isoforms of the same gene may sometimes have very similar sequences, but they can have significantly different effects on cellular function and regulation. The products of alternative splicing have important and diverse functional roles, such as in the response to environmental stress, the regulation of gene expression, and human heritable and plant diseases. The mRNA isoforms of the same gene, such as the apoptosis-associated CASP3 gene, can have dramatically different functions. The product of the shorter mRNA isoform, CASP3-S, inhibits apoptosis, while the longer CASP3-L mRNA isoform promotes apoptosis. Despite the functional importance of mRNA isoforms, very little has been done to annotate their functions. Recent years have, however, seen the development of several computational methods aimed at predicting biological functions at the mRNA isoform level. These methods use a wide array of proteo-genomic data to develop machine learning-based mRNA isoform function prediction tools. In this review, we discuss the computational methods developed for predicting biological function at the individual mRNA isoform level.


2020 ◽  
Vol 21 (16) ◽  
pp. 5686
Author(s):  
Sambit K. Mishra ◽  
Viraj Muthye ◽  
Gaurav Kandoi

Multiple mRNA isoforms of the same gene are produced via alternative splicing, a biological mechanism that regulates protein diversity while maintaining genome size. Alternatively spliced mRNA isoforms of the same gene may sometimes have very similar sequences, but they can have significantly different effects on cellular function and regulation. The products of alternative splicing have important and diverse functional roles, such as in the response to environmental stress, the regulation of gene expression, and human heritable and plant diseases. The mRNA isoforms of the same gene can have dramatically different functions. Despite the functional importance of mRNA isoforms, very little has been done to annotate their functions. Recent years have, however, seen the development of several computational methods aimed at predicting biological functions at the mRNA isoform level. These methods use a wide array of proteo-genomic data to develop machine learning-based mRNA isoform function prediction tools. In this review, we discuss the computational methods developed for predicting biological function at the individual mRNA isoform level.


2019 ◽  
Author(s):  
Hassan Kané ◽  
Mohamed Coulibali ◽  
Ali Abdalla ◽  
Pelkins Ajanoh

Abstract: Computational methods that infer the function of proteins are key to understanding life at the molecular level. In recent years, representation learning has emerged as a powerful paradigm for discovering new patterns among entities as varied as images, words, speech, and molecules. In typical representation learning, there is only one source of data or one level of abstraction at which the learned representation occurs. However, proteins can be described by their primary, secondary, tertiary, and quaternary structure, or even as nodes in protein-protein interaction networks. Given that protein function is an emergent property of all these levels of interaction, in this work we learn joint representations from both amino acid sequence and multilayer networks representing tissue-specific protein-protein interactions. We show that simple machine learning models trained on these hybrid representations outperform existing network-based methods on the task of tissue-specific protein function prediction on 13 out of 13 tissues. Furthermore, these representations outperform existing ones by 14% on average.


2020 ◽  
Vol 27 (4) ◽  
pp. 313-320 ◽  
Author(s):  
Xuan Xiao ◽  
Wei-Jie Chen ◽  
Wang-Ren Qiu

Background: Information about the quaternary structure attributes of proteins is very important because it is closely related to their biological functions. With the rapid development of new-generation sequencing technology, we face a challenge: how to automatically identify the quaternary structure attributes of new polypeptide chains from their sequence information (i.e., whether they form a monomer, a hetero-oligomer, or a homo-oligomer). Objective: Our goal is to find a new way to represent protein sequences, thereby improving the prediction rate for protein quaternary structure. Methods: We developed a prediction system for protein quaternary structural type in which a protein sequence is expressed by combining Pfam functional-domain and gene ontology information. Protein features are converted into numerical sequences, and the quaternary structure is predicted using specific machine learning algorithms and a verification algorithm. Results: Our data set contains 5495 protein samples. Using the method presented in this paper, we classify proteins as monomers, hetero-oligomers, or homo-oligomers with a prediction rate of 74.38%, which is 3.24% higher than that of previous studies. With this new feature extraction method, we can further classify the quaternary structure of proteins, and the results are correspondingly improved. Conclusion: Applying the new prediction system improved the prediction rate over previous results. We have reason to believe that the feature extraction method in this paper is more practical and can serve as a reference for other protein classification problems.


Membranes ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 376
Author(s):  
Norhidayah Azmi ◽  
Nurulhasanah Othman

Amoebiasis is caused by Entamoeba histolytica and ranks second after malaria among parasitic diseases causing death. E. histolytica membrane and cytosolic proteins play important roles in pathogenesis. Our previous study showed that several cytosolic proteins were present in the membrane fraction. Therefore, this study aimed to quantify the differential abundance of membrane and cytosolic proteins in membrane versus cytosolic fractions and to analyze their predicted functions and interactions. Previous LC-ESI-MS/MS data were analyzed with PERSEUS software to identify the differentially abundant proteins, which were then classified by functional annotation using PantherDB, and their protein networks were summarized using STRING. The results showed that 24 (44.4%) of the 54 proteins that increased in abundance were membrane proteins and 30 were cytosolic proteins. Meanwhile, 45 cytosolic proteins were found to decrease in abundance. Functional analysis showed that the differentially abundant proteins were involved in molecular function, biological process, and cellular component categories at 18.88%, 33.04%, and 48.07%, respectively. The STRING server predicted that the proteins with decreased abundance had more protein–protein network interactions than the proteins with increased abundance. Overall, this study confirmed the presence of differentially abundant membrane and cytosolic proteins and provided predicted functions and interactions for them.


2011 ◽  
Vol 28 (1) ◽  
pp. 69-75 ◽  
Author(s):  
Stefan R. Maetschke ◽  
Martin Simonsen ◽  
Melissa J. Davis ◽  
Mark A. Ragan

1994 ◽  
Vol 14 (10) ◽  
pp. 6635-6646
Author(s):  
J A Diehl ◽  
M Hannink

Protein-protein interactions between the CCAAT box enhancer-binding proteins (C/EBP) and the Rel family of transcription factors have been implicated in the regulation of cytokine gene expression. We have used sequence-specific DNA affinity chromatography to purify a complex from avian T cells that binds to a consensus C/EBP motif. Our results provide evidence that Rel-related proteins are components of the C/EBP-DNA complex as a result of protein-protein interactions with the C/EBP proteins. A polyclonal antiserum raised against the Rel homology domain of v-Rel and antisera raised against two human RelA-derived peptides specifically induced a supershift of the C/EBP-DNA complex in mobility shift assays using the affinity-purified C/EBP. In addition, several kappa B-binding proteins copurified with the avian C/EBP complex through two rounds of sequence-specific DNA affinity chromatography. The kappa B-binding proteins are distinct from the C/EBP proteins that directly contact DNA containing the C/EBP binding site. The identification of a protein complex that binds specifically to a consensus C/EBP site and contains both C/EBP and Rel family members suggests a novel mechanism for regulation of gene expression by Rel family proteins.


2021 ◽  
Author(s):  
Elisabeth Holzer ◽  
Cornelia Rumpf-Kienzl ◽  
Sebastian Falk ◽  
Alexander Dammermann

Proximity-dependent labeling approaches such as BioID have been a great boon to studies of protein-protein interactions in the context of cytoskeletal structures such as centrosomes, which are poorly amenable to traditional biochemical approaches like immunoprecipitation and tandem affinity purification. Yet, these methods have so far not been applied extensively to invertebrate experimental models such as C. elegans, given the long labeling times required for the original promiscuous biotin ligase variant BirA*. Here, we show that the recently developed variant TurboID successfully probes the interactomes of both stably associated (SPD-5) and dynamically localized (PLK-1) centrosomal components. We further develop an indirect proximity labeling method employing a GFP nanobody-TurboID fusion, which allows the identification of protein interactors in a tissue-specific manner in the context of the whole animal. Critically, this approach utilizes available endogenous GFP fusions, avoiding the need to generate multiple additional strains for each target protein and the potential complications associated with overexpressing the protein from transgenes. Using this method, we identify homologs of two highly conserved centriolar components, Cep97 and Bld10/Cep135, which are present in various somatic tissues of the worm. Surprisingly, neither protein is expressed in early embryos, likely explaining why these proteins have escaped attention until now. Our work expands the experimental repertoire for C. elegans and opens the door for further studies of tissue-specific variation in centrosome architecture.

