EmpiReS: Differential Analysis of Gene Expression and Alternative Splicing

2020 ◽  
Author(s):  
Gergely Csaba ◽  
Evi Berchtold ◽  
Armin Hadziahmetovic ◽  
Markus Gruber ◽  
Constantin Ammar ◽  
...  

ABSTRACT While absolute quantification is challenging in high-throughput measurements, changes of features between conditions can often be determined with high precision. Therefore, analysis of fold changes is the standard method, but often, a doubly differential analysis of changes of changes is required. Differential alternative splicing is an example of a doubly differential analysis, i.e. fold changes between conditions for different isoforms of a gene. EmpiRe is a quantitative approach for various kinds of omics data based on fold changes for appropriate features of biological objects. Empirical error distributions for these fold changes are estimated from Replicate measurements and used to quantify feature fold changes and their directions. We assess the performance of EmpiRe to detect differentially expressed genes applied to RNA-Seq using simulated data. It achieved higher precision than established tools at nearly the same recall level. Furthermore, we assess the detection of alternatively Spliced genes via changes of isoform fold changes (EmpiReS) on distribution-free simulations and experimentally validated splicing events. EmpiReS achieves the best precision-recall values for simulations based on different biological datasets. We propose EmpiRe(S) as a general, quantitative and fast approach with high reliability and an excellent trade-off between sensitivity and precision in (doubly) differential analyses.
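The idea of a doubly differential analysis described above can be illustrated with a minimal sketch. This is not the EmpiRe(S) implementation: the function names, the use of the median, and the add-one smoothed empirical p-value are all assumptions made for illustration. It computes the difference between the log2 fold changes of two isoforms and builds an empirical null from replicate-versus-replicate fold changes.

```python
import math
from itertools import combinations
from statistics import median


def log2_fold_changes(treated, control):
    """All pairwise log2 ratios treated/control (hypothetical helper)."""
    return [math.log2(t / c) for t in treated for c in control]


def empirical_null(replicates):
    """Log2 fold changes between replicates of one condition.

    Both orderings are kept so the null is symmetric around zero.
    """
    null = []
    for x, y in combinations(replicates, 2):
        lfc = math.log2(x / y)
        null.extend([lfc, -lfc])
    return null


def empirical_p_value(observed, null):
    """Two-sided fraction of null values at least as extreme as observed.

    Add-one smoothing is an illustrative choice, not taken from the abstract.
    """
    extreme = sum(1 for v in null if abs(v) >= abs(observed))
    return (extreme + 1) / (len(null) + 1)


def doubly_differential(iso1_c1, iso1_c2, iso2_c1, iso2_c2):
    """Change of changes: isoform 1 log2 fold change minus isoform 2 log2 fold change."""
    fc1 = median(log2_fold_changes(iso1_c2, iso1_c1))
    fc2 = median(log2_fold_changes(iso2_c2, iso2_c1))
    return fc1 - fc2


if __name__ == "__main__":
    # Made-up abundances: isoform 1 doubles between conditions, isoform 2 is unchanged.
    dd = doubly_differential([100, 100, 100], [200, 200, 200],
                             [50, 50, 50], [50, 50, 50])
    null = empirical_null([100, 110, 90])
    print(dd, empirical_p_value(dd, null))
```

With only three replicates the empirical null contains six values, so the smallest attainable p-value in this sketch is 1/7; a real method would pool error information across many features.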

2020 ◽  
Author(s):  
Zixiao Zhao ◽  
Christine G. Elsik ◽  
Bruce E. Hibbard ◽  
Kent S. Shelby

Abstract Background: Alternative splicing is one of the major mechanisms that increase transcriptome diversity in eukaryotes, including insect species that have gained resistance to pesticides and Bt toxins. In western corn rootworm (Diabrotica virgifera virgifera LeConte), neither alternative splicing nor its role in resistance to Bt toxins has been studied. Results: To investigate the mechanisms of Bt resistance we carried out single-molecule real-time (SMRT) transcript sequencing and Iso-seq analysis on resistant, eCry3.1Ab-selected and susceptible, unselected western corn rootworm neonate midguts which fed on seedling maize with and without eCry3.1Ab for 12 and 24 hours. We present transcriptome-wide alternative splicing patterns of western corn rootworm midgut in response to feeding on eCry3.1Ab-expressing corn using a comprehensive approach that combines both RNA-seq and SMRT transcript sequencing techniques. We found that 67.73% of multi-exon genes are alternatively spliced, which is consistent with the high transposable element content of the genome. One of the alternative splicing events we identified was a novel peritrophic matrix protein with two alternative splicing isoforms. Analysis of differential exon usage between resistant and susceptible colonies showed that in eCry3.1Ab-resistant western corn rootworm, expression of one isoform was significantly higher than in the susceptible colony, while no significant differences between colonies were observed with the other isoform. Conclusion: Our results provide the first survey of alternative splicing in western corn rootworm and suggest that the observed alternatively spliced isoforms of peritrophic matrix protein may be associated with eCry3.1Ab resistance in western corn rootworm.


2015 ◽  
Vol 28 (3) ◽  
pp. 298-309 ◽  
Author(s):  
Alyssa Burkhardt ◽  
Alex Buchanan ◽  
Jason S. Cumbie ◽  
Elizabeth A. Savory ◽  
Jeff H. Chang ◽  
...  

Pseudoperonospora cubensis is an obligate pathogen and causative agent of cucurbit downy mildew. To help advance our understanding of the pathogenicity of P. cubensis, we used RNA-Seq to improve the quality of its reference genome sequence. We also characterized the RNA-Seq dataset to inventory transcript isoforms and infer alternative splicing during different stages of its development. Almost half of the original gene annotations were improved and nearly 4,000 previously unannotated genes were identified. We also demonstrated that approximately 24% of the expressed genome and nearly 55% of the intron-containing genes from P. cubensis had evidence for alternative splicing. Our analyses revealed that intron retention is the predominant alternative splicing type in P. cubensis, with alternative 5′- and alternative 3′-splice sites occurring at lower frequencies. Representatives of the newly identified genes and predicted alternatively spliced transcripts were experimentally validated. The results presented herein highlight the utility of RNA-Seq for improving draft genome annotations and, through this approach, we demonstrate that alternative splicing occurs more frequently than previously predicted. In total, the current study provides evidence that alternative splicing plays a key role in transcriptome regulation and proteome diversification in plant-pathogenic oomycetes.


Blood ◽  
2019 ◽  
Vol 134 (Supplement_1) ◽  
pp. 457-457
Author(s):  
Govardhan Anande ◽  
Ashwin Unnikrishnan ◽  
Nandan Deshpande ◽  
Sylvain Mareschal ◽  
Aarif M. N. Batcha ◽  
...  

RNA splicing is a fundamental biological process that generates protein diversity from a finite set of genes. Recurrent somatic mutations of genes involved in RNA splicing are present at high frequency in Myelodysplasia (up to 70%) but less so in Acute Myeloid Leukemia (AML; less than 20%). To investigate whether there were aberrant and recurrent RNA splicing events in the AML transcriptome that were associated with poor prognosis in the absence of splicing factor mutations, we developed a bioinformatics pipeline to systematically annotate and quantify alternative splicing events from RNA-sequencing data (Fig A). We first analysed publicly available RNA-seq data from The Cancer Genome Atlas (TCGA, n=170). We focussed on non-M3 AML patients with no splicing factor mutations (based on reported genomic sequencing and verified by re-analysis of RNA-seq data from all patients) who had received intensive chemotherapy. We segregated these patients based on their European Leukaemia Net (ELN) risk classification and identified 1290 alternatively spliced events impacting 910 genes that were significantly different (FDR<0.05) between all ELNAdv (n=41) versus all ELNFav patients (n=21, Fig B). The majority were exon skipping events (716 events, 62%, Fig B-C), followed by intron retention (201 events, 15.6%, Fig B). We next used RNA-seq data from a second non-M3 AML patient cohort (ClinSeq- Sweden; ELNAdv, n=75 and ELNFav, n=47), detecting 2507 events mapping to 1566 genes. Comparing across the two cohorts, 222 shared genes were detected to be affected by alternative splicing (Fig D). Ingenuity pathway analysis associated these genes with pathways related to protein translation. In order to prioritise those alternatively spliced events most likely to have a deleterious function, we developed an analytical framework to predict their impact on protein structure (Fig E). 
87 alternatively spliced events, 25.81% of the commonly shared splicing events, relating to 78 genes (35.13% of all genes) were predicted to directly alter highly conserved protein domains within the affected genes, leading to either a complete (~25%, Fig E) or a partial loss of a domain (20%, Fig E). These in silico predictions are likely to be an underestimate of the true impact, as splicing alterations mapping to poorly annotated domains or affecting the tertiary structure of proteins would be missed. A number of splicing factors themselves were differentially spliced, with the alternative splicing predicted to have functional consequences. This was exemplified by hnRNPA1, a factor with well-established roles in splicing, which is itself alternatively spliced in patients and predicted to be deleterious. Consistent with this, motif scanning analyses indicated that a number of mis-spliced transcripts had hnRNPA1 binding motifs (Fig F). To assess the impact of these alternatively spliced events (that were predicted to also disrupt highly conserved protein domains) on the transcriptome, we simultaneously quantified differential gene expression. Ingenuity pathway analysis (IPA) of the 602 genes that were differentially expressed between ELNAdv and ELNFav patients and shared between both TCGA and ClinSeq cohorts indicated that they were associated with pathways (Fig G) that were distinct from those associated with aberrantly spliced genes (Fig D). A number of pathways related to inflammation were enriched amongst the genes observed to be upregulated in ELNAdv patients (Fig G). Network analyses integrating the alternatively spliced genes with differentially expressed genes revealed strong interactions (Fig H), indicating functional associations between these biological events. Given these strong network interactions, we investigated the potential prognostic significance of these alternatively spliced events. 
To this end, we utilised machine-learning methods to derive a "splicing signature" of four mis-spliced genes with a predictive capacity equivalent to the ELN (Fig I). The splicing signature further refined existing risk prediction algorithms to improve the classification of patients (Fig J). Taken together, we report the presence of extensive deregulation of RNA splicing in AML patients even in the absence of splicing factor mutations. Many of these events were shared in patients with adverse outcomes and their impact on the AML transcriptome points towards vulnerabilities that could be targeted. Disclosures: Unnikrishnan: Celgene: Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding. Lehmann: TEVA: Consultancy, Membership on an entity's Board of Directors or advisory committees; Pfizer: Membership on an entity's Board of Directors or advisory committees; AbbVie: Membership on an entity's Board of Directors or advisory committees. Pimanda: Celgene: Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding.


2018 ◽  
Author(s):  
Nowlan H. Freese ◽  
April R. Estrada ◽  
Ivory C. Blakley ◽  
Jinjie Duan ◽  
Ann E. Loraine

ABSTRACT Alternatively spliced genes produce multiple spliced isoforms, called transcript variants. In differential alternative splicing, transcript variant abundance differs across sample types. Differential alternative splicing is common in animal systems and influences cellular development in many processes, but its extent and significance are not as well known in plants. To investigate alternative splicing in plants, we examined RNA-Seq data from rice seedlings. The data included three biological replicates per sample type, approximately 30 million sequence alignments per replicate, and four sample types: roots and shoots treated with exogenous cytokinin delivered hydroponically or a mock treatment. Cytokinin treatment triggered expression changes in thousands of genes but had negligible effect on splicing patterns. However, many genes were alternatively spliced between mock-treated roots and shoots, indicating that our methods were sufficiently sensitive to detect differential splicing in a data set. Quantitative fragment analysis of reverse transcriptase-PCR products made from newly prepared rice samples confirmed nine of ten differential splicing events between rice roots and shoots. Differential alternative splicing typically changed the relative abundance of splice variants that co-occurred in a data set. Analysis of a similar (but less deeply sequenced) RNA-Seq data set from Arabidopsis showed the same pattern. In both the Arabidopsis and rice RNA-Seq data sets, most genes annotated as alternatively spliced had small minor variant frequencies. Of splicing choices with abundant support for minor forms, most alternative splicing events were located within the protein-coding sequence and maintained the annotated reading frame. A tool for visualizing protein annotations in the context of genomic sequence (ProtAnnot) and a genome browser (Integrated Genome Browser) were used together to visualize and assess effects of differential splicing on gene function. 
In general, differentially spliced regions coincided with conserved regions in the encoded proteins, indicating that differential alternative splicing is likely to affect protein function between root and shoot tissue in rice.
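The notion of a minor variant frequency used in this abstract can be sketched as follows. This is only an illustration, not the authors' pipeline: the function names and both thresholds (`min_reads`, `min_frequency`) are hypothetical, and the inputs are assumed to be read counts supporting each of two mutually exclusive splicing choices.

```python
def minor_variant_frequency(reads_a, reads_b):
    """Fraction of supporting reads that back the less abundant splicing choice.

    Returns None when neither choice has any read support.
    """
    total = reads_a + reads_b
    if total == 0:
        return None
    return min(reads_a, reads_b) / total


def has_abundant_minor_support(reads_a, reads_b, min_reads=10, min_frequency=0.1):
    """True when the minor form has enough reads and a large enough share.

    Both thresholds are illustrative assumptions, not values from the study.
    """
    freq = minor_variant_frequency(reads_a, reads_b)
    if freq is None:
        return False
    return min(reads_a, reads_b) >= min_reads and freq >= min_frequency


if __name__ == "__main__":
    # Made-up counts: 90 reads support one choice, 10 support the other.
    print(minor_variant_frequency(90, 10))
    print(has_abundant_minor_support(995, 5))
```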


2019 ◽  
Author(s):  
Carlos Martí-Gómez ◽  
Enrique Lara-Pezzi ◽  
Fátima Sánchez-Cabo

Alternative splicing (AS) is an important mechanism in the generation of transcript diversity across mammals. AS patterns are dynamically regulated during development and in response to environmental changes. Defects or perturbations in its regulation may lead to cancer or neurological disorders, among other pathological conditions. The regulatory mechanisms controlling AS in a given biological context are typically inferred using a two-step framework: differential AS analysis followed by enrichment methods. These strategies require setting rather arbitrary thresholds and are prone to error propagation along the analysis. To overcome these limitations, we propose dSreg, a Bayesian model that integrates RNA-seq with data from regulatory features, e.g. binding sites of RNA binding proteins (RBPs). dSreg identifies the key underlying regulators controlling AS changes and quantifies their activity while simultaneously estimating the changes in exon inclusion rates. dSreg increased both the sensitivity and the specificity of the identified alternative splicing changes in simulated data, even at low read coverage. dSreg also showed improved performance when analyzing a collection of RBP knock-down experiments from ENCODE, as opposed to traditional enrichment methods such as Over-representation Analysis (ORA) and Gene Set Enrichment Analysis (GSEA). dSreg opens the possibility to integrate a large amount of readily available RNA-seq datasets at low coverage for AS analysis and allows more cost-effective RNA-seq experiments. dSreg was implemented in Python using Stan and is freely available to the community at https://bitbucket.org/cmartiga/dsreg.


2020 ◽  
Vol 21 (S9) ◽  
Author(s):  
Fan Zhang ◽  
Chris K. Deng ◽  
Mu Wang ◽  
Bin Deng ◽  
Robert Barber ◽  
...  

Abstract Background: Alternative splicing isoforms have been reported as a new and robust class of diagnostic biomarkers. Over 95% of human genes are estimated to be alternatively spliced as a powerful means of producing functionally diverse proteins from a single gene. The emergence of next-generation sequencing technologies, especially RNA-seq, provides novel insights into large-scale detection and analysis of alternative splicing at the transcriptional level. Advances in proteomic technologies, such as liquid chromatography coupled tandem mass spectrometry (LC–MS/MS), have shown tremendous power for the parallel characterization of large numbers of proteins in biological samples. Although poor correspondence has generally been found in previous qualitative comparative analyses between proteomics and microarray data, significantly higher degrees of correlation have been observed at the exon level. Combining protein and RNA data by searching LC–MS/MS data against a customized protein database from RNA-Seq may produce a subset of alternatively spliced protein isoform candidates that have higher confidence. Results: We developed a bioinformatics workflow to discover alternative splicing biomarkers from LC–MS/MS using RNA-Seq. First, we retrieved high-confidence, novel alternative splicing biomarkers from the breast cancer RNA-Seq database. Then, we translated these sequences into in silico Isoform Junction Peptides, and created a customized alternative splicing database for MS searching. Lastly, we ran the Open Mass spectrometry Search Algorithm against the customized alternative splicing database with breast cancer plasma proteome. Twenty-six alternative splicing biomarker peptides with one single intron event and one exon skipping event were identified. 
Further interpretation of biological pathways with our Integrated Pathway Analysis Database showed that these 26 peptides are associated with Cancer, Signaling, Metabolism, Regulation, Immune System and Hemostasis pathways, which are consistent with the 256 alternative splicing biomarkers from the RNA-Seq. Conclusions: This paper presents a bioinformatics workflow for using RNA-seq data to discover novel alternative splicing biomarkers from the breast cancer proteome. As a complement to the synthetic alternative splicing database technique for alternative splicing identification, this method combines the advantages of two platforms, mass spectrometry and next-generation sequencing, and can help identify potentially highly sample-specific alternative splicing isoform biomarkers at an early stage of cancer.


2016 ◽  
Author(s):  
Juan L. Trincado ◽  
Juan C. Entizne ◽  
Gerald Hysenaj ◽  
Babita Singh ◽  
Miha Skalic ◽  
...  

Abstract Despite the many approaches to study differential splicing from RNA-seq, many challenges remain unsolved, including computing capacity and sequencing depth requirements. Here we present SUPPA2, a new method for differential splicing analysis that addresses these challenges and enables streamlined analysis across multiple conditions taking into account biological variability. Using experimental and simulated data, SUPPA2 achieves higher accuracy compared to other methods, especially at low sequencing depth and short read length, with important implications for cost-effective use of RNA-seq for splicing, and was able to identify novel Transformer2-regulated exons. We further analyzed two differentiation series to support the applicability of SUPPA2 beyond binary comparisons. This identified clusters of alternative splicing events enriched in microexons induced during differentiation of bipolar neurons, and a cluster enriched in intron retention events that are present at late stages during erythroblast differentiation. Our data suggest that SUPPA2 is a valuable tool for the robust investigation of the biological complexity of alternative splicing.


2017 ◽  
Author(s):  
Dries Vaneechoutte ◽  
April R. Estrada ◽  
Ying-Chen Lin ◽  
Ann E. Loraine ◽  
Klaas Vandepoele

SUMMARY Alternative splicing and the usage of alternate transcription start or stop sites allow a single gene to produce multiple transcript isoforms. Most plant genes express certain isoforms at a significantly higher level than others, but under specific conditions this expression dominance can change, resulting in a different set of dominant isoforms. These events of Differential Transcript Usage (DTU) have been observed for thousands of Arabidopsis thaliana, Zea mays and Vitis vinifera genes and have been linked to development and stress response. However, neither the characteristics of these genes nor the implications of DTU on their protein coding sequences or functions are currently well understood. Here we present a dataset of isoform dominance and DTU for all genes in the AtRTD2 reference transcriptome based on a protocol that was benchmarked on simulated data and validated through comparison with a published RT-PCR panel. We report DTU events for 8,148 genes across 206 public RNA-Seq samples and find that protein sequences are affected in 22% of the cases. The observed DTU events show high consistency across replicates and reveal reproducible patterns in response to treatment and development. We also demonstrate that genes with different evolutionary ages, expression breadths, and functions show large differences in the frequency at which they undergo DTU and in the effect that these events have on their protein sequences. Finally, we showcase how the generated dataset can be used to explore DTU events for genes of interest or to find genes with specific DTU in samples of interest. SIGNIFICANCE STATEMENT Differential transcript usage through alternative splicing has been reported for thousands of genes in plants, yet genome-wide datasets to study the implications for gene functions are thus far not available. 
Here we present the first reference dataset of isoform dominance and differential transcript usage for Arabidopsis thaliana based on 206 public RNA-Seq samples and provide insights into the occurrence and functional consequences of alternative splicing.


2019 ◽  
Vol 20 (11) ◽  
pp. 2685 ◽  
Author(s):  
Qi Song ◽  
Fang Lv ◽  
Muhammad Tahir ul Qamar ◽  
Feng Xing ◽  
Run Zhou ◽  
...  

Micro-exons are exons no longer than 51 nucleotides. They are generally ignored in genome annotation because of their short length, whereas recent studies indicate that they have special splicing properties and important functions. Considering that there has been no genome-wide study of micro-exons in plants up to now, we screened and analyzed genes containing micro-exons in two indica rice varieties in this study. According to the annotation of Zhenshan 97 (ZS97) and Minghui 63 (MH63), ~23% of genes possess micro-exons. We then identified micro-exons from RNA-seq data and found that >65% of micro-exons had been annotated and most novel micro-exons were located in gene regions. About 60% of micro-exons were constitutively spliced, and the others were alternatively spliced in different tissues. In addition, we observed that approximately 54% of genes harboring micro-exons tended to be ancient genes, and 13% were Oryza genus-specific. Micro-exon genes were highly conserved in the Oryza genus with consistent domains. In particular, the predicted protein structures showed that alternative splicing of in-frame micro-exons led to a local structural recombination, which might affect some core structure of domains, and alternative splicing of frame-shifting micro-exons usually resulted in premature termination of translation by introducing a stop codon or missing functional domains. Overall, our study provided the genome-wide distribution, evolutionary conservation, and potential functions of micro-exons in rice.
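The length criterion and the in-frame versus frame-shifting distinction mentioned above can be sketched in a few lines. This is an illustrative sketch rather than the authors' screening procedure: the function name, the 1-based inclusive coordinate convention, and the rule that an exon is "in-frame" when its length is a multiple of three are assumptions stated here for clarity.

```python
MICRO_EXON_MAX_LENGTH = 51  # nucleotides, the threshold given in the abstract


def classify_exon(start, end):
    """Classify an exon from 1-based inclusive coordinates (assumed convention).

    An exon whose length is a multiple of three can be included or skipped
    without shifting the downstream reading frame.
    """
    if end < start:
        raise ValueError("end must not be smaller than start")
    length = end - start + 1
    return {
        "length": length,
        "is_micro_exon": length <= MICRO_EXON_MAX_LENGTH,
        "in_frame": length % 3 == 0,
    }


if __name__ == "__main__":
    # Made-up coordinates: a 27 nt exon, a 10 nt exon and a 52 nt exon.
    for start, end in [(101, 127), (10, 19), (1, 52)]:
        print(start, end, classify_exon(start, end))
```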


2019 ◽  
Author(s):  
Patricia Sieber ◽  
Emanuel Barth ◽  
Manja Marz

ABSTRACT Aging is characterized by a decline of cellular homeostasis over time, leading to various severe disorders and death. Alternative splicing is an important regulatory level of gene expression and thus takes on a key role in the maintenance of accurate cell and tissue function. Missplicing of certain genes has already been linked to several age-associated diseases, such as progeria, Alzheimer’s disease, Parkinson’s disease and cancer. Nevertheless, many studies focus only on transcriptional variations of single genes or the expression changes of spliceosomal genes, coding for the proteins that assemble into the spliceosomal machinery. Little is known about the general change of present and switching isoforms in different tissues over time. In this descriptive RNA-Seq study, we report differences and commonalities of isoform usage during aging among different tissues within one species and compare changes of alternative splicing among different, evolutionarily distinct species. Although we identified a multitude of differentially spliced genes among different time points, we observed little to no general changes in the transcriptomic landscape of the investigated samples. Although there is undoubtedly considerable influence of specifically spliced genes on certain age-associated processes, this work shows that alternative splicing remains stable for the majority of genes with aging.

