SPLICE-q: a Python tool for genome-wide quantification of splicing efficiency

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Verônica R. de Melo Costa ◽  
Julianus Pfeuffer ◽  
Annita Louloupi ◽  
Ulf A. V. Ørom ◽  
Rosario M. Piro

Abstract Background Introns are generally removed from primary transcripts to form mature RNA molecules in a post-transcriptional process called splicing. An efficient splicing of primary transcripts is an essential step in gene expression and its misregulation is related to numerous human diseases. Thus, to better understand the dynamics of this process and the perturbations that might be caused by aberrant transcript processing it is important to quantify splicing efficiency. Results Here, we introduce SPLICE-q, a fast and user-friendly Python tool for genome-wide SPLICing Efficiency quantification. It supports studies focusing on the implications of splicing efficiency in transcript processing dynamics. SPLICE-q uses aligned reads from strand-specific RNA-seq to quantify splicing efficiency for each intron individually and allows the user to select different levels of restrictiveness concerning the introns’ overlap with other genomic elements such as exons of other genes. We applied SPLICE-q to globally assess the dynamics of intron excision in yeast and human nascent RNA-seq. We also show its application using total RNA-seq from a patient-matched prostate cancer sample. Conclusions Our analyses illustrate that SPLICE-q is suitable to detect a progressive increase of splicing efficiency throughout a time course of nascent RNA-seq and it might be useful when it comes to understanding cancer progression beyond mere gene expression levels. SPLICE-q is available at: https://github.com/vrmelo/SPLICE-q
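The abstract above does not state how splicing efficiency is computed. As a minimal sketch, one common convention is to take, for each intron, the fraction of junction-covering reads that are split (spliced) rather than unsplit (still containing the intron boundary). This ratio, the function name `splicing_efficiency`, and its parameters are illustrative assumptions, not SPLICE-q's documented formula or API.

```python
from typing import Optional


def splicing_efficiency(split_reads: int, unsplit_reads: int) -> Optional[float]:
    """Return the fraction of junction reads that are spliced.

    ASSUMPTION: efficiency = split / (split + unsplit). This is a common
    convention and is not taken from the SPLICE-q abstract itself.
    Returns None when no reads cover the intron's junctions.
    """
    if split_reads < 0 or unsplit_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = split_reads + unsplit_reads
    if total == 0:
        return None  # no coverage: efficiency is undefined
    return split_reads / total


# Hypothetical intron with 30 split and 10 unsplit junction reads.
print(splicing_efficiency(30, 10))  # 0.75
```

Under this convention a value near 1 indicates an intron that is almost fully excised, and a value near 0 indicates mostly unspliced transcripts, which is the kind of progression one would expect to see over a nascent RNA-seq time course.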

2020 ◽  
Author(s):  
Verônica R Melo Costa ◽  
Julianus Pfeuffer ◽  
Annita Louloupi ◽  
Ulf A V Ørom ◽  
Rosario M Piro

Abstract Background Introns are generally removed from primary transcripts to form mature RNA molecules in a post-transcriptional process called splicing. An efficient splicing of primary transcripts is an essential step in gene expression and its misregulation is related to numerous human diseases. Thus, to better understand the dynamics of this process and the perturbations that might be caused by aberrant transcript processing it is important to quantify splicing efficiency. Results Here, we introduce SPLICE-q, a fast and user-friendly Python tool for genome-wide SPLICing Efficiency quantification. It supports studies focusing on the implications of splicing efficiency in transcript processing dynamics. SPLICE-q uses aligned reads from strand-specific RNA-seq to quantify splicing efficiency for each intron individually and allows the user to select different levels of restrictiveness concerning the introns’ overlap with other genomic elements such as exons of other genes. We applied SPLICE-q to globally assess the dynamics of intron excision in yeast and human nascent RNA-seq. We also show its application using total RNA-seq from a patient-matched prostate cancer sample. Conclusions Our analyses illustrate that SPLICE-q is suitable to detect a progressive increase of splicing efficiency throughout a time course of nascent RNA-seq and it might be useful when it comes to understanding cancer progression beyond mere gene expression levels. SPLICE-q is available at: https://github.com/vrmelo/SPLICE-q


2021 ◽  
Author(s):  
Dennis A Sun ◽  
Nipam H Patel

Abstract Emerging research organisms enable the study of biology that cannot be addressed using classical “model” organisms. The development of novel data resources can accelerate research in such animals. Here, we present new functional genomic resources for the amphipod crustacean Parhyale hawaiensis, facilitating the exploration of gene regulatory evolution using this emerging research organism. We use Omni-ATAC-Seq, an improved form of the Assay for Transposase-Accessible Chromatin coupled with next-generation sequencing (ATAC-Seq), to identify accessible chromatin genome-wide across a broad time course of Parhyale embryonic development. This time course encompasses many major morphological events, including segmentation, body regionalization, gut morphogenesis, and limb development. In addition, we use short- and long-read RNA-Seq to generate an improved Parhyale genome annotation, enabling deeper classification of identified regulatory elements. We leverage a variety of bioinformatic tools to discover differential accessibility, predict nucleosome positioning, infer transcription factor binding, cluster peaks based on accessibility dynamics, classify biological functions, and correlate gene expression with accessibility. Using a Minos transposase reporter system, we demonstrate the potential to identify novel regulatory elements using this approach, including distal regulatory elements. This work provides a platform for the identification of novel developmental regulatory elements in Parhyale, and offers a framework for performing such experiments in other emerging research organisms.
Primary Findings
- Omni-ATAC-Seq identifies cis-regulatory elements genome-wide during crustacean embryogenesis
- Combined short- and long-read RNA-Seq improves the Parhyale genome annotation
- ImpulseDE2 analysis identifies dynamically regulated candidate regulatory elements
- NucleoATAC and HINT-ATAC enable inference of nucleosome occupancy and transcription factor binding
- Fuzzy clustering reveals peaks with distinct accessibility and chromatin dynamics
- Integration of accessibility and gene expression reveals possible enhancers and repressors
- Omni-ATAC can identify known and novel regulatory elements


2021 ◽  
Vol 8 ◽  
Author(s):  
Yiming Yan ◽  
Huihua Zhang ◽  
Shuang Gao ◽  
Huanmin Zhang ◽  
Xinheng Zhang ◽  
...  

Background: Avian leukosis virus subgroup J (ALV-J) is an oncogenic virus that causes serious economic losses in the poultry industry; unfortunately, there is no effective vaccine against ALV-J. DNA methylation plays a crucial role in several biological processes, and an increasing number of diseases have been proven to be related to alterations in DNA methylation. In this study, we screened ALV-J-positive and -negative chickens. Subsequently, we generated and provided the genome-wide gene expression and DNA methylation profiles by MeDIP-seq and RNA-seq of ALV-J-positive and -negative chicken samples; 8,304 differentially methylated regions (DMRs) were identified by MeDIP-seq analysis (p ≤ 0.005) and 515 differentially expressed genes were identified by RNA-seq analysis (p ≤ 0.05). As a result of an integration analysis, we screened six candidate genes to identify ALV-J-negative chickens that possessed differential methylation in the promoter region. Furthermore, TGFB2 played an important role in tumorigenesis and cancer progression, which suggested TGFB2 may be an indicator for identifying ALV-J infections.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Christian Escoto-Sandoval ◽  
Alan Flores-Díaz ◽  
M. Humberto Reyes-Valdés ◽  
Neftalí Ochoa-Alejo ◽  
Octavio Martínez

Abstract RNA-Seq experiments allow genome-wide estimation of relative gene expression. Estimation of gene expression at different time points generates time expression profiles of phenomena of interest, such as fruit development. However, such profiles can be complex to analyze and interpret. We developed a methodology that transforms original RNA-Seq data from time course experiments into standardized expression profiles, which can be easily interpreted and analyzed. To exemplify this methodology we used RNA-Seq data obtained from 12 accessions of chili pepper (Capsicum annuum L.) during fruit development. All relevant data, as well as functions to perform analyses and interpretations from this experiment, were gathered into a publicly available R package: “Salsa”. Here we explain the rationale of the methodology and exemplify the use of the package to obtain valuable insights into the multidimensional time expression changes that occur during chili pepper fruit development. We hope that this tool will be of interest for researchers studying fruit development in chili pepper as well as in other angiosperms.
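The abstract does not say how the expression profiles are standardized. As a rough illustration only, the sketch below assumes a per-gene z-score across time points (mean 0, standard deviation 1), which is one common way to make time courses comparable across genes. The function `standardize_profile` and the example values are hypothetical and are not taken from the Salsa package.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def standardize_profile(values: Sequence[float]) -> List[float]:
    """Rescale one gene's time course to mean 0 and standard deviation 1.

    ASSUMPTION: "standardized" is read here as a z-score over time points;
    the method in the paper may differ.
    A flat profile (zero variance) is returned as all zeros.
    """
    mu = mean(values)
    sd = pstdev(values)  # population standard deviation over time points
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Hypothetical expression values for one gene at four time points.
profile = standardize_profile([2.0, 4.0, 8.0, 6.0])
print([round(x, 3) for x in profile])
```

After this rescaling, two genes with very different absolute expression levels but the same temporal shape yield the same profile, which is what makes the shapes directly comparable.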


PLoS ONE ◽  
2021 ◽  
Vol 16 (5) ◽  
pp. e0250013
Author(s):  
Chia-Hsin Hsu ◽  
Hirotaka Tomiyasu ◽  
Chi-Hsun Liao ◽  
Chen-Si Lin

Doxorubicin resistance is a major challenge in the successful treatment of canine diffuse large B-cell lymphoma (cDLBCL). In the present study, MethylCap-seq and RNA-seq were performed to characterize the genome-wide DNA methylation and differential gene expression patterns respectively in CLBL-1 8.0, a doxorubicin-resistant cDLBCL cell line, and in CLBL-1 as control, to investigate the underlying mechanisms of doxorubicin resistance in cDLBCL. A total of 20289 hypermethylated differentially methylated regions (DMRs) were detected. Among these, 1339 hypermethylated DMRs were in promoter regions, of which 24 genes showed an inverse correlation between methylation and gene expression. These 24 genes were involved in cell migration, according to gene ontology (GO) analysis. Also, 12855 hypermethylated DMRs were in gene-body regions. Among these, 353 genes showed a positive correlation between methylation and gene expression. Functional analysis of these 353 genes highlighted that TGF-β and lysosome-mediated signal pathways are significantly associated with the drug resistance of CLBL-1. The tumorigenic role of TGF-β signaling pathway in CLBL-1 8.0 was further validated by treating the cells with a TGF-β inhibitor(s) to show the increased chemo-sensitivity and intracellular doxorubicin accumulation, as well as decreased p-glycoprotein expression. In summary, the present study performed an integrative analysis of DNA methylation and gene expression in CLBL-1 8.0 and CLBL-1. The candidate genes and pathways identified in this study hold potential promise for overcoming doxorubicin resistance in cDLBCL.


2019 ◽  
Author(s):  
Wei Wang ◽  
Gang Ren ◽  
Ni Hong ◽  
Wenfei Jin

Abstract Background: CCCTC-Binding Factor (CTCF), also known as 11-zinc finger protein, participates in many cellular processes, including insulator activity, transcriptional regulation and organization of chromatin architecture. Based on single cell flow cytometry and single cell RNA-FISH analyses, our previous study showed that deletion of a CTCF binding site led to a significant increase in cellular variation of its target gene. However, the effect of CTCF on the genome-wide landscape of cell-to-cell variation is unclear. Results: We knocked down CTCF in EL4 cells using shRNA, and conducted single cell RNA-seq on both wild type (WT) cells and CTCF-Knockdown (CTCF-KD) cells using the Fluidigm C1 system. Principal component analysis of single cell RNA-seq data showed that WT and CTCF-KD cells concentrated in two different clusters on PC1, indicating that gene expression profiles of WT and CTCF-KD cells were systematically different. Interestingly, GO terms including regulation of transcription, DNA binding, Zinc finger and transcription factor binding were significantly enriched in CTCF-KD-specific highly variable genes, indicating that tissue-specific genes such as transcription factors were highly sensitive to CTCF level. The dysregulation of transcription factors potentially explains why knockdown of CTCF leads to systematic changes in gene expression. In contrast, housekeeping functions such as rRNA processing, DNA repair and tRNA processing were significantly enriched in WT-specific highly variable genes, potentially due to a higher cellular variation of cell activity in WT cells compared to CTCF-KD cells. We further found that cellular variation-increased genes were significantly enriched in down-regulated genes, indicating that CTCF knockdown simultaneously reduced the expression levels and increased the expression noise of its regulated genes. Conclusions: To our knowledge, this is the first attempt to explore the genome-wide landscape of cellular variation after CTCF knockdown. Our study not only advances our understanding of CTCF function in maintaining gene expression and reducing expression noise, but also provides a framework for examining gene function.


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e11875
Author(s):  
Tomoko Matsuda

Large volumes of high-throughput sequencing data have been submitted to the Sequence Read Archive (SRA). The lack of experimental metadata associated with the data makes reuse and understanding data quality very difficult. In the case of RNA sequencing (RNA-Seq), which reveals the presence and quantity of RNA in a biological sample at any moment, it is necessary to consider that gene expression responds over a short time interval (several seconds to a few minutes) in many organisms. Therefore, to isolate RNA that accurately reflects the transcriptome at the point of harvest, raw biological samples should be processed as soon as possible by freezing in liquid nitrogen, immersing in RNA stabilization reagent, or lysing and homogenizing in RNA lysis buffer containing guanidine thiocyanate. As the number of samples handled simultaneously increases, the time until the RNA is protected can increase. Here, to evaluate the effect of different lag times in RNA protection on RNA-Seq data, we harvested CHO-S cells after 3, 5, 6, and 7 days of cultivation, added RNA lysis buffer in a time course of 15, 30, 45, and 60 min after harvest, and conducted RNA-Seq. These RNA samples showed high RNA integrity number (RIN) values indicating non-degraded RNA, and sequence data from libraries prepared with these RNA samples was of high quality according to FastQC. We observed that, at the same cultivation day, global trends of gene expression were similar across the time course of addition of RNA lysis buffer; however, the expression of some genes was significantly different between the time-course samples of the same cultivation day; most of these differentially expressed genes were related to apoptosis. We conclude that the time lag between sample harvest and RNA protection influences the expression of specific genes. It is, therefore, necessary to know not only RIN values of RNA and the quality of the sequence data but also how the experiment was performed when acquiring RNA-Seq data from the database.


Blood ◽  
2011 ◽  
Vol 118 (21) ◽  
pp. 2367-2367
Author(s):  
Mira Jeong ◽  
Deqiang Sun ◽  
Min Luo ◽  
Aysegul Ergen ◽  
Hongcang Gu ◽  
...  

Abstract 2367 Hematopoietic stem cell (HSC) aging is a complex process linked to a number of changes in gene expression and functional decline of self-renewal and differentiation potential. While epigenetic changes have been implicated in HSC aging, little direct evidence has been generated. DNA methylation is one of the major underlying mechanisms associated with the regulation of gene expression, but changes in DNA methylation patterns with HSC aging have not been characterized. We hypothesize that revealing the genome-wide DNA methylation and transcriptome signatures will lead to a greater understanding of HSC aging. Here, we report the first genome-scale study of epigenomic dynamics during normal mouse HSC aging. We isolated SP-KSL-CD150+ HSC populations from 4-, 12-, and 24-month-old mouse bone marrow, carried out genome-wide reduced representation bisulfite sequencing (RRBS), and identified aging-associated differentially methylated CpGs. Three biological samples were sequenced from each aging group and we obtained 30–40 million high-quality reads with over 30X total coverage on ∼1.1M CpG sites, which gives us adequate statistical power to infer methylation ratios. Bisulfite conversion rate of non-CpG cytosines was >99%. We analyzed a variety of genomic features to find that CpG island promoters, gene bodies, 5'UTRs, and 3'UTRs generally were associated with hypermethylation in aging HSCs. Overall, out of 1,777 differentially methylated CpGs, 92.8% showed age-related hypermethylation and 7.2% showed age-related hypomethylation. Gene ontology analyses have revealed that differentially methylated CpGs were significantly enriched near genes associated with alternative splicing, DNA binding, RNA-binding, transcription regulation, Wnt signaling and pathways in cancer. Most interestingly, over 579 splice variants were detected as candidates for age-related hypermethylation (86%) and hypomethylation (14%) including Dnmt3a, Runx1, Pbx1 and Cdkn2a.
To quantify differentially expressed RNA transcripts across the entire transcriptome, we performed RNA-seq and analyzed exon arrays. The Spearman's correlation between the two methods was good (r=0.80). From exon arrays, we identified 586 genes that were down regulated and 363 genes that were up regulated with aging (p<0.001). Most interestingly, overall expression of the DNA methyltransferases Dnmt1, Dnmt3a, and Dnmt3b was down regulated with aging. We also found that Dnmt3a2, the short isoform of Dnmt3a, which lacks the N-terminal region of Dnmt3a and represents the major isoform in ES cells, is more highly expressed in young HSCs. For the RNA-seq analysis, we focused first on annotated transcripts derived from cloned mRNAs and we found that 307 genes were down regulated and 1015 genes were up regulated with aging (p<0.05). Secondly, we sought to identify differentially expressed isoforms and also novel transcribed regions (antisense and novel genes). To characterize the genes showing differential regulation, we analyzed their functional associations and observed that the highest scoring annotation cluster was enriched in genes associated with translation, the immune network and hematopoietic cell lineage. We expect that the results of these experiments will reveal the global effect of DNA methylation on transcript stability and the translational state of target genes. Our findings will lend insight into the molecular mechanisms responsible for the pathologic changes associated with aging in HSCs. Disclosures: No relevant conflicts of interest to declare.
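The methylation percentages reported in this abstract can be turned into approximate site counts with simple arithmetic. The sketch below uses only the quoted figures (1,777 differentially methylated CpGs, 92.8% hypermethylated, 7.2% hypomethylated). The resulting counts are derived by rounding and are not reported in the abstract itself, and the helper name `approximate_count` is illustrative.

```python
# Figures quoted in the abstract.
TOTAL_DM_CPGS = 1777
HYPER_FRACTION = 0.928
HYPO_FRACTION = 0.072


def approximate_count(total: int, fraction: float) -> int:
    """Convert a reported fraction back to an approximate integer count."""
    return round(total * fraction)


hyper = approximate_count(TOTAL_DM_CPGS, HYPER_FRACTION)
hypo = approximate_count(TOTAL_DM_CPGS, HYPO_FRACTION)

# The two derived counts should add back up to the reported total.
print(hyper, hypo, hyper + hypo)  # 1649 128 1777
```

Because the derived counts sum to the reported total, the two percentages are at least internally consistent with 1,777 sites.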

