RNAtor: an Android-based application for biologists to plan RNA sequencing experiments

F1000Research ◽  
2017 ◽  
Vol 6 ◽  
pp. 997
Author(s):  
Shruti Kane ◽  
Himanshu Garg ◽  
Neeraja M. Krishnan ◽  
Aditya Singh ◽  
Binay Panda

RNA sequencing (RNA-seq) is a powerful technology for identifying novel transcripts (coding, non-coding and splice variants), understanding transcript structures, and estimating gene/allele expression. Biologists face specific challenges while designing RNA-seq experiments. These challenges lie in determining the total number of sequenced reads and replicates required for detecting marginally differentially expressed transcripts, and in determining the adequate number of lanes to use in a sequencing flow cell. Despite previous attempts to address these challenges, easily accessible and biologist-friendly mobile applications do not exist. Thus, we developed RNAtor, a mobile application for Android platforms, to aid biologists in correctly designing their RNA-seq experiments. The recommendations from RNAtor are based on simulations and real data.

F1000Research ◽  
2017 ◽  
Vol 6 ◽  
pp. 997
Author(s):  
Shruti Kane ◽  
Himanshu Garg ◽  
Neeraja M. Krishnan ◽  
Aditya Singh ◽  
Binay Panda

RNA sequencing (RNA-seq) is a powerful technology that allows one to assess the RNA levels in a sample. Analysis of these levels can help in identifying novel transcripts (coding, non-coding and splice variants), understanding transcript structures, and estimating gene/allele expression. Biologists face specific challenges while designing RNA-seq experiments. These challenges lie in determining the total number of sequenced reads and technical replicates required for detecting marginally differentially expressed transcripts. Despite previous attempts to address these challenges, easily accessible and biologist-friendly mobile applications do not exist. Thus, we developed RNAtor, a mobile application for Android platforms, to aid biologists in correctly designing their RNA-seq experiments. The recommendations from RNAtor are based on simulations and real data.


2016 ◽  
Author(s):  
Shruti Kane ◽  
Himanshu Garg ◽  
Neeraja M. Krishnan ◽  
Aditya Singh ◽  
Binay Panda

Abstract: RNA sequencing (RNA-seq) is a powerful technology for identification of novel transcripts (coding, non-coding and splice variants), understanding of transcript structures and estimation of gene and/or allelic expression. Biologists face specific challenges in determining the number of replicates to use, the total number of sequencing reads to generate for detecting marginally differentially expressed transcripts, and the number of lanes in a sequencing flow cell needed to produce the right amount of information. Although past studies attempted to answer some of these questions, accessible and biologist-friendly mobile applications that answer them are lacking. Keeping this in mind, we have developed RNAtor, a mobile application for Android platforms, to aid biologists in correctly designing their RNA-seq experiments. The recommendations from RNAtor are based on simulations and real data.

Availability and Implementation: The Android version of RNAtor is available on the Google Play Store, and the code is available from GitHub (https://github.com/binaypanda/RNAtor).


2015 ◽  
Vol 90 (3) ◽  
pp. 1278-1289 ◽  
Author(s):  
Catrin Stutika ◽  
Andreas Gogol-Döring ◽  
Laura Botschen ◽  
Mario Mietzsch ◽  
Stefan Weger ◽  
...  

ABSTRACT: Adeno-associated virus (AAV) is recognized for its bipartite life cycle with productive replication dependent on coinfection with adenovirus (Ad) and AAV latency being established in the absence of a helper virus. The shift from latent to Ad-dependent AAV replication is mostly regulated at the transcriptional level. The current AAV transcription map displays highly expressed transcripts as found upon coinfection with Ad. So far, AAV transcripts have only been characterized on the plus strand of the AAV single-stranded DNA genome. The AAV minus strand is assumed not to be transcribed. Here, we apply Illumina-based RNA sequencing (RNA-Seq) to characterize the entire AAV2 transcriptome in the absence or presence of Ad. We find known and identify novel AAV transcripts, including additional splice variants, the most abundant of which leads to expression of a novel 18-kDa Rep/VP fusion protein. Furthermore, we identify for the first time transcription on the AAV minus strand with clustered reads upstream of the p5 promoter, confirmed by 5′ rapid amplification of cDNA ends and RNase protection assays. The p5 promoter displays considerable activity in both directions, a finding indicative of divergent transcription. Upon infection with AAV alone, low-level transcription of both AAV strands is detectable and is strongly stimulated upon coinfection with Ad.

IMPORTANCE: Next-generation sequencing (NGS) allows unbiased genome-wide analyses of transcription profiles, used here for an in-depth analysis of the AAV2 transcriptome during latency and productive infection. RNA-Seq analysis led to the discovery of novel AAV transcripts and splice variants, including a derived, novel 18-kDa Rep/VP fusion protein. Unexpectedly, transcription from the AAV minus strand was discovered, indicative of divergent transcription from the p5 promoter. This finding opens the door for novel concepts of the switch between AAV latency and productive replication. In the absence of a suitable animal model to study AAV in vivo, combined in cellulae and in silico studies will help to forward the understanding of the unique, bipartite AAV life cycle.


2012 ◽  
Vol 111 (suppl_1) ◽  
Author(s):  
Emma L Robinson ◽  
Syed Haider ◽  
Hillary Hei ◽  
Richard T Lee ◽  
Roger S Foo

Heart failure comprises clinically distinct inciting causes, but a consistent pattern of change in myocardial gene expression supports the hypothesis that unifying biochemical mechanisms underlie disease progression. The recent RNA-seq revolution has enabled whole transcriptome profiling, using deep-sequencing technologies. Up to 70% of the genome is now known to be transcribed into RNA, a significant proportion of which is long non-coding RNAs (lncRNAs), defined as polyribonucleotides of ≥200 nucleotides. This project aims to discover whether the myocardial expression of lncRNAs changes in the failing heart. Paired-end RNA-seq from a 300-400 bp library of ‘stretched’ mouse myocyte total RNA was carried out to generate 76-mer sequence reads. Mechanically stretching myocytes with equibiaxial stretch apparatus mimics pathological hypertrophy in the heart. Transcripts were assembled and aligned to reference genome mm9 (UCSC), abundance was determined, and the expression of novel transcripts and alternative splice variants was compared with that of control (non-stretched) mouse myocytes. Five novel transcripts have been identified in our RNA-seq that are differentially expressed in stretched myocytes compared with non-stretched. These are regions of the genome that are currently unannotated and potentially are transcribed into non-coding RNAs. Roles of known lncRNAs include control of gene expression, either by direct interaction with complementary regions of the genome or association with chromatin remodelling complexes which act on the epigenome. Changes in expression of genes which contribute to the deterioration of the failing heart could be due to the actions of these novel lncRNAs, immediately suggesting a target for new pharmaceuticals.
Changes in the expression of these novel transcripts will be validated in a larger sample size of stretched myocytes vs non-stretched myocytes as well as in the hearts of transverse aortic constriction (TAC) mice vs Sham (surgical procedure without the aortic banding). In vivo investigations will then be carried out, using siLNA antisense technology to silence novel lncRNAs in mice.


2021 ◽  
Vol 23 (Supplement_6) ◽  
pp. vi9-vi9
Author(s):  
Min Kyung Lee ◽  
Nasim Azizgolshani ◽  
Fred Kolling ◽  
Lananh Nguyen ◽  
George Zanazzi ◽  
...  

Abstract Identifying transcriptomic alterations in pediatric central nervous system (pCNS) tumors often relies on transcriptomic profiles from bulk tissue RNA-sequencing that can be confounded by varying cell type proportions across tumor and normal brain tissues. We utilized single nuclei RNA-sequencing (snRNA-seq) and bulk RNA-seq in 33 pCNS tumors and 3 non-diseased pediatric brain tissue samples collected from the Norris Cotton Cancer Center to identify variation in gene expression in bulk tissue attributed to overrepresentation of specific cell-type populations when determining differentially expressed genes comparing pCNS tumors to normal pediatric brain tissues. snRNA-seq of 43,515 nuclei (mean = 1,209 nuclei/sample) revealed large proportions of astrocytes (median = 0.45, range = 0.24–0.93) and oligodendrocytes (median = 0.37, range = 0.00–0.66) in pCNS tumors. Compared to normal pediatric brain, proportions of astrocytes were significantly higher (P = 9.2E-03) and neurons were significantly lower (P = 9.4E-03) in pCNS tumors. Differential expression analyses comparing bulk RNA-sequencing data from pCNS tumors to normal pediatric brain identified 902 additional differentially expressed genes (# DE genes = 1,802) when adjusting for astrocyte and neuron proportions compared with unadjusted analysis (# DE genes = 900). In cell-type proportion unadjusted analysis, top DE genes included astrocyte-specific markers, GFAP and CIITA, both of which were found to be not significantly differentially expressed in cell-type proportion adjusted analysis. Indeed, pathways enrichment analysis revealed DE genes in unadjusted models were associated with processes of the neurons and astrocytes such as interferon signaling and postsynaptic signal transmission. After adjustment for astrocyte and neuron proportions, DE genes were associated with defensins and DNA replication-related processes. 
Our results highlight new potential biological pathways essential in pCNS tumors and indicate the significance of the distribution of varying cell types in tissue samples when conducting studies to investigate transcriptomic alterations in bulk tissue of pCNS tumors.


Author(s):  
Paul L. Auer ◽  
Rebecca W Doerge

RNA sequencing technology is providing data of unprecedented throughput, resolution, and accuracy. Although there are many different computational tools for processing these data, there are a limited number of statistical methods for analyzing them, and even fewer that acknowledge the unique nature of individual gene transcription. We introduce a simple and powerful statistical approach, based on a two-stage Poisson model, for modeling RNA sequencing data and testing for biologically important changes in gene expression. The advantages of this approach are demonstrated through simulations and real data applications.


Blood ◽  
2012 ◽  
Vol 120 (21) ◽  
pp. 4582-4582
Author(s):  
Wei Liao ◽  
Gwen Jordaan ◽  
Artur Jaroszewicz ◽  
Matteo Pellegrini ◽  
Sanjai Sharma

Abstract 4582: High-throughput sequencing of cellular mRNA provides a comprehensive analysis of the transcriptome. Besides identifying differentially expressed genes in different cell types, it also provides information on mRNA isoforms and splicing alterations. We have analyzed mRNA from two CLL specimens and from normal peripheral blood B cells by this approach and performed data analysis to identify differentially expressed and spliced genes. The results showed that the CLL specimens express approximately 40% more transcripts than normal B cells. The FPKM data (fragments per kilobase of exon per million) revealed higher transcript expression on chromosome 12 in CLL#1, indicating the presence of trisomy 12, which was confirmed by fluorescent in-situ hybridization assay. With a two-fold change in FPKM as a cutoff and a p value cutoff of 0.05 as compared to the normal B cell control, 415 and 174 genes in CLL#1 and 676 and 235 genes in CLL#2 were up- and downregulated, respectively. In these two CLL specimens, 45% to 75% of differentially expressed genes are common to both, indicating that genetically disparate CLL specimens share a high percentage of a core set of genes that are potentially important for CLL biology. Selected differentially expressed genes with increased expression (selectin P ligand, SELPLG, and adhesion molecule interacts with CXADR antigen 1, AMICA) and decreased expression (Fos, Jun, CD69 and Rhob), based on the FPKM from RNA-sequencing data, were also analyzed in additional CLL specimens by real-time PCR analysis. The expression data from RNA-seq closely match the fold-change in expression as measured by RT-PCR analysis and confirm the validity of the RNA-seq analysis. Interestingly, Fos was identified as one of the most downregulated genes in CLL. Using the Cufflinks and Cuffdiff software, the splicing patterns of genes in CLL specimens and normal B cells were analyzed.
Approximately 1100 to 1250 genes in the two CLL specimens were significantly differentially spliced as compared to normal B cells. In this analysis as well, there is a core set of 800 common genes which are differentially spliced in the two CLL specimens. The RNA-sequencing analysis accurately identifies differentially expressed novel genes and splicing variations that will help us understand the biology of CLL. Disclosures: No relevant conflicts of interest to declare.
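
The FPKM normalization and the two-fold / p = 0.05 cutoffs mentioned in this abstract can be illustrated with a short sketch. This is not the authors' pipeline (they report using Cufflinks and Cuffdiff); the function names, the pseudocount, and the strict `<` comparison on the p value are assumptions made only for illustration.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million mapped fragments."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def is_differentially_expressed(fpkm_case, fpkm_control, p_value,
                                fold_cutoff=2.0, p_cutoff=0.05,
                                pseudocount=1e-9):
    """Apply a fold-change cutoff (in either direction) and a p-value cutoff.

    The pseudocount is a hypothetical guard against division by zero;
    the abstract does not say how zero FPKM values were handled.
    """
    ratio = (fpkm_case + pseudocount) / (fpkm_control + pseudocount)
    fold = max(ratio, 1.0 / ratio)  # treat up- and downregulation alike
    return fold >= fold_cutoff and p_value < p_cutoff


# Hypothetical gene: 500 fragments, 2,000 bp of exon, 20 million mapped fragments.
case_fpkm = fpkm(500, 2000, 20_000_000)
print(case_fpkm)
print(is_differentially_expressed(case_fpkm, 5.0, p_value=0.01))
```

In this sketch the fold change is made symmetric so that a gene at half the control level is flagged the same way as a gene at twice the control level.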


2015 ◽  
Vol 2015 ◽  
pp. 1-5 ◽  
Author(s):  
Yuxiang Tan ◽  
Yann Tambouret ◽  
Stefano Monti

The performance evaluation of fusion detection algorithms from high-throughput sequencing data crucially relies on the availability of data with known positive and negative cases of gene rearrangements. The use of simulated data circumvents some shortcomings of real data by generation of an unlimited number of true and false positive events, and the consequent robust estimation of accuracy measures, such as precision and recall. Although a few simulated fusion datasets from RNA Sequencing (RNA-Seq) are available, they are of limited sample size. This makes it difficult to systematically evaluate the performance of RNA-Seq based fusion-detection algorithms. Here, we present SimFuse to address this problem. SimFuse utilizes real sequencing data as the fusions’ background to closely approximate the distribution of reads from a real sequencing library and uses a reference genome as the template from which to simulate fusions’ supporting reads. To assess the supporting read-specific performance, SimFuse generates multiple datasets with various numbers of fusion supporting reads. Compared to an extant simulated dataset, SimFuse gives users control over the supporting read features and the sample size of the simulated library, based on which the performance metrics needed for the validation and comparison of alternative fusion-detection algorithms can be rigorously estimated.


Reproduction ◽  
2017 ◽  
Vol 153 (1) ◽  
pp. 35-48 ◽  
Author(s):  
Ru Zheng ◽  
Yue Li ◽  
Huiying Sun ◽  
Xiaoyin Lu ◽  
Bao-Fa Sun ◽  
...  

The syncytiotrophoblast (STB) plays a key role in maintaining the function of the placenta during human pregnancy. However, the molecular network that orchestrates STB development remains elusive. The aim of this study was to obtain broad and deep insight into human STB formation via transcriptomics. We adopted RNA sequencing (RNA-Seq) to investigate genes and isoforms involved in forskolin (FSK)-induced fusion of BeWo cells. BeWo cells were treated with 50 μM FSK or dimethyl sulfoxide (DMSO) as a vehicle control for 24 and 48 h, and the mRNAs at 0, 24 and 48 h were sequenced. We detected 28,633 expressed genes and identified 1902 differentially expressed genes (DEGs) after FSK treatment for 24 and 48 h. Among the 1902 DEGs, 461 were increased and 395 were decreased at 24 h, whereas 879 were upregulated and 763 were downregulated at 48 h. When the 856 DEGs identified at 24 h were traced individually at 48 h, they separated into 6 dynamic patterns via a K-means algorithm, and most were enriched in down–even and up–even patterns. Moreover, the gene ontology (GO) terms syncytium formation, cell junction assembly, cell fate commitment, calcium ion transport, regulation of epithelial cell differentiation and cell morphogenesis involved in differentiation were clustered, and the MAPK pathway was most significantly regulated. Analyses of alternative splicing isoforms detected 123,200 isoforms, of which 1376 were differentially expressed. The present deep analysis of the RNA-Seq data of BeWo cell fusion provides important clues for understanding the mechanisms underlying human STB formation.


2016 ◽  
Vol 14 (06) ◽  
pp. 1650034 ◽  
Author(s):  
Naim Al Mahi ◽  
Munni Begum

One of the primary objectives of a ribonucleic acid (RNA) sequencing or RNA-Seq experiment is to identify differentially expressed (DE) genes in two or more treatment conditions. It is common practice to assume that all read counts from RNA-Seq data follow an overdispersed (OD) Poisson or negative binomial (NB) distribution, which is sometimes misleading because within each condition, some genes may have unvarying transcription levels with no overdispersion. In such a case, it is more appropriate and logical to consider two sets of genes: OD and non-overdispersed (NOD). We propose a new two-step integrated approach to distinguish DE genes in RNA-Seq data using standard Poisson and NB models for NOD and OD genes, respectively. This is an integrated approach because the method can be merged with any other NB-based method for detecting DE genes. We design a simulation study and analyze two real RNA-Seq data sets to evaluate the proposed strategy. We compare the performance of this new method combined with the three R software packages edgeR, DESeq2, and DSS against those packages with their default settings. For both the simulated and real data sets, the integrated approaches perform better than, or at least as well as, the regular methods embedded in these R packages.
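
The idea of separating OD from NOD genes can be sketched with a simple dispersion index (sample variance divided by mean), which is about 1 for Poisson-distributed counts and larger under overdispersion. This is only an illustration: the abstract does not state how the authors classify genes, and the function names and the threshold of 1.5 are hypothetical.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by mean; near 1 for Poisson-like counts."""
    m = mean(counts)
    if m == 0:
        return 0.0  # all-zero gene: no evidence of overdispersion
    return variance(counts) / m


def split_genes(count_table, threshold=1.5):
    """Split genes into overdispersed (OD) and non-overdispersed (NOD) sets.

    count_table maps a gene name to its replicate counts within one condition
    (at least two replicates per gene). The threshold is an assumed value,
    not one taken from the paper.
    """
    od, nod = [], []
    for gene, counts in count_table.items():
        if dispersion_index(counts) > threshold:
            od.append(gene)
        else:
            nod.append(gene)
    return od, nod


# Hypothetical replicate counts for two genes in a single condition.
table = {"geneA": [10, 11, 9, 10], "geneB": [5, 40, 12, 3]}
od_genes, nod_genes = split_genes(table)
print(od_genes, nod_genes)
```

Under the approach described above, genes in the NOD set would then be tested with a standard Poisson model and genes in the OD set with an NB model.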

