Using RNA-Seq to Discover Genetic Polymorphisms That Produce Hidden Splice Variants

Author(s):  
Shayna Stein ◽  
Emad Bahrami-Samani ◽  
Yi Xing
Genes ◽  
2021 ◽  
Vol 12 (2) ◽  
pp. 320
Author(s):  
Lorissa I. McDougall ◽  
Ryan M. Powell ◽  
Magdalena Ratajska ◽  
Chi F. Lynch-Sutherland ◽  
Sultana Mehbuba Hossain ◽  
...  

Melanoma comprises <5% of cutaneous malignancies, yet it causes a significant proportion of skin cancer-related deaths worldwide. While new therapies for melanoma have been developed, not all patients respond well. Thus, further research is required to better predict patient outcomes. Using long-range nanopore sequencing, RT-qPCR, and RNA sequencing analyses, we examined the transcription of BARD1 splice isoforms in melanoma cell lines and patient tissue samples. Seventy-six BARD1 mRNA variants were identified in total, with several previously characterised isoforms (γ, φ, δ, ε, and η) contributing to a large proportion of the expressed transcripts. In addition, we identified four novel splice events, namely, Δ(E3_E9), ▼(i8), IVS10+131▼46, and IVS10▼176, occurring in various combinations in multiple transcripts. We found that short-read RNA-Seq analyses were limited in their ability to predict isoforms containing multiple non-contiguous splicing events, as compared to long-range nanopore sequencing. These studies suggest that further investigations into the functional significance of the identified BARD1 splice variants in melanoma are warranted.


2015 ◽  
Vol 90 (3) ◽  
pp. 1278-1289 ◽  
Author(s):  
Catrin Stutika ◽  
Andreas Gogol-Döring ◽  
Laura Botschen ◽  
Mario Mietzsch ◽  
Stefan Weger ◽  
...  

ABSTRACT Adeno-associated virus (AAV) is recognized for its bipartite life cycle with productive replication dependent on coinfection with adenovirus (Ad) and AAV latency being established in the absence of a helper virus. The shift from latent to Ad-dependent AAV replication is mostly regulated at the transcriptional level. The current AAV transcription map displays highly expressed transcripts as found upon coinfection with Ad. So far, AAV transcripts have only been characterized on the plus strand of the AAV single-stranded DNA genome. The AAV minus strand is assumed not to be transcribed. Here, we apply Illumina-based RNA sequencing (RNA-Seq) to characterize the entire AAV2 transcriptome in the absence or presence of Ad. We find known and identify novel AAV transcripts, including additional splice variants, the most abundant of which leads to expression of a novel 18-kDa Rep/VP fusion protein. Furthermore, we identify for the first time transcription on the AAV minus strand with clustered reads upstream of the p5 promoter, confirmed by 5′ rapid amplification of cDNA ends and RNase protection assays. The p5 promoter displays considerable activity in both directions, a finding indicative of divergent transcription. Upon infection with AAV alone, low-level transcription of both AAV strands is detectable and is strongly stimulated upon coinfection with Ad.
IMPORTANCE Next-generation sequencing (NGS) allows unbiased genome-wide analyses of transcription profiles, used here for an in-depth analysis of the AAV2 transcriptome during latency and productive infection. RNA-Seq analysis led to the discovery of novel AAV transcripts and splice variants, including a derived, novel 18-kDa Rep/VP fusion protein. Unexpectedly, transcription from the AAV minus strand was discovered, indicative of divergent transcription from the p5 promoter. This finding opens the door for novel concepts of the switch between AAV latency and productive replication. In the absence of a suitable animal model to study AAV in vivo, combined in cellulae and in silico studies will help to forward the understanding of the unique, bipartite AAV life cycle.


2018 ◽  
Vol 36 (6_suppl) ◽  
pp. 412-412 ◽  
Author(s):  
Thenappan Chandrasekar ◽  
Alexandre Zlotta ◽  
Jess Shen ◽  
Aidan Noon ◽  
Haiyan Jiang ◽  
...  

412 Background: NMIBC has a highly variable clinical behavior not adequately predicted by histological grade or clinical parameters. Some are indolent; others quickly progress to MIBC. Discrepancies between phenotype and genotype are compounded further by interobserver variability in pathological grading. There is an unmet need to improve the prediction of NMIBC. Methods: Whole transcriptomic analysis of 178 bladder tumors (158 NMIBC, 20 MIBC/metastatic) was performed from FFPE tissue incorporating messenger RNA expression, splice variants, gene fusion, mutation detection and immune checkpoint inhibitor cascades. CTLA, PD-1, LAG3, TIM3, TIGIT and B7 were compiled as an index including all major cascade genes. Data were integrated and tested for correlations with pathological grading and clinical outcomes. Conventional pathological grading for WHO 1973 (Grade 1, 2, 3) and 2004 (LG vs HG) classifications was reviewed by 3 expert uro-pathologists. Kappa statistic for interobserver variability was calculated. For validation we used an independent RNA-seq dataset (n = 209, Hedegaard et al. 2016 Cancer Cell). Results: Unsupervised clustering of RNA-Seq data distinguished 3 molecular subtypes of NMIBC: Molecular Grade Related Index (MGRI) 1, MGRI2, MGRI3. MGRI1 comprised almost exclusively LG tumors. MGRI3 clustered with HG MIBC. Kappa for interobserver variability of expert pathologists was 0.40 and 0.78 in 1973 and 2004 WHO classification, respectively. FGFR3 mutations, FGFR3::TACC3 fusion events and MGRI1 genes were associated with components of xenobiotic metabolism (p = 2.51 × 10⁻⁹) signalling systems, in particular, GTPase regulation (p = 0.002), respiratory cycle genes (p = 0.004), HOX cluster (p = 0.005). MGRI independently predicted progression to MIBC (n = 138, HR = 2.96, 95% CI = 1.70-5.13, p = 1.20 × 10⁻⁴). 5-year PFS in a combined data set (n = 347) differed significantly for MGRI1 (100%) vs MGRI2 (92.2%) vs MGRI3 (73.5%, p = 1.99 × 10⁻⁵, Gray’s test). PD-1 ICC independently predicted progression (OR = 2.85, p < 0.05). Conclusions: RNA-seq delineates 3 molecular classes of NMIBC with different risks of progression to MIBC compared to conventional histologic grading.
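The kappa statistic mentioned in the abstract above measures agreement beyond chance. The abstract does not say which variant was used for the three pathologists, so the following is only a minimal sketch of Cohen's kappa for two raters, with invented toy grades (`rater_a`, `rater_b`) and a hypothetical function name; it is not the authors' analysis.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters grading the same items (illustrative only)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must grade the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same grade by both raters.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal grade frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_chance = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical grade throughout
    return (p_observed - p_chance) / (1.0 - p_chance)

# Toy data (assumed, not from the study): LG = low grade, HG = high grade.
rater_a = ["LG", "LG", "LG", "LG", "LG", "HG", "HG", "HG", "HG", "HG"]
rater_b = ["LG", "LG", "LG", "LG", "HG", "HG", "HG", "HG", "HG", "LG"]
kappa = cohen_kappa(rater_a, rater_b)
print(round(kappa, 2))
```

With more than two raters, a multi-rater generalisation such as Fleiss' kappa would be the usual choice; the two-rater form is shown here only because it is the simplest to follow.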


2016 ◽  
Author(s):  
Shruti Kane ◽  
Himanshu Garg ◽  
Neeraja M. Krishnan ◽  
Aditya Singh ◽  
Binay Panda

Abstract RNA sequencing (RNA-seq) is a powerful technology for identification of novel transcripts (coding, non-coding and splice variants), understanding of transcript structures and estimation of gene and/or allelic expression. There are specific challenges that biologists face in determining the number of replicates to use, the total number of sequencing reads to generate for detecting marginally differentially expressed transcripts and the number of lanes in a sequencing flow cell to use for the production of the right amount of information. Although past studies attempted to answer some of these questions, there is a lack of accessible and biologist-friendly mobile applications to answer them. Keeping this in mind, we have developed RNAtor, a mobile application for Android platforms, to aid biologists in correctly designing their RNA-seq experiments. The recommendations from RNAtor are based on simulations and real data. Availability and Implementation: The Android version of RNAtor is available on Google Play Store and the code from GitHub (https://github.com/binaypanda/RNAtor).


Nucleus ◽  
2018 ◽  
Vol 9 (1) ◽  
pp. 410-430 ◽  
Author(s):  
Charlotte Capitanchik ◽  
Charles R. Dixon ◽  
Selene K. Swanson ◽  
Laurence Florens ◽  
Alastair R. W. Kerr ◽  
...  

2019 ◽  
Vol 47 (14) ◽  
pp. 7262-7275 ◽  
Author(s):  
Fahmi W Nazarie ◽  
Barbara Shih ◽  
Tim Angus ◽  
Mark W Barnett ◽  
Sz-Hau Chen ◽  
...  

Abstract RNA-Seq is a powerful transcriptome profiling technology enabling transcript discovery and quantification. Whilst most commonly used for gene-level quantification, the data can be used for the analysis of transcript isoforms. However, when the underlying transcript assemblies are complex, current visualization approaches can be limiting, with splicing events a challenge to interpret. Here, we report on the development of a graph-based visualization method as a complementary approach to understanding transcript diversity from short-read RNA-Seq data. Following the mapping of reads to a reference genome, a read-to-read comparison is performed on all reads mapping to a given gene, producing a weighted similarity matrix between reads. This is used to produce an RNA assembly graph, where nodes represent reads and edges represent similarity scores between them. The resulting graphs are visualized in 3D space to better appreciate their sometimes large and complex topology, with other information being overlaid on to nodes, e.g. transcript models. Here we demonstrate the utility of this approach, including the unusual structure of these graphs and how they can be used to identify issues in assembly, repetitive sequences within transcripts and splice variants. We believe this approach has the potential to significantly improve our understanding of transcript complexity.
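The read-to-read comparison and graph construction described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: reads are reduced to mapped `(start, end)` intervals, similarity is a stand-in Jaccard overlap of those intervals rather than a sequence comparison, and the names `interval_jaccard`, `build_read_graph`, and the `min_similarity` threshold are hypothetical.

```python
from itertools import combinations

def interval_jaccard(a, b):
    """Jaccard overlap of two half-open intervals (start, end); a stand-in similarity."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union else 0.0

def build_read_graph(reads, min_similarity=0.5):
    """Nodes are read IDs; a weighted edge joins reads whose similarity passes the threshold."""
    graph = {read_id: {} for read_id in reads}
    for (id1, iv1), (id2, iv2) in combinations(reads.items(), 2):
        weight = interval_jaccard(iv1, iv2)
        if weight >= min_similarity:
            graph[id1][id2] = weight
            graph[id2][id1] = weight
    return graph

def connected_components(graph):
    """Group reads into connected components with an iterative depth-first search."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(n for n in graph[node] if n not in seen)
        components.append(component)
    return components

# Toy reads mapped to one gene (coordinates are invented for illustration).
reads = {"r1": (0, 100), "r2": (20, 120), "r3": (40, 140),
         "r4": (500, 600), "r5": (510, 610)}
graph = build_read_graph(reads)
components = connected_components(graph)
print(components)
```

In this toy input, r1 and r3 are not directly linked but end up in the same component through r2, which is the kind of chained structure a graph layout makes visible. The all-pairs comparison is quadratic in the number of reads, so a real data set would need indexing or filtering before this step.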


Genomics ◽  
2013 ◽  
Vol 101 (1) ◽  
pp. 57-63 ◽  
Author(s):  
Amrutlal K. Patel ◽  
Vaibhav D. Bhatt ◽  
Ajai K. Tripathi ◽  
Manisha R. Sajnani ◽  
Subhash J. Jakhesara ◽  
...  

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Thu Thi Minh Vo ◽  
Tuan Viet Nguyen ◽  
Gianluca Amoroso ◽  
Tomer Ventura ◽  
Abigail Elizur

Abstract Background: The flesh pigmentation of farmed Atlantic salmon is formed by accumulation of carotenoids derived from commercial diets. In the salmon gastrointestinal system, the hindgut is considered critical in the processes of carotenoid uptake and metabolism. In Tasmania, flesh color depletion can noticeably affect farmed Atlantic salmon at different levels of severity following extremely hot summers. In this study, RNA sequencing (RNA-Seq) was performed to investigate the reduction in flesh pigmentation. Library preparation is a key step that significantly impacts the effectiveness of RNA-Seq experiments. Besides the commonly used whole transcript RNA-Seq method, the 3’ mRNA-Seq method is being applied widely, owing to its reduced cost, enabling more repeats to be sequenced at the expense of lower resolution. Therefore, the output of the Illumina TruSeq kit (whole transcript RNA-Seq) and the Lexogen QuantSeq kit (3’ mRNA-Seq) was analyzed to identify genes in the Atlantic salmon hindgut that are differentially expressed (DEGs) between two flesh color phenotypes. Results: In both methods, DEGs between the two color phenotypes were associated with metal ion transport, oxidation-reduction processes, and immune responses. We also found DEGs related to lipid metabolism in the QuantSeq method. In the TruSeq method, a missense mutation was detected in DEGs in different flesh color traits. The number of DEGs found in the TruSeq libraries was much higher than in the QuantSeq libraries; however, the trend of DEGs in both library methods was similar and validated by qPCR. Conclusions: Flesh coloration in Atlantic salmon is related to lipid metabolism in which apolipoproteins, serum albumin and fatty acid-binding protein genes are hypothesized to be linked to the absorption, transport and deposition of carotenoids. Our findings suggest that Grp could inhibit the feeding behavior of low color-banded fish, resulting in the dietary carotenoid shortage. Several SNPs in genes involved in carotenoid-binding cholesterol and oxidative stress were detected in both flesh color phenotypes. Regarding the choice of the library preparation method, the selection criteria depend on the research design and purpose. The 3’ mRNA-Seq method is ideal for targeted identification of highly expressed genes, while the whole RNA-Seq method is recommended for identification of unknown genes, enabling the identification of splice variants and trait-associated SNPs, as we have found for duox2 and duoxa1.


F1000Research ◽  
2017 ◽  
Vol 6 ◽  
pp. 997
Author(s):  
Shruti Kane ◽  
Himanshu Garg ◽  
Neeraja M. Krishnan ◽  
Aditya Singh ◽  
Binay Panda

RNA sequencing (RNA-seq) is a powerful technology that allows one to assess the RNA levels in a sample. Analysis of these levels can help in identifying novel transcripts (coding, non-coding and splice variants), understanding transcript structures, and estimating gene/allele expression. Biologists face specific challenges while designing RNA-seq experiments. The nature of these challenges lies in determining the total number of sequenced reads and technical replicates required for detecting marginally differentially expressed transcripts. Despite previous attempts to address these challenges, easily-accessible and biologist-friendly mobile applications do not exist. Thus, we developed RNAtor, a mobile application for Android platforms, to aid biologists in correctly designing their RNA-seq experiments. The recommendations from RNAtor are based on simulations and real data.


2020 ◽  
Author(s):  
Marek Cmero ◽  
Breon Schmidt ◽  
Ian J. Majewski ◽  
Paul G. Ekert ◽  
Alicia Oshlack ◽  
...  

Abstract Genomic rearrangements can modify gene function by altering transcript sequences, and have been shown to be drivers in both cancer and rare diseases. Although there are now many methods to detect structural variants from Whole Genome Sequencing (WGS), RNA sequencing (RNA-seq) remains under-utilised as a technology for the detection of gene-altering structural variants. Calling fusion genes from RNA-seq data is well established, but other transcriptional variants such as fusions with novel sequence, tandem duplications, large insertions and deletions, and novel splicing are difficult to detect using existing approaches. To identify all types of variants in transcriptomes, we developed MINTIE, an integrated pipeline for RNA-seq data. We take a reference-free approach, which combines de novo assembly of transcripts with differential expression analysis, to identify up-regulated novel variants in a case sample. We validated MINTIE on simulated and real data sets and compared it with eight other approaches for finding novel transcriptional variants. We found MINTIE was able to detect all defined variant classes at high rates (>70%) while no other method was able to achieve this. We applied MINTIE to RNA-seq data from a cohort of acute lymphoblastic leukemia (ALL) patient samples and identified several novel clinically relevant variants, including an unpartnered recurrent fusion involving the tumour suppressor gene RB1, and variants in ALL-associated genes: tandem duplications in IKZF1 and PAX5, and novel splicing in ETV6. We further demonstrate the utility of MINTIE to identify rare disease variants using RNA-seq, including the discovery of an inter-chromosomal translocation in the DMD gene in a patient with muscular dystrophy. We posit that MINTIE will be able to identify new disease variants across a range of cancers and other disease types.

