Aptardi predicts polyadenylation sites in sample-specific transcriptomes using high-throughput RNA sequencing and DNA sequence

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Ryan Lusk ◽  
Evan Stene ◽  
Farnoush Banaei-Kashani ◽  
Boris Tabakoff ◽  
Katerina Kechris ◽  
...  

Abstract Annotation of polyadenylation sites from short-read RNA sequencing alone is a challenging computational task. Other algorithms rooted in DNA sequence predict potential polyadenylation sites; however, in vivo expression of a particular site varies based on a myriad of conditions. Here, we introduce aptardi (alternative polyadenylation transcriptome analysis from RNA-Seq data and DNA sequence information), which leverages both DNA sequence and RNA sequencing in a machine learning paradigm to predict expressed polyadenylation sites. Specifically, as input aptardi takes DNA nucleotide sequence, genome-aligned RNA-Seq data, and an initial transcriptome. The program evaluates these initial transcripts to identify expressed polyadenylation sites in the biological sample and refines transcript 3′-ends accordingly. The average precision of the aptardi model is twice that of a standard transcriptome assembler. In particular, the recall of the aptardi model (the proportion of true polyadenylation sites detected by the algorithm) is improved by over three-fold. Also, the model, trained using the Human Brain Reference RNA commercial standard, performs well when applied to RNA-sequencing samples from different tissues and different mammalian species. Finally, aptardi’s input is simple to compile and its output is easily amenable to downstream analyses such as quantitation and differential expression.
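The abstract above defines recall as the proportion of true polyadenylation sites detected. As a minimal sketch of how precision and recall can be computed for a set of predicted sites, the Python below matches predictions to reference sites within a distance window. The function name, the 50-base tolerance, and the matching rule are assumptions made for illustration; they are not taken from aptardi or its evaluation protocol.

```python
from bisect import bisect_left


def precision_recall(predicted, truth, tolerance=50):
    """Compare predicted and reference site positions on one strand of one chromosome.

    Assumption for illustration: a prediction is correct if any reference
    site lies within `tolerance` bases of it, and a reference site is
    detected if any prediction lies within `tolerance` bases of it.
    """
    pred_sorted = sorted(predicted)
    truth_sorted = sorted(truth)

    def has_neighbor(pos, sorted_sites):
        # First site at or after the left edge of the window around pos.
        i = bisect_left(sorted_sites, pos - tolerance)
        return i < len(sorted_sites) and sorted_sites[i] <= pos + tolerance

    correct_predictions = sum(has_neighbor(p, truth_sorted) for p in pred_sorted)
    detected_truth = sum(has_neighbor(t, pred_sorted) for t in truth_sorted)

    precision = correct_predictions / len(pred_sorted) if pred_sorted else 0.0
    recall = detected_truth / len(truth_sorted) if truth_sorted else 0.0
    return precision, recall


# Toy positions, not real data.
p, r = precision_recall([100, 500, 900], [120, 880, 2000, 3000])
```

In this toy call, two of the three predictions fall near a reference site and two of the four reference sites are detected, so precision is 2/3 and recall is 1/2.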

2021 ◽  
Author(s):  
Venkateswara R. Sripathi ◽  
Varsha C. Anche ◽  
Zachary B. Gossett ◽  
Lloyd T. Walker

RNA sequencing (RNA-Seq) is the leading, routine, high-throughput, and cost-effective next-generation sequencing (NGS) approach for mapping and quantifying transcriptomes and determining transcriptional structure. The transcriptome is the complete collection of transcripts found in a cell, tissue, or organism at a given time point or under a specific developmental, environmental, or physiological condition. The emergence and evolution of RNA-Seq chemistries have changed the landscape and the pace of transcriptome research in the life sciences over the past decade. This chapter introduces RNA-Seq and surveys its recent food and agriculture applications, ranging from differential gene expression, variant calling and detection, allele-specific expression, alternative splicing, alternative polyadenylation site usage, microRNA profiling, circular RNAs, single-cell RNA-Seq, and metatranscriptomics to systems biology. A few popular RNA-Seq databases and analysis tools are also presented for each application. We have begun to witness the broader impacts of RNA-Seq in addressing complex biological questions in food and agriculture.


2020 ◽  
Author(s):  
Taylor W Bailey ◽  
Andrea Santos ◽  
Naila Cannes de Nascimento ◽  
M. Preeti Sivasankar ◽  
Abigail Cox

Abstract Background Voice disorders are a worldwide problem impacting human health, particularly for occupational voice users. Avoidance of surface dehydration is commonly prescribed as a protective factor against the development of dysphonia. The available literature inconclusively supports this practice and a biological mechanism for how surface dehydration of the laryngeal tissue affects voice has not been described. In this study, we used an in vivo male New Zealand white rabbit model to elucidate biological changes based on gene expression within the vocal folds from surface dehydration. Surface dehydration was induced by exposure to low humidity air (18.6% ± 4.3%) for 8 hours. Exposure to moderate humidity (43.0% ± 4.3%) served as the control condition. Illumina-based RNA sequencing was performed and used for transcriptome analysis with validation by RT-qPCR. Results There were 103 genes identified through Cuffdiff with 64 genes meeting significance by both false discovery rate and fold change. Functional annotation enrichment and predicted protein interaction mapping showed enrichment of various loci, including cellular stress and inflammatory response, ciliary function, and keratinocyte development. Eight genes were selected for RT-qPCR validation. Matrix metalloproteinase 12 (MMP12) and macrophage cationic peptide 1 (MCP1) were significantly upregulated and an epithelial chloride channel protein (ECCP) was significantly downregulated after surface dehydration by RNA-Seq and RT-qPCR. Suprabasin (SPBN) and zinc activated cationic channel (ZACN) were marginally, but non-significantly, down- and upregulated by RT-qPCR, respectively. Conclusions The data together support the notion that surface dehydration induces physiological changes in the vocal folds and justify targeted analysis to further explore the underlying biology of compensatory fluid/ion flux and inflammatory mediators in response to airway surface dehydration.
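The Results above select genes that meet significance by both false discovery rate and fold change. A minimal sketch of such a two-criterion filter is shown below. The cutoffs (q-value at most 0.05, absolute log2 fold change at least 1), the record layout, and the gene labels are hypothetical; the abstract does not state the thresholds used, and this is not Cuffdiff's own output format.

```python
def significant_genes(results, max_q=0.05, min_abs_log2fc=1.0):
    """Return names of genes passing both an FDR cutoff and a fold-change cutoff.

    `results` is an iterable of dicts with the hypothetical keys
    'gene', 'q_value', and 'log2_fold_change'. Both thresholds are
    illustrative assumptions, not values from the study.
    """
    return [
        row["gene"]
        for row in results
        if row["q_value"] <= max_q and abs(row["log2_fold_change"]) >= min_abs_log2fc
    ]


# Made-up records for demonstration only.
example = [
    {"gene": "geneA", "q_value": 0.01, "log2_fold_change": -1.8},  # passes both
    {"gene": "geneB", "q_value": 0.01, "log2_fold_change": 0.4},   # fold change too small
    {"gene": "geneC", "q_value": 0.20, "log2_fold_change": 2.5},   # q-value too large
]
selected = significant_genes(example)
```

Requiring both criteria is stricter than either alone, which is consistent with the abstract reporting fewer genes after the combined filter than Cuffdiff identified overall.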


BMC Genomics ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Zsolt Balázs ◽  
Dóra Tombácz ◽  
Zsolt Csabai ◽  
Norbert Moldován ◽  
Michael Snyder ◽  
...  

Abstract Background Alternative polyadenylation is commonly examined using cDNA sequencing, which is known to be affected by template-switching artifacts. However, the effects of such template-switching artifacts on alternative polyadenylation are generally disregarded, while alternative polyadenylation artifacts are attributed to internal priming. Results Here, we analyzed both long-read cDNA sequencing and direct RNA sequencing data of two organisms, generated by different sequencing platforms. We developed a filtering algorithm which takes into consideration that template-switching can be a source of artifactual polyadenylation when filtering out spurious polyadenylation sites. The algorithm outperformed the conventional internal priming filters based on comparison to direct RNA sequencing data. We also showed that the polyadenylation artifacts arise in cDNA sequencing at consecutive stretches of as few as three adenines. There was no substantial difference between the lengths of poly(A) tails at the artifactual and the true transcriptional end sites even though it is expected that internal priming artifacts have shorter poly(A) tails than genuine polyadenylated reads. Conclusions Our findings suggest that template switching plays an important role in the generation of spurious polyadenylation and support the need for more rigorous filtering of artifactual polyadenylation sites in cDNA data, or that alternative polyadenylation should be annotated using native RNA sequencing.
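The abstract reports that polyadenylation artifacts in cDNA sequencing arise at consecutive stretches of as few as three adenines. As a rough illustration of that observation only, the sketch below flags putative sites whose flanking genomic sequence contains such a stretch. The function names and input layout are assumptions; this is not the authors' filtering algorithm, which additionally accounts for template switching.

```python
import re


def has_a_stretch(flanking_seq, min_run=3):
    """Return True if the sequence contains at least `min_run` consecutive adenines."""
    pattern = "A{%d,}" % min_run
    return re.search(pattern, flanking_seq.upper()) is not None


def flag_possible_artifacts(sites, min_run=3):
    """Return IDs of putative sites whose flanking sequence has an A-stretch.

    `sites` is an iterable of (site_id, flanking_sequence) pairs; how the
    flanking window is chosen is left to the caller (an assumption here).
    """
    return [site_id for site_id, seq in sites if has_a_stretch(seq, min_run)]


# Toy sequences for demonstration only.
flagged = flag_possible_artifacts([("site1", "ccaaat"), ("site2", "ACGTAC")])
```

A flag from this check marks a site as a candidate for closer inspection; it does not by itself establish that the site is artifactual.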


2020 ◽  
Vol 8 (Suppl 3) ◽  
pp. A268-A269
Author(s):  
Kartik Sehgal ◽  
Andrew Portell ◽  
Elena Ivanova ◽  
Patrick Lizotte ◽  
Navin Mahadevan ◽  
...  

Background: To understand fundamental mechanisms of immune escape, we leveraged our functional ex vivo platform of murine derived organotypic tumor spheroids (DOTS) [1] to determine if drug-tolerant persister cells analogous to oncogene targeted therapies limit efficacy of programmed death (PD)-1 blockade, and to identify therapeutic vulnerabilities to overcome anti-PD-1 (αPD-1) resistance.

Methods: Murine syngeneic cancer models with well-characterized response to αPD-1 therapy were chosen: MC38 (sensitive) and CT26 (partially resistant). Bulk and single-cell (sc) RNA-sequencing (RNA-seq) were performed on αPD-1 treated DOTS. In vitro culture studies were conducted with or without cytokines (100 ng/ml) or drugs (500 nM). In vivo studies in mice bearing MC38 or CT26 tumors evaluated the combinatorial strategy with PD-1 blockade. We further evaluated our findings in scRNA-seq of an αPD-1 refractory colorectal cancer (CRC) patient tumor [2].

Results: Bulk RNA-seq of αPD-1 treated DOTS revealed a mesenchymal resistant phenotype with upregulated TNF-α/NFκB signaling (figure 1). scRNA-seq further identified a discrete sub-population of immunotherapy persister cells (IPCs). These cells expressed a stem-like phenotype including downregulation of E2F targets indicative of quiescence, suppression of interferon-γ response genes, induction of hybrid epithelial-to-mesenchymal state, and active IL-6 signaling (figure 1). Ly6a/stem cell antigen-1 (Sca-1) and Snai1 were found to be differentially upregulated in IPCs resistant to PD-1 blockade (not shown). Sca-1 positivity was confirmed in pre-existing tumor populations in vitro (figure 2). When enriched via sorting, these cells remained more persistently Sca-1+ at 96 hours in culture of CT26 compared to MC38 cells, related to increased autocrine IL-6 production by CT26 Sca-1+ cells. Indeed, IL-6 supplementation was capable of expanding Sca-1+ cells in culture (figure 2). Sca-1+ cells expressing ovalbumin peptide were refractory to OT-1 T cell mediated killing and failed to upregulate MHC class-1 antigen presentation (H-2Kb) in response to IL-6, in contrast to interferon-γ (not shown). Analysis of RNA-seq data further identified Birc2/3 as potential targets limiting TNF-mediated apoptosis of these cells (not shown). Notably, Birc2/3 antagonism depleted Sca-1+ IPCs in vitro and significantly potentiated the impact of PD-1 blockade in vivo in MC38, and less robustly in CT26 (figure 3). Evaluation in a microsatellite-instability high CRC patient identified a pre-existent IPC subpopulation within the αPD-1 refractory pre-treatment tumor, with high SNAI1 expression compared to CRC samples in TCGA (figure 4).

Abstract 248 Figure 1: Bulk and single-cell (sc) RNA-sequencing (RNA-seq) of MDOTS identifies an anti-PD-1 (αPD-1) resistant subpopulation of persister cells. IgG = isotype control.

Abstract 248 Figure 2: Pre-existent population of stem cell antigen-1 (Sca-1)+ cells expands in response to interleukin-6 (IL-6), as characterized by flow cytometry evaluation in murine syngeneic cancer models at baseline and after purification by fluorescence-activated cell sorting (FACS). H = hours.

Abstract 248 Figure 3: Combination of anti-PD-1 therapy with Birc2/3 antagonism increases tumor responses and improves survival. CR = complete response.

Abstract 248 Figure 4: Single-cell RNA-sequencing (scRNA-seq) of a pre-treatment microsatellite-instability high (MSI-H) colorectal cancer (CRC) patient tumor, refractory to anti-PD-1 (αPD-1) therapy, reveals presence of SNAI1-high immunotherapy persister cells.

Conclusions: High-resolution functional ex vivo profiling identified Sca-1+/Snai1-high stem-like ‘immunotherapy persister cells’ and uncovered their anti-apoptotic dependencies targetable with Birc2/3 antagonism to augment αPD-1 efficacy.

Ethics Approval: This study was approved by the Dana-Farber Animal Care and Use Committee and Novartis Institutional Animal Care and Use Committee. Informed written consent to participate in Dana-Farber/Harvard Cancer Center institutional review board (IRB)-approved research protocols was obtained from the human subject. A copy of the written consent is available for review by the Editor of this journal. The study was conducted per the WMA Declaration of Helsinki and IRB-approved protocols.

References:
1. Jenkins RW, Aref AR, Lizotte PH, Ivanova E, Stinson S, Zhou CW, et al. Ex vivo profiling of PD-1 blockade using organotypic tumor spheroids. Cancer Discov. 2018;8(2):196–215.
2. Gurjao C, Liu D, Hofree M, AlDubayan SH, Wakiro I, Su MJ, et al. Intrinsic resistance to immune checkpoint blockade in a mismatch repair-deficient colorectal cancer. Cancer Immunol Res. 2019;7(8):1230–6.


2020 ◽  
Author(s):  
Naima Ahmed Fahmi ◽  
Jae-Woong Chang ◽  
Heba Nassereddeen ◽  
Khandakar Tanvir Ahmed ◽  
Deliang Fan ◽  
...  

Abstract The eukaryotic genome is capable of producing multiple isoforms from a gene by alternative polyadenylation (APA) during pre-mRNA processing. APA in the 3’-untranslated region (3’-UTR) of mRNA produces transcripts with shorter 3’-UTRs. Often, the 3’-UTR serves as a binding platform for microRNAs and RNA-binding proteins, which affect the fate of the mRNA transcript. Thus, 3’-UTR APA provides a means to regulate gene expression at the post-transcriptional level and is known to promote translation. Current bioinformatics pipelines have limited capability in profiling 3’-UTR APA events due to incomplete annotations and low-resolution analysis: widely available bioinformatics pipelines do not reference actionable polyadenylation (cleavage) sites but simulate 3’-UTR APA using only RNA-seq read coverage, causing false positive identifications. To overcome these limitations, we developed APA-Scan, a robust program that identifies 3’-UTR APA events and visualizes the RNA-seq short-read coverage with gene annotations. APA-Scan utilizes either predicted or experimentally validated actionable polyadenylation signals as a reference for polyadenylation sites and calculates the quantity of long and short 3’-UTR transcripts in the RNA-seq data. The performance of APA-Scan was validated by qPCR. Implementation APA-Scan is implemented in Python. Source code and a comprehensive user’s manual are freely available at https://github.com/compbiolabucf/APA-Scan
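The abstract describes calculating the quantity of long and short 3’-UTR transcripts relative to a reference cleavage site. One simple way to picture this, sketched below, is to compare mean read coverage upstream and downstream of the site. The function name, arguments, and arithmetic are hypothetical and written for a plus-strand gene; this is not APA-Scan's actual calculation, for which the linked repository is the reference.

```python
def utr_isoform_levels(coverage, utr_start, cleavage_site, utr_end):
    """Estimate short and long 3'-UTR isoform levels from per-base coverage.

    Assumptions for illustration (plus-strand gene, 0-based positions):
    - [utr_start, cleavage_site) is covered by both short and long isoforms;
    - [cleavage_site, utr_end) is covered by the long isoform only.
    """
    if not (0 <= utr_start < cleavage_site < utr_end <= len(coverage)):
        raise ValueError("expected utr_start < cleavage_site < utr_end within coverage")

    shared = coverage[utr_start:cleavage_site]
    extended = coverage[cleavage_site:utr_end]
    mean_shared = sum(shared) / len(shared)
    mean_extended = sum(extended) / len(extended)

    long_level = mean_extended
    # Coverage in the shared region beyond the long isoform is attributed to the short one.
    short_level = max(mean_shared - mean_extended, 0.0)
    total = long_level + short_level
    long_fraction = long_level / total if total else 0.0
    return {"short": short_level, "long": long_level, "long_fraction": long_fraction}


# Toy coverage: depth 10 before the cleavage site, depth 4 after it.
levels = utr_isoform_levels([10] * 5 + [4] * 5, utr_start=0, cleavage_site=5, utr_end=10)
```

For the toy coverage, the long isoform level is 4, the short isoform level is 6, and the long fraction is 0.4. Anchoring the comparison to a known cleavage site, rather than inferring the site from coverage alone, reflects the limitation the abstract raises about coverage-only approaches.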


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Ankeeta Shah ◽  
Briana E. Mittleman ◽  
Yoav Gilad ◽  
Yang I. Li

Abstract Background Alternative cleavage and polyadenylation (APA), an RNA processing event, occurs in over 70% of human protein-coding genes. APA results in mRNA transcripts with distinct 3′ ends. Most APA occurs within 3′ UTRs, which harbor regulatory elements that can impact mRNA stability, translation, and localization. Results APA can be profiled using a number of established computational tools that infer polyadenylation sites from standard, short-read RNA-seq datasets. Here, we benchmarked a number of such tools (TAPAS, QAPA, DaPars2, GETUTR, and APATrap) against 3′-Seq, a specialized RNA-seq protocol that enriches for reads at the 3′ ends of genes, and Iso-Seq, a Pacific Biosciences (PacBio) single-molecule full-length RNA-seq method, in their ability to identify polyadenylation sites and quantify polyadenylation site usage. We demonstrate that 3′-Seq and Iso-Seq are able to identify and quantify the usage of polyadenylation sites more reliably than computational tools that take short-read RNA-seq as input. However, we find that running one such tool, QAPA, with a set of polyadenylation site annotations derived from small quantities of 3′-Seq or Iso-Seq can reliably quantify variation in APA across conditions, such as across genotypes, as demonstrated by the successful mapping of alternative polyadenylation quantitative trait loci (apaQTL). Conclusions We envisage that our analyses will shed light on the advantages of studying APA with more specialized sequencing protocols, such as 3′-Seq or Iso-Seq, and the limitations of studying APA with short-read RNA-seq. We provide a computational pipeline to aid in the identification of polyadenylation sites and quantification of polyadenylation site usage using Iso-Seq data as input.


2020 ◽  
Vol 36 (12) ◽  
pp. 3907-3909 ◽  
Author(s):  
Ruijia Wang ◽  
Bin Tian

Abstract Summary Most eukaryotic genes produce alternative polyadenylation (APA) isoforms. APA is dynamically regulated under different growth and differentiation conditions. Here, we present a bioinformatics package, named APAlyzer, for examining 3′UTR APA, intronic APA and gene expression changes using RNA-seq data and annotated polyadenylation sites in the PolyA_DB database. Using APAlyzer and data from the GTEx database, we present APA profiles across human tissues. Availability and implementation APAlyzer is freely available at https://bioconductor.org/packages/release/bioc/html/APAlyzer.html as an R/Bioconductor package. Supplementary information Supplementary data are available at Bioinformatics online.


2018 ◽  
Vol 34 (11) ◽  
pp. 1841-1849 ◽  
Author(s):  
Congting Ye ◽  
Yuqi Long ◽  
Guoli Ji ◽  
Qingshun Quinn Li ◽  
Xiaohui Wu

2020 ◽  
Author(s):  
Taylor W Bailey ◽  
Andrea Pires dos Santos ◽  
Naila Cannes do Nascimento ◽  
Shaojun Xie ◽  
Jyothi Thimmapuram ◽  
...  

Abstract Background: Voice disorders are a worldwide problem impacting human health, particularly for occupational voice users. Avoidance of surface dehydration is commonly prescribed as a protective factor against the development of dysphonia. The available literature inconclusively supports this practice and a biological mechanism for how surface dehydration of the laryngeal tissue affects voice has not been described. In this study, we used an in vivo male New Zealand white rabbit model to elucidate biological changes based on gene expression within the vocal folds from surface dehydration. Surface dehydration was induced by exposure to low humidity air (18.6% ± 4.3%) for 8 hours. Exposure to moderate humidity (43.0% ± 4.3%) served as the control condition. Illumina-based RNA sequencing was performed and used for transcriptome analysis with validation by RT-qPCR. Results: There were 103 statistically significant differentially expressed genes identified through Cuffdiff with 61 genes meeting significance by both false discovery rate and fold change. Functional annotation enrichment and predicted protein interaction mapping showed enrichment of various loci, including cellular stress and inflammatory response, ciliary function, and keratinocyte development. Eight genes were selected for RT-qPCR validation. Matrix metalloproteinase 12 (MMP12) and macrophage cationic peptide 1 (MCP1) were significantly upregulated and an epithelial chloride channel protein (ECCP) was significantly downregulated after surface dehydration by RNA-Seq and RT-qPCR. Suprabasin (SPBN) and zinc activated cationic channel (ZACN) were marginally, but non-significantly, down- and upregulated, respectively, as evidenced by RT-qPCR.
Conclusions: The data together support the notion that surface dehydration induces physiological changes in the vocal folds and justify targeted analysis to further explore the underlying biology of compensatory fluid/ion flux and inflammatory mediators in response to airway surface dehydration.

