reference transcript
Recently Published Documents


2021 ◽  
Vol 6 (1) ◽  
Author(s):  
R. Koster ◽  
R. D. Brandão ◽  
D. Tserpelis ◽  
C. E. P. van Roozendaal ◽  
C. N. van Oosterhoud ◽  
...  

Abstract Neurofibromatosis type 1 (NF1) is caused by loss-of-function variants in the NF1 gene. Approximately 10% of these variants affect RNA splicing and are either missed by conventional DNA diagnostics or are misinterpreted by in silico splicing predictions. Therefore, a targeted RNAseq-based approach was designed to detect pathogenic RNA splicing and associated pathogenic DNA variants. For this method RNA was extracted from lymphocytes, followed by targeted RNAseq. Next, an in-house developed tool (QURNAs) was used to calculate the enrichment score (ERS) for each splicing event. This method was thoroughly tested using two different patient cohorts with known pathogenic splice-variants in NF1. In both cohorts all 56 normal reference transcript exon splice junctions, 24 previously described and 45 novel non-reference splicing events were detected. Additionally, all expected pathogenic splice-variants were detected. Eleven patients with NF1 symptoms were subsequently tested, three of whom have a known NF1 DNA variant with a putative effect on RNA splicing. This effect could be confirmed for all three. The other eight patients were previously without any molecular confirmation of their NF1-diagnosis. A deep-intronic pathogenic splice variant could now be identified for two of them (25%). These results suggest that targeted RNAseq can be successfully used to detect pathogenic RNA splicing variants in NF1.


2021 ◽  
Author(s):  
Wenbin Guo ◽  
Max Coulter ◽  
Robbie Waugh ◽  
Runxuan Zhang

High quality transcriptome assembly using short reads from RNA-seq data still heavily relies upon reference-based approaches, of which the primary step is to align RNA-seq reads to a single reference genome of haploid sequence. However, it is increasingly apparent that while different genotypes within a species share core genes, they also contain variable numbers of specific genes that are only present in a subset of individuals. Using a common reference may thus lead to a loss of genotype-specific information in the assembled transcript dataset and the generation of erroneous, incomplete or misleading transcriptomics analysis results. With the recent development of pan-genome information in many species, it is important that we understand the limitations of single genotype references for transcriptomics analysis. In this study, we quantitatively evaluated the advantages of using genotype-specific reference genomes for transcriptome assembly and analysis using cultivated barley as a model. We mapped barley cultivar Barke RNA-seq reads to the Barke genome and to the cultivar Morex genome (common barley genome reference) to construct a genotype-specific Reference Transcript Dataset (sRTD) and a common Reference Transcript Dataset (cRTD), respectively. We compared the two RTDs according to their transcript diversity, transcript sequence and structure similarity and the accuracy they provided for transcript quantification and differential expression analysis. Our evaluation shows that the sRTD has a significantly higher diversity of transcripts and alternative splicing events. Despite using a high-quality reference genome for assembly of the cRTD, we miss ca. 40% of the transcripts present in the sRTD, and the cRTD contains only ca. 70% true assemblies. We found that the sRTD is more accurate for transcript quantification as well as differential expression and differential alternative splicing analysis. However, gene level quantification and comparative expression analysis are less affected by the source RTD, which indicates that analysing transcriptomic data at the gene level may be a reasonable compromise when a high-quality genotype-specific reference is not available.


2021 ◽  
Author(s):  
Max Coulter ◽  
Juan Carlos Entizne ◽  
Wenbin Guo ◽  
Micha Bayer ◽  
Ronja Wonneberger ◽  
...  

Accurate characterization of splice junctions as well as transcription start and end sites in reference transcriptomes allows precise quantification of transcripts from RNA-seq data and enables detailed investigations of transcriptional and post-transcriptional regulation. Using novel computational methods and a combination of PacBio Iso-seq and Illumina short read sequences from 20 diverse tissues and conditions, we generated a comprehensive and highly resolved barley reference transcript dataset (RTD) from the European 2-row spring barley cultivar Barke (BaRTv2.18). Stringent and thorough filtering was carried out to maintain the quality and accuracy of the splice junctions and transcript start and end sites. BaRTv2.18 shows increased transcript diversity and completeness compared to an earlier version, BaRTv1.0. The accuracy of transcript level quantification, splice junctions and transcript start and end sites has been validated extensively using parallel technologies and analysis, including high resolution RT-PCR and 5′ RACE. BaRTv2.18 contains 39,434 genes and 148,260 transcripts, representing the most comprehensive and resolved reference transcriptome in barley to date. It provides an important and high-quality resource for advanced transcriptomic analyses, including both transcriptional and post-transcriptional regulation, with exceptional resolution and precision.


2021 ◽  
Vol 12 ◽  
Author(s):  
Michelle M. Halstead ◽  
Alma Islas-Trejo ◽  
Daniel E. Goszczynski ◽  
Juan F. Medrano ◽  
Huaijun Zhou ◽  
...  

A comprehensive annotation of transcript isoforms in domesticated species is lacking. Especially considering that transcriptome complexity and splicing patterns are not well-conserved between species, this presents a substantial obstacle to genomic selection programs that seek to improve production, disease resistance, and reproduction. Recent advances in long-read sequencing technology have made it possible to directly extrapolate the structure of full-length transcripts without the need for transcript reconstruction. In this study, we demonstrate the power of long-read sequencing for transcriptome annotation by coupling Oxford Nanopore Technology (ONT) with large-scale multiplexing of 93 samples, comprising 32 tissues collected from adult male and female Hereford cattle. More than 30 million uniquely mapping full-length reads were obtained from a single ONT flow cell, and used to identify and characterize the expression dynamics of 99,044 transcript isoforms at 31,824 loci. Of these predicted transcripts, 21% exactly matched a reference transcript, and 61% were novel isoforms of reference genes, substantially increasing the ratio of transcript variants per gene, and suggesting that the complexity of the bovine transcriptome is comparable to that in humans. Over 7,000 transcript isoforms were extremely tissue-specific, and 61% of these were attributed to testis, which exhibited the most complex transcriptome of all interrogated tissues. Despite profiling over 30 tissues, transcription was only detected at about 60% of reference loci. Consequently, additional studies will be necessary to continue characterizing the bovine transcriptome in additional cell types, developmental stages, and physiological conditions. However, by here demonstrating the power of ONT sequencing coupled with large-scale multiplexing, the task of exhaustively annotating the bovine transcriptome – or any mammalian transcriptome – appears significantly more feasible.
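The transcripts-per-gene ratio mentioned in this abstract can be checked with simple arithmetic from the reported totals. The sketch below is only illustrative: the function name `isoforms_per_locus` is hypothetical, and treating each locus as one gene is an assumption, not something the abstract states.

```python
def isoforms_per_locus(n_isoforms: int, n_loci: int) -> float:
    """Average number of transcript isoforms per locus."""
    if n_loci <= 0:
        raise ValueError("n_loci must be positive")
    return n_isoforms / n_loci


# Totals reported in the abstract: 99,044 isoforms at 31,824 loci.
# Assumption: one locus is counted as one gene.
ratio = isoforms_per_locus(99_044, 31_824)
print(f"{ratio:.2f} isoforms per locus")
```

This is an average over all detected loci, so it says nothing about how unevenly isoforms are distributed across tissues such as testis.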


2021 ◽  
Author(s):  
R. Koster ◽  
R.D. Brandão ◽  
D. Tserpelis ◽  
C.E.P. van Roozendaal ◽  
C.N. van Oosterhoud ◽  
...  

Abstract Purpose Neurofibromatosis type 1 (NF1) is caused by loss-of-function variants in the NF1 gene. Approximately 10% of these variants affect RNA splicing and are either missed by conventional DNA diagnostics or are misinterpreted by in silico splicing predictions. A targeted RNAseq-based approach was designed to detect pathogenic RNA splicing and associated pathogenic DNA variants. Methods RNA was extracted from lymphocytes, followed by targeted NF1 RNAseq. An in-house developed tool (QURNAS) was used to calculate the enrichment score (ERS) for each splicing event. Results This method was thoroughly tested using two different patient cohorts with known pathogenic splice-variants. In both cohorts all 56 normal reference transcript exon splice junctions, 24 previously described and 45 novel non-reference splicing events were detected. Additionally, all expected pathogenic splice-variants were detected. Eleven patients with NF1 symptoms were subsequently tested, three of whom have a known NF1 DNA variant with a putative effect on RNA splicing. This effect could be confirmed for all three. The other eight patients were previously without any molecular confirmation of their NF1-diagnosis. A deep-intronic pathogenic splice variant could now be identified for two of them (25%). Conclusion Targeted NF1 RNAseq can be successfully used to detect pathogenic RNA splicing variants, complementary to DNA-based diagnostics.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Linda Milne ◽  
Micha Bayer ◽  
Paulo Rapazote-Flores ◽  
Claus-Dieter Mayer ◽  
Robbie Waugh ◽  
...  

Abstract A high-quality barley gene reference transcript dataset (BaRTv1.0) was used to quantify gene and transcript abundances from 22 RNA-seq experiments, covering 843 separate samples. Using the abundance data we developed a Barley Expression Database (EORNA*) to underpin a visualisation tool that displays comparative gene and transcript abundance data on demand as transcripts per million (TPM) across all samples and all the genes. EORNA provides gene and transcript models for all of the transcripts contained in BaRTv1.0, and these can be conveniently identified through either BaRT or HORVU gene names, or by direct BLAST of query sequences. Browsing the quantification data reveals cultivar-, tissue- and condition-specific gene expression and shows changes in the proportions of individual transcripts that have arisen via alternative splicing. TPM values can be easily extracted to allow users to determine the statistical significance of observed transcript abundance variation among samples or perform meta-analyses on multiple RNA-seq experiments. * Eòrna is the Scottish Gaelic word for Barley.
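For readers unfamiliar with the unit, transcripts per million (TPM) is conventionally obtained by dividing each transcript's read count by its length and rescaling those rates so they sum to one million within a sample. The sketch below shows that standard calculation only; it is not the pipeline used to build the database, and the function name `tpm`, the counts, and the lengths are all made-up illustrations.

```python
def tpm(counts, lengths_bp):
    """Convert per-transcript read counts to transcripts per million.

    counts     -- reads assigned to each transcript in one sample
    lengths_bp -- (effective) length of each transcript in base pairs
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    if any(length <= 0 for length in lengths_bp):
        raise ValueError("transcript lengths must be positive")

    # Length-normalised read rate for each transcript.
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)

    # Rescale so the values sum to one million within the sample.
    return [r / total * 1e6 for r in rates]


# Hypothetical sample with three transcripts.
values = tpm([100, 200, 300], [1000, 2000, 1000])
print(values)
```

Because the values always sum to one million per sample, TPM is convenient for comparing the relative proportions of transcripts of the same gene, which is how the abstract describes its use for alternative splicing.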


2020 ◽  
Author(s):  
Linda Milne ◽  
Micha Bayer ◽  
Paulo Rapazote-Flores ◽  
Claus-Dieter Mayer ◽  
Robbie Waugh ◽  
...  

Abstract A high-quality barley gene reference transcript dataset (BaRTv1.0) was used to quantify gene and transcript abundances from 22 RNA-seq experiments, covering 843 separate samples. Using the abundance data we developed a Barley Expression Database (EoRNA* – Expression of RNA) to underpin a visualisation tool that displays comparative gene and transcript abundance data on demand as transcripts per million (TPM) across all samples and all the genes. EoRNA provides gene and transcript models for all of the transcripts contained in BaRTv1.0, and these can be conveniently identified through either BaRT or HORVU gene names, or by direct BLAST of query sequences. Browsing the quantification data reveals cultivar-, tissue- and condition-specific gene expression and shows changes in the proportions of individual transcripts that have arisen via alternative splicing. TPM values can be easily extracted to allow users to determine the statistical significance of observed transcript abundance variation among samples or perform meta-analyses on multiple RNA-seq experiments. * Eòrna is the Scottish Gaelic word for Barley.


Author(s):  
Bidossessi Wilfried Hounkpe ◽  
Francine Chenou ◽  
Franciele de Lima ◽  
Erich Vinicius De Paula

Abstract Housekeeping (HK) genes are constitutively expressed genes that are required for the maintenance of basic cellular functions. Despite their importance in the calibration of gene expression, as well as the understanding of many genomic and evolutionary features, important discrepancies have been observed in studies that previously identified these genes. Here, we present the Housekeeping and Reference Transcript Atlas (HRT Atlas v1.0, www.housekeeping.unicamp.br), a web-based database which addresses some of the previously observed limitations in the identification of these genes, and offers a more accurate database of human and mouse HK genes and transcripts. The database was generated by mining massive human and mouse RNA-seq data sets, including 11 281 and 507 high-quality RNA-seq samples from 52 human non-disease tissues/cells and 14 healthy tissues/cells of C57BL/6 wild type mouse, respectively. Users can visualize the expression and download lists of 2158 human HK transcripts from 2176 HK genes and 3024 mouse HK transcripts from 3277 mouse HK genes. HRT Atlas also offers the most stable and suitable tissue-selective candidate reference transcripts for normalization of qPCR experiments. Specific primers and predicted modifiers of gene expression for some of these HK transcripts are also proposed. HRT Atlas has also been integrated with a regulatory elements resource from the Epiregio server.


BMC Genomics ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Paulo Rapazote-Flores ◽  
Micha Bayer ◽  
Linda Milne ◽  
Claus-Dieter Mayer ◽  
John Fuller ◽  
...  

Abstract Background The time required to analyse RNA-seq data varies considerably, due to discrete steps for computational assembly, quantification of gene expression and splicing analysis. Recent fast non-alignment tools such as Kallisto and Salmon overcome these problems, but these tools require a high quality, comprehensive reference transcripts dataset (RTD), which is rarely available in plants. Results A high-quality, non-redundant barley gene RTD and database (Barley Reference Transcripts – BaRTv1.0) has been generated. BaRTv1.0 was constructed from a range of tissues, cultivars and abiotic treatments, and transcripts were assembled and aligned to the barley cv. Morex reference genome (Mascher et al. Nature; 544: 427–433, 2017). Full-length cDNAs from the barley variety Haruna nijo (Matsumoto et al. Plant Physiol; 156: 20–28, 2011) determined transcript coverage, and high-resolution RT-PCR validated alternatively spliced (AS) transcripts of 86 genes in five different organs and tissues. These methods were used as benchmarks to select an optimal barley RTD. BaRTv1.0-Quantification of Alternatively Spliced Isoforms (QUASI) was also made to overcome inaccurate quantification due to variation in 5′ and 3′ UTR ends of transcripts. BaRTv1.0-QUASI was used for accurate transcript quantification of RNA-seq data of five barley organs/tissues. This analysis identified 20,972 significant differentially expressed genes, 2791 differentially alternatively spliced genes and 2768 transcripts with differential transcript usage. Conclusion A high confidence barley reference transcript dataset consisting of 60,444 genes with 177,240 transcripts has been generated. Compared to current barley transcripts, BaRTv1.0 transcripts are generally longer, have less fragmentation and improved gene models that are well supported by splice junction reads. Precise transcript quantification using BaRTv1.0 allows routine analysis of gene expression and AS.


2019 ◽  
Author(s):  
Paulo Rapazote-Flores ◽  
Micha Bayer ◽  
Linda Milne ◽  
Claus-Dieter Mayer ◽  
John Fuller ◽  
...  

Abstract Background Time-consuming computational assembly and quantification of gene expression and splicing analysis from RNA-seq data vary considerably. Recent fast non-alignment tools such as Kallisto and Salmon overcome these problems, but these tools require a high quality, comprehensive reference transcripts dataset (RTD), which is rarely available in plants. Results A high-quality, non-redundant barley gene RTD and database (Barley Reference Transcripts – BaRTv1.0) has been generated. BaRTv1.0 was constructed from a range of tissues, cultivars and abiotic treatments, and transcripts were assembled and aligned to the barley cv. Morex reference genome (Mascher et al., 2017). Full-length cDNAs from the barley variety Haruna nijo (Matsumoto et al., 2011) determined transcript coverage, and high-resolution RT-PCR validated alternatively spliced (AS) transcripts of 86 genes in five different organs and tissues. These methods were used as benchmarks to select an optimal barley RTD. BaRTv1.0-Quantification of Alternatively Spliced Isoforms (QUASI) was also made to overcome inaccurate quantification due to variation in 5′ and 3′ UTR ends of transcripts. BaRTv1.0-QUASI was used for accurate transcript quantification of RNA-seq data of five barley organs/tissues. This analysis identified 20,972 significant differentially expressed genes, 2,791 differentially alternatively spliced genes and 2,768 transcripts with differential transcript usage. Conclusion A high confidence barley reference transcript dataset consisting of 60,444 genes with 177,240 transcripts has been generated. Compared to current barley transcripts, BaRTv1.0 transcripts are generally longer, have less fragmentation and improved gene models that are well supported by splice junction reads. Precise transcript quantification using BaRTv1.0 allows routine analysis of gene expression and AS.

