overlapping transcripts

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Lars Gabriel ◽  
Katharina J. Hoff ◽  
Tomáš Brůna ◽  
Mark Borodovsky ◽  
Mario Stanke

Abstract Background: BRAKER is a suite of automatic pipelines, BRAKER1 and BRAKER2, for the accurate annotation of protein-coding genes in eukaryotic genomes. Each pipeline trains statistical models of protein-coding genes based on provided evidence and then predicts protein-coding genes in genomic sequences using both the extrinsic evidence and the statistical models. For training and prediction, BRAKER1 and BRAKER2 incorporate complementary extrinsic evidence: BRAKER1 uses only RNA-seq data, while BRAKER2 uses only a database of cross-species proteins. The BRAKER suite has so far not been able to reliably exceed the accuracy of BRAKER1 and BRAKER2 when incorporating both types of evidence simultaneously. Currently, for a novel genome project where both RNA-seq and protein data are available, the best option is to run both pipelines independently and to pick the one output that is likely better. As a result, one of the two types of extrinsic evidence remains unexploited. Results: We present TSEBRA, a software tool that selects gene predictions (transcripts) from the sets generated by BRAKER1 and BRAKER2. TSEBRA uses a set of rules to compare scores of overlapping transcripts based on their support by RNA-seq and homologous protein evidence. We show in computational experiments on genomes of 11 species that TSEBRA achieves higher accuracy than either BRAKER1 or BRAKER2 running alone and that TSEBRA compares favorably with the combiner tool EVidenceModeler. Conclusion: TSEBRA is an easy-to-use and fast software tool. It can be used in concert with the BRAKER pipeline to generate a gene prediction set supported by both RNA-seq and homologous protein evidence.
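The idea of choosing among overlapping transcripts by evidence support can be illustrated with a minimal sketch. This is not TSEBRA's actual rule set or scoring scheme, which the abstract does not spell out. The `Transcript` type, the single `support` score, and the greedy selection below are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    """Hypothetical record for one predicted transcript."""
    tx_id: str
    contig: str
    strand: str
    start: int      # 1-based, inclusive
    end: int        # 1-based, inclusive
    support: float  # assumed evidence score in [0, 1], higher is better


def overlaps(a: Transcript, b: Transcript) -> bool:
    """True if both transcripts share contig, strand and at least one base."""
    return (
        a.contig == b.contig
        and a.strand == b.strand
        and a.start <= b.end
        and b.start <= a.end
    )


def select_transcripts(candidates):
    """Greedy illustration: visit transcripts from best to worst support and
    keep one only if it does not overlap an already kept transcript."""
    kept = []
    for tx in sorted(candidates, key=lambda t: (-t.support, t.tx_id)):
        if not any(overlaps(tx, other) for other in kept):
            kept.append(tx)
    return sorted(kept, key=lambda t: (t.contig, t.start))


# Toy prediction sets standing in for the two pipeline outputs.
set_one = [
    Transcript("b1.t1", "chr1", "+", 100, 500, 0.9),
    Transcript("b1.t2", "chr1", "+", 800, 1200, 0.4),
]
set_two = [
    Transcript("b2.t1", "chr1", "+", 300, 700, 0.6),
    Transcript("b2.t2", "chr1", "+", 1000, 1500, 0.8),
    Transcript("b2.t3", "chr1", "-", 100, 500, 0.5),
]
selected = select_transcripts(set_one + set_two)
selected_ids = [tx.tx_id for tx in selected]
```

In this toy input, the better-supported transcript wins each same-strand overlap, and the minus-strand transcript is kept because it does not compete with plus-strand predictions. A real combiner would derive separate scores from intron, start-codon and stop-codon support and apply thresholds rather than a single greedy pass.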


2021 ◽  
Author(s):  
Gábor Torma ◽  
Dóra Tombácz ◽  
Norbert Moldován ◽  
Ádám Fülöp ◽  
István Prazsák ◽  
...  

Abstract In this study, we used two long-read sequencing (LRS) techniques, Sequel from Pacific Biosciences and MinION from Oxford Nanopore Technologies, for the transcriptional characterization of a prototype baculovirus, Autographa californica multiple nucleopolyhedrovirus. LRS is able to read full-length RNA molecules and thereby to distinguish between transcript isoforms, mono- and polycistronic RNAs, and overlapping transcripts. Altogether, we detected 875 transcripts, of which 759 are novel and 116 have been annotated previously. The novel RNA molecules include 41 putative protein-coding transcripts (each containing 5′-truncated in-frame ORFs), 14 monocistronic transcripts, 99 multicistronic RNAs, 101 non-coding RNAs, and 504 length isoforms. We also detected RNA methylation in 12 viral genes and RNA hyper-editing in the longer 5′-UTR transcript isoform of the ORF 19 gene.


2021 ◽  
Author(s):  
Lars Gabriel ◽  
Katharina J Hoff ◽  
Tomas Bruna ◽  
Mark Borodovsky ◽  
Mario Stanke

Background: BRAKER is a suite of automatic pipelines, BRAKER1 and BRAKER2, for the accurate annotation of protein-coding genes in eukaryotic genomes. Each pipeline trains statistical models of protein-coding genes based on provided evidence and then predicts protein-coding genes in genomic sequences using both the extrinsic evidence and the statistical models. For training and prediction, BRAKER1 and BRAKER2 incorporate complementary extrinsic evidence: BRAKER1 uses only RNA-seq data, while BRAKER2 uses only a database of cross-species proteins. The BRAKER suite has so far not been able to reliably exceed the accuracy of BRAKER1 and BRAKER2 when incorporating both types of evidence simultaneously. Currently, for a novel genome project where both RNA-seq and protein data are available, the best option is to run both pipelines independently and to pick the one output that is likely better. As a result, one of the two types of extrinsic evidence remains unexploited. Results: We present TSEBRA, a software tool that selects gene predictions (transcripts) from the sets generated by BRAKER1 and BRAKER2. TSEBRA uses a set of rules to compare scores of overlapping transcripts based on their support by RNA-seq and homologous protein evidence. We show in computational experiments on genomes of 11 species that TSEBRA achieves higher accuracy than either BRAKER1 or BRAKER2 running alone and that TSEBRA compares favorably with the combiner tool EVidenceModeler. Conclusion: TSEBRA is an easy-to-use and fast software tool. It can be used in concert with the BRAKER pipeline to generate a gene prediction set supported by both RNA-seq and homologous protein evidence.


2020 ◽  
Vol 48 (18) ◽  
pp. e104-e104 ◽  
Author(s):  
Jingwen Wang ◽  
Bingnan Li ◽  
Sueli Marques ◽  
Lars M Steinmetz ◽  
Wu Wei ◽  
...  

Abstract Eukaryotic transcriptomes are complex, involving thousands of overlapping transcripts. The interleaved nature of the transcriptomes limits our ability to identify regulatory regions, and in some cases can lead to misinterpretation of gene expression. To improve the understanding of overlapping transcriptomes, we have developed an optimized method, TIF-Seq2, able to sequence simultaneously the 5′ and 3′ ends of individual RNA molecules at single-nucleotide resolution. We investigated the transcriptome of a well-characterized human cell line (K562) and identified thousands of unannotated transcript isoforms. By focusing on transcripts that are challenging to investigate with RNA-Seq, we accurately defined the boundaries of lowly expressed unannotated and read-through transcripts putatively encoding fusion genes. We validated our results by targeted long-read sequencing and standard RNA-Seq for chronic myeloid leukaemia patient samples. Taking advantage of TIF-Seq2, we explored transcription regulation among overlapping units and investigated their crosstalk. We show that most overlapping upstream transcripts use poly(A) sites within the first 2 kb of the downstream transcription units. Our work shows that, by pairing the 5′ and 3′ ends of each RNA, TIF-Seq2 can improve the annotation of complex genomes, facilitate accurate assignment of promoters to genes and easily identify transcriptionally fused genes.
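The statement about poly(A) sites within the first 2 kb of downstream units can be made concrete with a small sketch of how paired 5′ and 3′ ends allow such a measurement. This is not the TIF-Seq2 analysis pipeline. The `Molecule` and `Unit` types, the coordinates, and the function names are hypothetical and chosen only to illustrate the calculation.

```python
from typing import NamedTuple, Optional


class Molecule(NamedTuple):
    """Hypothetical RNA molecule with both ends mapped to the genome."""
    strand: str
    five_prime: int   # genomic position of the transcription start
    three_prime: int  # genomic position of the poly(A) site


class Unit(NamedTuple):
    """Hypothetical downstream transcription unit."""
    strand: str
    tss: int  # genomic position of its transcription start site


def offset_into_unit(mol: Molecule, unit: Unit) -> Optional[int]:
    """Distance from the unit's TSS to the molecule's poly(A) site, measured
    in the direction of transcription. Returns None unless the molecule
    starts upstream of the unit and ends at or beyond the unit's TSS."""
    if mol.strand != unit.strand:
        return None
    sign = 1 if unit.strand == "+" else -1
    starts_upstream = sign * (unit.tss - mol.five_prime) > 0
    offset = sign * (mol.three_prime - unit.tss)
    return offset if starts_upstream and offset >= 0 else None


def fraction_within(molecules, unit: Unit, window: int = 2000) -> Optional[float]:
    """Share of overlapping upstream molecules that terminate within the
    first `window` bases of the downstream unit."""
    offsets = [
        o for m in molecules if (o := offset_into_unit(m, unit)) is not None
    ]
    if not offsets:
        return None
    return sum(o <= window for o in offsets) / len(offsets)


unit = Unit("+", 10_000)
molecules = [
    Molecule("+", 8_000, 10_500),   # overlaps, ends 500 bases into the unit
    Molecule("+", 9_000, 13_000),   # overlaps, ends 3000 bases into the unit
    Molecule("+", 8_000, 9_500),    # ends before the unit, no overlap
    Molecule("+", 10_100, 12_000),  # starts inside the unit, not upstream
    Molecule("-", 8_000, 10_500),   # opposite strand
]
fraction = fraction_within(molecules, unit)
```

Because each molecule carries both ends, the upstream start and the termination point can be attributed to the same RNA, which short-read coverage alone cannot do for interleaved transcripts.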


BioTechniques ◽  
2020 ◽  
Vol 69 (2) ◽  
pp. 141-147
Author(s):  
Faizan Uddin ◽  
Madhulika Srivastava

Reverse transcription-PCR (RT-PCR) is the most widely employed technique for gene expression analysis owing to its high sensitivity, easy reproducibility and fast output. It has been assumed that priming RT reactions with gene-specific primers generates cDNA only from the specific RNA. However, several reports have revealed that cDNA is synthesized even without the addition of exogenous primers to RT reactions. Owing to such self-priming activity, the signals from specific strands cannot be accurately detected, which can confound expression analysis, especially in the context of overlapping bidirectional transcripts. Here, we demonstrate that purification of biotin-tagged cDNA in conjunction with alkaline denaturation can obviate the problem of background priming and enable accurate strand-specific detection of overlapping transcripts.


Biomolecules ◽  
2019 ◽  
Vol 9 (3) ◽  
pp. 87 ◽  
Author(s):  
Rosa Fontana ◽  
Michela Ranieri ◽  
Girolama La Mantia ◽  
Maria Vivo

The CDKN2a/ARF locus expresses two partially overlapping transcripts that encode two distinct proteins, namely p14ARF (p19Arf in mouse) and p16INK4a, which share no sequence identity. Initial data obtained in mice showed that both proteins are potent tumor suppressors. In line with a tumor-suppressive role, ARF-deficient mice develop lymphomas, sarcomas, and adenocarcinomas, with a median survival of one year of age. In humans, the importance of ARF inactivation in cancer is less clear, whereas a more obvious role has been documented for p16INK4a. Indeed, many alterations in human tumors result in the elimination of the entire locus, while the majority of point mutations affect p16INK4a. Nevertheless, specific mutations of p14ARF have been described in different types of human cancers such as colorectal and gastric carcinomas, melanoma and glioblastoma. The activity of the tumor suppressor ARF has been shown to rely on both p53-dependent and p53-independent functions. However, data collected in recent years have challenged the established role of this protein as a tumor suppressor. In particular, tumors retaining ARF expression evolve to metastatic and invasive phenotypes and in humans are associated with a poor prognosis. In this review, the recent evidence and the molecular mechanisms of a novel role played by ARF will be presented and discussed, both in pathological and physiological contexts.


eLife ◽  
2017 ◽  
Vol 6 ◽  
Author(s):  
Nicole L Nuckolls ◽  
María Angélica Bravo Núñez ◽  
Michael T Eickbush ◽  
Janet M Young ◽  
Jeffrey J Lange ◽  
...  

Meiotic drivers are selfish genes that bias their transmission into gametes, defying Mendelian inheritance. Despite the significant impact of these genomic parasites on evolution and infertility, few meiotic drive loci have been identified or mechanistically characterized. Here, we demonstrate a complex landscape of meiotic drive genes on chromosome 3 of the fission yeasts Schizosaccharomyces kambucha and S. pombe. We identify S. kambucha wtf4 as one of these genes that acts to kill gametes (known as spores in yeast) that do not inherit the gene from heterozygotes. wtf4 utilizes dual, overlapping transcripts to encode both a gamete-killing poison and an antidote to the poison. To enact drive, all gametes are poisoned, whereas only those that inherit wtf4 are rescued by the antidote. Our work suggests that the wtf multigene family proliferated due to meiotic drive and highlights the power of selfish genes to shape genomes, even while imposing tremendous costs to fertility.


RNA Biology ◽  
2015 ◽  
Vol 12 (5) ◽  
pp. 490-500 ◽  
Author(s):  
José Vicente Gomes-Filho ◽  
Livia Soares Zaramela ◽  
Valéria Cristina da Silva Italiani ◽  
Nitin S Baliga ◽  
Ricardo Z N Vêncio ◽  
...  

Microbiology ◽  
2011 ◽  
Vol 157 (12) ◽  
pp. 3398-3404 ◽  
Author(s):  
Emma Sevilla ◽  
Beatriz Martín-Luna ◽  
Andrés González ◽  
Jesús A. Gonzalo-Asensio ◽  
María Luisa Peleato ◽  
...  

The interplay between Fur (ferric uptake regulator) proteins and small, non-coding RNAs has been described as a key regulatory loop in several bacteria. In the filamentous cyanobacterium Anabaena sp. PCC 7120, a large dicistronic transcript encoding the putative membrane protein Alr1690 and an α-furA RNA is involved in the modulation of the global regulator FurA. In this work we report the existence of three novel antisense RNAs in cyanobacteria and show that a cis α-furA RNA is conserved in very different genomic contexts, namely in the unicellular cyanobacteria Microcystis aeruginosa PCC 7806 and Synechocystis sp. PCC 6803. Syα-fur RNA covers only part of the coding sequence of the fur orthologue sll0567, whose flanking genes encode two hypothetical proteins. Transcriptional analysis of fur and its adjacent genes in Microcystis reveals a highly compact organization of this locus involving overlapping transcripts. Maα-fur RNA spans the whole Mafur CDS and part of the flanking dnaJ and sufE sequences. In addition, Mafur seems to be part of a dicistronic operon encoding this regulator and an α-sufE RNA. These results provide new insights into the transcriptomes of two unicellular cyanobacteria and suggest that in M. aeruginosa PCC 7806, the α-fur and α-sufE RNAs might participate in a regulatory connection between the genes of the dnaJ–fur–sufE locus.

