Accuracy assessment of fusion transcript detection via read-mapping and de novo fusion transcript assembly-based methods

2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Brian J. Haas ◽  
Alexander Dobin ◽  
Bo Li ◽  
Nicolas Stransky ◽  
Nathalie Pochet ◽  
...  

Abstract. Background: Accurate fusion transcript detection is essential for comprehensive characterization of cancer transcriptomes. Over the last decade, multiple bioinformatic tools have been developed to predict fusions from RNA-seq, based on either read mapping or de novo fusion transcript assembly. Results: We benchmark 23 different methods including applications we develop, STAR-Fusion and TrinityFusion, leveraging both simulated and real RNA-seq. Overall, STAR-Fusion, Arriba, and STAR-SEQR are the most accurate and fastest for fusion detection on cancer transcriptomes. Conclusion: The lower accuracy of de novo assembly-based methods notwithstanding, they are useful for reconstructing fusion isoforms and tumor viruses, both of which are important in cancer research.

2019 ◽  
Vol 11 (1) ◽  
Author(s):  
Anastasiya Kazachenka ◽  
George R. Young ◽  
Jan Attig ◽  
Chrysoula Kordella ◽  
Eleftheria Lamprianidou ◽  
...  

Abstract. Background: Myelodysplastic syndromes (MDS) and acute myeloid leukaemia (AML) are characterised by abnormal epigenetic repression and differentiation of bone marrow haematopoietic stem cells (HSCs). Drugs that reverse epigenetic repression, such as 5-azacytidine (5-AZA), induce haematological improvement in half of treated patients. Although the mechanisms underlying therapy success are not yet clear, induction of endogenous retroelements (EREs) has been hypothesised. Methods: Using RNA sequencing (RNA-seq), we compared the transcription of EREs in bone marrow HSCs from a new cohort of MDS and chronic myelomonocytic leukaemia (CMML) patients before and after 5-AZA treatment with HSCs from healthy donors and AML patients. We further examined ERE transcription using the most comprehensive annotation of ERE-overlapping transcripts expressed in HSCs, generated here by de novo transcript assembly and supported by full-length RNA-seq. Results: Consistent with prior reports, we found that treatment with 5-AZA increased the representation of ERE-derived RNA-seq reads in the transcriptome. However, such increases were comparable between treatment responses and failures. The extended view of HSC transcriptional diversity offered by de novo transcript assembly argued against 5-AZA-responsive EREs as determinants of the outcome of therapy. Instead, it uncovered pre-treatment expression and alternative splicing of developmentally regulated gene transcripts as predictors of the response of MDS and CMML patients to 5-AZA treatment. Conclusions: Our study identifies the developmentally regulated transcriptional signatures of protein-coding and non-coding genes, rather than EREs, as correlates of a favourable response of MDS and CMML patients to 5-AZA treatment and offers novel candidates for further evaluation.


2021 ◽  
Author(s):  
Francisco R Marin ◽  
Alberto Davalos ◽  
Dylan Kiltschewskij-Brown ◽  
Maria C Crespo ◽  
Murray J Cairns ◽  
...  

Although the genomes of many edible mushrooms have been sequenced, studies on fungal miRNAs are scarce. Most bioinformatic tools are designed for plants or animals, but fungal miRNA processing and expression share similarities and differences with both kingdoms. Moreover, since mushroom species such as Agaricus bisporus (white button mushroom) are frequently consumed as food, it is still debated whether their miRNAs might be assimilated, perhaps within extracellular vesicles (i.e., exosomes). Therefore, A. bisporus RNA-seq data were studied in order to identify potential de novo miRNA-like small RNAs (milRNAs) that might allow their later detection in the diet. Results pointed to 1 already known and 37 de novo milRNAs. Three milRNAs were selected for RT-qPCR experiments. Precursors and mature milRNAs were found in the edible parts (caps and stipes), validating the predictions carried out in silico. When their potential gene targets were investigated, most were found to be involved in primary and secondary metabolic regulation. However, when the human transcriptome is used as the target, the results suggest that they might interfere with important biological processes related to cancer, infectious processes and neurodegenerative diseases.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Surajit Bhattacharya ◽  
Hayk Barseghyan ◽  
Emmanuèle C. Délot ◽  
Eric Vilain

Abstract. Background: Whole genome sequencing is effective at identification of small variants, but because it is based on short reads, assessment of structural variants (SVs) is limited. The advent of Optical Genome Mapping (OGM), which utilizes long fluorescently labeled DNA molecules for de novo genome assembly and SV calling, has allowed for increased sensitivity and specificity in SV detection. However, compared to small variant annotation tools, OGM-based SV annotation software has seen little development, and currently available SV annotation tools do not provide sufficient information for determination of variant pathogenicity. Results: We developed an R-based package, nanotatoR, which provides comprehensive annotation as a tool for SV classification. nanotatoR uses both external (DGV; DECIPHER; Bionano Genomics BNDB) and internal (user-defined) databases to estimate SV frequency. Human genome reference GRCh37/38-based BED files are used to annotate SVs with overlapping, upstream, and downstream genes. Overlap percentages and distances for nearest genes are calculated and can be used for filtration. A primary gene list is extracted from public databases based on the patient’s phenotype and used to filter genes overlapping SVs, providing the analyst with an easy way to prioritize variants. If available, expression of overlapping or nearby genes of interest is extracted (e.g. from an RNA-Seq dataset), allowing the user to assess the effects of SVs on the transcriptome. Most quality-control filtration parameters are customizable by the user. The output is given in an Excel file format, subdivided into multiple sheets based on SV type and inheritance pattern (INDELs, inversions, translocations, de novo, etc.). nanotatoR passed all quality and run time criteria of Bioconductor, where it was accepted in the April 2019 release. We evaluated nanotatoR’s annotation capabilities using publicly available reference datasets: the singleton sample NA12878, mapped with two types of enzyme labeling, and the NA24143 trio. nanotatoR was also able to accurately filter the known pathogenic variants in a cohort of patients with Duchenne Muscular Dystrophy for which we had previously demonstrated the diagnostic ability of OGM. Conclusions: The extensive annotation enables users to rapidly identify potential pathogenic SVs, a critical step toward use of OGM in the clinical setting.


Author(s):  
Martin Philpott ◽  
Jonathan Watson ◽  
Anjan Thakurta ◽  
Tom Brown ◽  
Tom Brown ◽  
...  

Abstract. Here we describe single-cell corrected long-read sequencing (scCOLOR-seq), which enables error correction of barcode and unique molecular identifier oligonucleotide sequences and permits standalone cDNA nanopore sequencing of single cells. Barcodes and unique molecular identifiers are synthesized using dimeric nucleotide building blocks that allow error detection. We illustrate the use of the method for evaluating barcode assignment accuracy, differential isoform usage in myeloma cell lines, and fusion transcript detection in a sarcoma cell line.


Plants ◽  
2021 ◽  
Vol 10 (7) ◽  
pp. 1465
Author(s):  
Ramon de Koning ◽  
Raphaël Kiekens ◽  
Mary Esther Muyoka Toili ◽  
Geert Angenon

Raffinose family oligosaccharides (RFO) play an important role in plants but are also considered to be antinutritional factors. A profound understanding of the galactinol and RFO biosynthetic gene families and the expression patterns of the individual genes is a prerequisite for the sustainable reduction of the RFO content in the seeds, without compromising normal plant development and functioning. In this paper, an overview of the annotation and genetic structure of all galactinol- and RFO biosynthesis genes is given for soybean and common bean. In common bean, three galactinol synthase genes, two raffinose synthase genes and one stachyose synthase gene were identified for the first time. To discover the expression patterns of these genes in different tissues, two expression atlases have been created through re-analysis of publicly available RNA-seq data. De novo expression analysis through an RNA-seq study during seed development of three varieties of common bean gave more insight into the expression patterns of these genes during the seed development. The results of the expression analysis suggest that different classes of galactinol- and RFO synthase genes have tissue-specific expression patterns in soybean and common bean. With the obtained knowledge, important galactinol- and RFO synthase genes that specifically play a key role in the accumulation of RFOs in the seeds are identified. These candidate genes may play a pivotal role in reducing the RFO content in the seeds of important legumes which could improve the nutritional quality of these beans and would solve the discomforts associated with their consumption.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Álvaro Figueroa ◽  
Antonio Brante ◽  
Leyla Cárdenas

Abstract. The polychaete Boccardia wellingtonensis is a poecilogonous species that produces different larval types. Females may lay Type I capsules, in which only planktotrophic larvae are present, or Type III capsules that contain planktotrophic and adelphophagic larvae as well as nurse eggs. While planktotrophic larvae do not feed during encapsulation, adelphophagic larvae develop by feeding on nurse eggs and on other larvae inside the capsules and hatch at the juvenile stage. Previous works have not found differences in the morphology between the two larval types; thus, the factors explaining contrasting feeding abilities in larvae of this species are still unknown. In this paper, we use a transcriptomic approach to study the cellular and genetic mechanisms underlying the different larval trophic modes of B. wellingtonensis. By using approximately 624 million high-quality reads, we assemble the de novo transcriptome with 133,314 contigs, coding 32,390 putative proteins. We identify 5221 genes that are up-regulated in larval stages compared to their expression in adult individuals. The gene expression profile differed between larval trophic modes, with genes involved in lipid metabolism and chaetogenesis overexpressed in planktotrophic larvae. In contrast, up-regulated genes in adelphophagic larvae were associated with DNA replication and mRNA synthesis.


Gene ◽  
2018 ◽  
Vol 645 ◽  
pp. 146-156 ◽  
Author(s):  
Soumyadev Sarkar ◽  
Somnath Chakravorty ◽  
Avishek Mukherjee ◽  
Debanjana Bhattacharya ◽  
Semantee Bhattacharya ◽  
...  

PLoS ONE ◽  
2016 ◽  
Vol 11 (3) ◽  
pp. e0150273 ◽  
Author(s):  
Shivanjali Kotwal ◽  
Sanjana Kaul ◽  
Pooja Sharma ◽  
Mehak Gupta ◽  
Rama Shankar ◽  
...  
