scholarly journals Fusion-Bloom: fusion detection in assembled transcriptomes

2019 ◽  
Vol 36 (7) ◽  
pp. 2256-2257
Author(s):  
Readman Chiu ◽  
Ka Ming Nip ◽  
Inanc Birol

Abstract Summary Presence or absence of gene fusions is one of the most important diagnostic markers in many cancer types. Consequently, fusion detection methods using various genomics data types, such as RNA sequencing (RNA-seq) are valuable tools for research and clinical applications. While information-rich RNA-seq data have proven to be instrumental in discovery of a number of hallmark fusion events, bioinformatics tools to detect fusions still have room for improvement. Here, we present Fusion-Bloom, a fusion detection method that leverages recent developments in de novo transcriptome assembly and assembly-based structural variant calling technologies (RNA-Bloom and PAVFinder, respectively). We benchmarked Fusion-Bloom against the performance of five other state-of-the-art fusion detection tools using multiple datasets. Overall, we observed Fusion-Bloom to display a good balance between detection sensitivity and specificity. We expect the tool to find applications in translational research and clinical genomics pipelines. Availability and implementation Fusion-Bloom is implemented as a UNIX Make utility, available at https://github.com/bcgsc/pavfinder and released under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/. Supplementary information Supplementary data are available at Bioinformatics online.

2019 ◽  
Vol 35 (14) ◽  
pp. i225-i232 ◽  
Author(s):  
Xiao Yang ◽  
Yasushi Saito ◽  
Arjun Rao ◽  
Hyunsung John Kim ◽  
Pranav Singh ◽  
...  

Abstract Motivation Cell-free nucleic acid (cfNA) sequencing data require improvements to existing fusion detection methods along multiple axes: high depth of sequencing, low allele fractions, short fragment lengths and specialized barcodes, such as unique molecular identifiers. Results AF4 was developed to address these challenges. It uses a novel alignment-free kmer-based method to detect candidate fusion fragments with high sensitivity and orders of magnitude faster than existing tools. Candidate fragments are then filtered using a max-cover criterion that significantly reduces spurious matches while retaining authentic fusion fragments. This efficient first stage reduces the data sufficiently that commonly used criteria can process the remaining information, or sophisticated filtering policies that may not scale to the raw reads can be used. AF4 provides both targeted and de novo fusion detection modes. We demonstrate both modes in benchmark simulated and real RNA-seq data as well as clinical and cell-line cfNA data. Availability and implementation AF4 is open sourced, licensed under Apache License 2.0, and is available at: https://github.com/grailbio/bio/tree/master/fusion.


PLoS ONE ◽  
2015 ◽  
Vol 10 (5) ◽  
pp. e0125722 ◽  
Author(s):  
Yuli Li ◽  
Xiliang Wang ◽  
Tingting Chen ◽  
Fuwen Yao ◽  
Cuiping Li ◽  
...  

PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e3702 ◽  
Author(s):  
Santiago Montero-Mendieta ◽  
Manfred Grabherr ◽  
Henrik Lantz ◽  
Ignacio De la Riva ◽  
Jennifer A. Leonard ◽  
...  

Whole genome sequencing (WGS) is a very valuable resource to understand the evolutionary history of poorly known species. However, in organisms with large genomes, as most amphibians, WGS is still excessively challenging and transcriptome sequencing (RNA-seq) represents a cost-effective tool to explore genome-wide variability. Non-model organisms do not usually have a reference genome and the transcriptome must be assembledde-novo. We used RNA-seq to obtain the transcriptomic profile forOreobates cruralis, a poorly known South American direct-developing frog. In total, 550,871 transcripts were assembled, corresponding to 422,999 putative genes. Of those, we identified 23,500, 37,349, 38,120 and 45,885 genes present in the Pfam, EggNOG, KEGG and GO databases, respectively. Interestingly, our results suggested that genes related to immune system and defense mechanisms are abundant in the transcriptome ofO. cruralis. We also present a pipeline to assist with pre-processing, assembling, evaluating and functionally annotating ade-novotranscriptome from RNA-seq data of non-model organisms. Our pipeline guides the inexperienced user in an intuitive way through all the necessary steps to buildde-novotranscriptome assemblies using readily available software and is freely available at:https://github.com/biomendi/TRANSCRIPTOME-ASSEMBLY-PIPELINE/wiki.


2011 ◽  
Vol 54 (12) ◽  
pp. 1129-1133 ◽  
Author(s):  
Geng Chen ◽  
KangPing Yin ◽  
Charles Wang ◽  
TieLiu Shi

2018 ◽  
Author(s):  
Elena Bushmanova ◽  
Dmitry Antipov ◽  
Alla Lapidus ◽  
Andrey D. Prjibelski

AbstractSummaryPossibility to generate large RNA-seq datasets has led to development of various reference-based and de novo transcriptome assemblers with their own strengths and limitations. While reference-based tools are widely used in various transcriptomic studies, their application is limited to the model organisms with finished and annotated genomes. De novo transcriptome reconstruction from short reads remains an open challenging problem, which is complicated by the varying expression levels across different genes, alternative splicing and paralogous genes. In this paper we describe a novel transcriptome assembler called rnaSPAdes, which is developed on top of SPAdes genome assembler and explores surprising computational parallels between assembly of transcriptomes and single-cell genomes. We also present quality assessment reports for rnaSPAdes assemblies, compare it with modern transcriptome assembly tools using several evaluation approaches on various RNA-Seq datasets, and briefly highlight strong and weak points of different assemblers.Availability and implementationrnaSPAdes is implemented in C++ and Python and is freely available at cab.spbu.ru/software/rnaspades/.


Author(s):  
Aojie Lian ◽  
James Guevara ◽  
Kun Xia ◽  
Jonathan Sebat

Abstract Motivation As sequencing technologies and analysis pipelines evolve, de novo mutation (DNM) calling tools must be adapted. Therefore, a flexible approach is needed that can accurately identify DNMs from genome or exome sequences from a variety of datasets and variant calling pipelines. Results Here, we describe SynthDNM, a random-forest based classifier that can be readily adapted to new sequencing or variant-calling pipelines by applying a flexible approach to constructing simulated training examples from real data. The optimized SynthDNM classifiers predict de novo SNPs and indels with robust accuracy across multiple methods of variant calling. Availabilityand implementation SynthDNM is freely available on Github (https://github.com/james-guevara/synthdnm). Supplementary information Supplementary data are available at Bioinformatics online.


2022 ◽  
Vol 12 ◽  
Author(s):  
Sang-Ho Kang ◽  
Woo-Haeng Lee ◽  
Joon-Soo Sim ◽  
Niha Thaku ◽  
Saemin Chang ◽  
...  

Senna occidentalis is an annual leguminous herb that is rich in anthraquinones, which have various pharmacological activities. However, little is known about the genetics of S. occidentalis, particularly its anthraquinone biosynthesis pathway. To broaden our understanding of the key genes and regulatory mechanisms involved in the anthraquinone biosynthesis pathway, we used short RNA sequencing (RNA-Seq) and long-read isoform sequencing (Iso-Seq) to perform a spatial and temporal transcriptomic analysis of S. occidentalis. This generated 121,592 RNA-Seq unigenes and 38,440 Iso-Seq unigenes. Comprehensive functional annotation and classification of these datasets using public databases identified unigene sequences related to major secondary metabolite biosynthesis pathways and critical transcription factor families (bHLH, WRKY, MYB, and bZIP). A tissue-specific differential expression analysis of S. occidentalis and measurement of the amount of anthraquinones revealed that anthraquinone accumulation was related to the gene expression levels in the different tissues. In addition, the amounts and types of anthraquinones produced differ between S. occidentalis and S. tora. In conclusion, these results provide a broader understanding of the anthraquinone metabolic pathway in S. occidentalis.


2019 ◽  
Author(s):  
Xue-ying Zhang ◽  
Xian-zhi Sun ◽  
Sheng Zhang ◽  
Jing-hui Yang ◽  
Fang-fang Liu ◽  
...  

Abstract Abstract Background: Aphid ( Macrosiphoniella sanbourni ) stress drastically influences the yield and quality of chrysanthemum, and grafting has been widely used to improve tolerance to biotic and abiotic stresses. However, the effect of grafting on the resistance of chrysanthemum to aphids remains unclear. Therefore, we used the RNA-Seq platform to perform a de novo transcriptome assembly to analyze the self-rooted grafted chrysanthemum ( Chrysanthemum morifolium T. 'Hangbaiju') and the grafted Artermisia-chrysanthemum (grafted onto Artemisia scoparia W.) transcription response to aphid stress. Results : The results showed that there were 1337 differentially expressed genes (DEGs), among which 680 were upregulated and 667 were downregulated, in the grafted Artemisia-chrysanthemum compared to the self-rooted grafted chrysanthemum. These genes were mainly involved in sucrose metabolism, the biosynthesis of secondary metabolites, the plant hormone signaling pathway and the plant-to-pathogen pathway. KEGG and GO enrichment analyses revealed the coordinated upregulation of these genes from numerous functional categories related to aphid stress responses. In addition, we determined the physiological indicators of chrysanthemum under aphid stress, and the results were consistent with the molecular sequencing results. All evidence indicated that grafting chrysanthemum onto A. scoparia W. upregulated aphid stress responses in chrysanthemum. Conclusion: In summary, our study presents a genome-wide transcript profile of the self-rooted grafted chrysanthemum and the grafted Artemisia-chrysanthemum and provides insights into the molecular mechanisms of C. morifolium T. in response to aphid infestation. These data will contribute to further studies of aphid tolerance and the exploration of new candidate genes for chrysanthemum molecular breeding. Key words : Chrysanthemum, Grafting, Aphid stress, Gene expression, RNA-Seq


2018 ◽  
Vol 16 (1) ◽  
pp. 46-53
Author(s):  
Jonathan Chacon ◽  
Math P. Cuajungco

Background and Purpose: The reduction of cost and ease of using core laboratories or commercial sequencing companies have allowed biomedical and health researchers alike to employ reference-based genomic or transcriptomic sequencing (RNA-seq) projects to expand their work. Non-reference based data analysis, in cases of inexperienced researchers, become more challenging despite the availability of many open source and commercial software programs. Methods: We performed de novo assembly of RNA-seq data obtained from a non-model organism (Eastern Newt skin) to compare data output of two commercially available software workflows. Results: Our results show that the software packages performed satisfactorily albeit with differences in how the annotated and novel transcripts were identified and listed. Conclusion: Overall, we conclude that the use of commercial software platforms has a clear advantage to that of open source programs because of convenience with data analysis workflows. One caveat is that users need to know the software’s basic algorithm and technical approach, in order to determine the precision and validity of the data output. Thus, it is imperative that researchers fully evaluate the software according to their needs to determine their suitability.


Sign in / Sign up

Export Citation Format

Share Document