De Novo Plant Transcriptome Assembly and Annotation Using Illumina RNA-Seq Reads

Author(s):  
Stephanie C. Kerr ◽  
Federico Gaiti ◽  
Milos Tanurdzic
PLoS ONE ◽  
2015 ◽  
Vol 10 (5) ◽  
pp. e0125722 ◽  
Author(s):  
Yuli Li ◽  
Xiliang Wang ◽  
Tingting Chen ◽  
Fuwen Yao ◽  
Cuiping Li ◽  
...  

PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e3702 ◽  
Author(s):  
Santiago Montero-Mendieta ◽  
Manfred Grabherr ◽  
Henrik Lantz ◽  
Ignacio De la Riva ◽  
Jennifer A. Leonard ◽  
...  

Whole genome sequencing (WGS) is a very valuable resource to understand the evolutionary history of poorly known species. However, in organisms with large genomes, as most amphibians, WGS is still excessively challenging and transcriptome sequencing (RNA-seq) represents a cost-effective tool to explore genome-wide variability. Non-model organisms do not usually have a reference genome and the transcriptome must be assembled de-novo. We used RNA-seq to obtain the transcriptomic profile for Oreobates cruralis, a poorly known South American direct-developing frog. In total, 550,871 transcripts were assembled, corresponding to 422,999 putative genes. Of those, we identified 23,500, 37,349, 38,120 and 45,885 genes present in the Pfam, EggNOG, KEGG and GO databases, respectively. Interestingly, our results suggested that genes related to immune system and defense mechanisms are abundant in the transcriptome of O. cruralis. We also present a pipeline to assist with pre-processing, assembling, evaluating and functionally annotating a de-novo transcriptome from RNA-seq data of non-model organisms. Our pipeline guides the inexperienced user in an intuitive way through all the necessary steps to build de-novo transcriptome assemblies using readily available software and is freely available at: https://github.com/biomendi/TRANSCRIPTOME-ASSEMBLY-PIPELINE/wiki.


2011 ◽  
Vol 54 (12) ◽  
pp. 1129-1133 ◽  
Author(s):  
Geng Chen ◽  
KangPing Yin ◽  
Charles Wang ◽  
TieLiu Shi

2018 ◽  
Author(s):  
Elena Bushmanova ◽  
Dmitry Antipov ◽  
Alla Lapidus ◽  
Andrey D. Prjibelski

Summary: Possibility to generate large RNA-seq datasets has led to development of various reference-based and de novo transcriptome assemblers with their own strengths and limitations. While reference-based tools are widely used in various transcriptomic studies, their application is limited to the model organisms with finished and annotated genomes. De novo transcriptome reconstruction from short reads remains an open challenging problem, which is complicated by the varying expression levels across different genes, alternative splicing and paralogous genes. In this paper we describe a novel transcriptome assembler called rnaSPAdes, which is developed on top of SPAdes genome assembler and explores surprising computational parallels between assembly of transcriptomes and single-cell genomes. We also present quality assessment reports for rnaSPAdes assemblies, compare it with modern transcriptome assembly tools using several evaluation approaches on various RNA-Seq datasets, and briefly highlight strong and weak points of different assemblers. Availability and implementation: rnaSPAdes is implemented in C++ and Python and is freely available at cab.spbu.ru/software/rnaspades/.


2022 ◽  
Vol 12 ◽  
Author(s):  
Sang-Ho Kang ◽  
Woo-Haeng Lee ◽  
Joon-Soo Sim ◽  
Niha Thaku ◽  
Saemin Chang ◽  
...  

Senna occidentalis is an annual leguminous herb that is rich in anthraquinones, which have various pharmacological activities. However, little is known about the genetics of S. occidentalis, particularly its anthraquinone biosynthesis pathway. To broaden our understanding of the key genes and regulatory mechanisms involved in the anthraquinone biosynthesis pathway, we used short RNA sequencing (RNA-Seq) and long-read isoform sequencing (Iso-Seq) to perform a spatial and temporal transcriptomic analysis of S. occidentalis. This generated 121,592 RNA-Seq unigenes and 38,440 Iso-Seq unigenes. Comprehensive functional annotation and classification of these datasets using public databases identified unigene sequences related to major secondary metabolite biosynthesis pathways and critical transcription factor families (bHLH, WRKY, MYB, and bZIP). A tissue-specific differential expression analysis of S. occidentalis and measurement of the amount of anthraquinones revealed that anthraquinone accumulation was related to the gene expression levels in the different tissues. In addition, the amounts and types of anthraquinones produced differ between S. occidentalis and S. tora. In conclusion, these results provide a broader understanding of the anthraquinone metabolic pathway in S. occidentalis.


2019 ◽  
Author(s):  
Xue-ying Zhang ◽  
Xian-zhi Sun ◽  
Sheng Zhang ◽  
Jing-hui Yang ◽  
Fang-fang Liu ◽  
...  

Background: Aphid (Macrosiphoniella sanbourni) stress drastically influences the yield and quality of chrysanthemum, and grafting has been widely used to improve tolerance to biotic and abiotic stresses. However, the effect of grafting on the resistance of chrysanthemum to aphids remains unclear. Therefore, we used the RNA-Seq platform to perform a de novo transcriptome assembly to analyze the self-rooted grafted chrysanthemum (Chrysanthemum morifolium T. 'Hangbaiju') and the grafted Artemisia-chrysanthemum (grafted onto Artemisia scoparia W.) transcription response to aphid stress. Results: The results showed that there were 1337 differentially expressed genes (DEGs), among which 680 were upregulated and 667 were downregulated, in the grafted Artemisia-chrysanthemum compared to the self-rooted grafted chrysanthemum. These genes were mainly involved in sucrose metabolism, the biosynthesis of secondary metabolites, the plant hormone signaling pathway and the plant-to-pathogen pathway. KEGG and GO enrichment analyses revealed the coordinated upregulation of these genes from numerous functional categories related to aphid stress responses. In addition, we determined the physiological indicators of chrysanthemum under aphid stress, and the results were consistent with the molecular sequencing results. All evidence indicated that grafting chrysanthemum onto A. scoparia W. upregulated aphid stress responses in chrysanthemum. Conclusion: In summary, our study presents a genome-wide transcript profile of the self-rooted grafted chrysanthemum and the grafted Artemisia-chrysanthemum and provides insights into the molecular mechanisms of C. morifolium T. in response to aphid infestation. These data will contribute to further studies of aphid tolerance and the exploration of new candidate genes for chrysanthemum molecular breeding. Key words: Chrysanthemum, Grafting, Aphid stress, Gene expression, RNA-Seq


2018 ◽  
Vol 16 (1) ◽  
pp. 46-53
Author(s):  
Jonathan Chacon ◽  
Math P. Cuajungco

Background and Purpose: The reduction of cost and ease of using core laboratories or commercial sequencing companies have allowed biomedical and health researchers alike to employ reference-based genomic or transcriptomic sequencing (RNA-seq) projects to expand their work. Non-reference based data analysis, in cases of inexperienced researchers, becomes more challenging despite the availability of many open source and commercial software programs. Methods: We performed de novo assembly of RNA-seq data obtained from a non-model organism (Eastern Newt skin) to compare data output of two commercially available software workflows. Results: Our results show that the software packages performed satisfactorily albeit with differences in how the annotated and novel transcripts were identified and listed. Conclusion: Overall, we conclude that the use of commercial software platforms has a clear advantage over that of open source programs because of convenience with data analysis workflows. One caveat is that users need to know the software’s basic algorithm and technical approach, in order to determine the precision and validity of the data output. Thus, it is imperative that researchers fully evaluate the software according to their needs to determine its suitability.


2021 ◽  
Author(s):  
R.E. Rivera-Vicéns ◽  
C. Garcia Escudero ◽  
N. Conci ◽  
M. Eitel ◽  
G. Wörheide

The use of RNA-Seq data and the generation of de novo transcriptome assemblies have been pivotal for studies in ecology and evolution. This is distinctly true for non-model organisms, where no genome information is available; yet, studies of differential gene expression, DNA enrichment baits design, and phylogenetics can all be accomplished with the data gathered at the transcriptomic level. Multiple tools are available for transcriptome assembly, however, no single tool can provide the best assembly for all datasets. Therefore, a multi assembler approach, followed by a reduction step, is often sought to generate an improved representation of the assembly. To reduce errors in these complex analyses while at the same time attaining reproducibility and scalability, automated workflows have been essential in the analysis of RNA-Seq data. However, most of these tools are designed for species where genome data is used as reference for the assembly process, limiting their use in non-model organisms. We present TransPi, a comprehensive pipeline for de novo transcriptome assembly, with minimum user input but without losing the ability of a thorough analysis. A combination of different model organisms, kmer sets, read lengths, and read quantities were used for assessing the tool. Furthermore, a total of 49 non-model organisms, spanning different phyla, were also analyzed. Compared to approaches using single assemblers only, TransPi produces higher BUSCO completeness percentages, and a concurrent significant reduction in duplication rates. TransPi is easy to configure and can be deployed seamlessly using Conda, Docker and Singularity.

