transcriptome assembly
Recently Published Documents


TOTAL DOCUMENTS: 731 (FIVE YEARS 347)

H-INDEX: 41 (FIVE YEARS 8)

Agronomy ◽  
2022 ◽  
Vol 12 (1) ◽  
pp. 184
Author(s):  
Tae-Heon Kim ◽  
Young-Mi Yoon ◽  
Jin-Cheon Park ◽  
Jong-Ho Park ◽  
Kyong-Ho Kim ◽  
...  

Cultivated oat (Avena sativa L.) is an important cereal crop that has captured interest worldwide due to its nutritional properties and associated health benefits. Despite this interest, oat has lagged behind other cereal crops in genome studies and the development of DNA markers due to its large and complex genome. RNA-Seq technology has been widely used for transcriptome analysis, functional gene study, and DNA marker development. In this study, we performed the transcriptome sequencing of 10 oat varieties at the seedling stage using the Illumina platform for the development of DNA markers. Between 31,187,392 and 41,304,176 trimmed reads (an average of 34,322,925) were generated per variety. The trimmed reads of all 10 varieties were assembled, yielding a total of 128,244 unigenes with an average length of 1071.7 bp and an N50 of 1752 bp. According to gene ontology (GO) analysis, 30.7% of unigenes were assigned to the parent term “catalytic activity” in the molecular function category. Of the 1273 dCAPS markers developed using 491 genotype-specific SNPs, 30 markers exhibiting polymorphism in 28 oat varieties were finally selected. The transcriptome data of these oat varieties could be used for functional studies of the oat seedling stage and as a source of sequence-variation information for DNA marker development. These 30 dCAPS markers will be utilized for oat genetic analysis, cultivar identification, and breeders’ rights protection.
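
The abstract above summarizes its assembly by average unigene length and N50. As a minimal sketch of how those two statistics are conventionally computed from a list of contig lengths (the function name `n50` and the toy lengths are illustrative assumptions, not taken from the study or its pipeline):

```python
def n50(lengths):
    """Return the N50 of a collection of contig (unigene) lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)   # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # crossed the 50% mark
            return length


# Toy lengths in bp, purely hypothetical (not data from the oat assembly).
toy_lengths = [2000, 1500, 1000, 500, 300, 200]
mean_length = sum(toy_lengths) / len(toy_lengths)
print(f"mean = {mean_length:.1f} bp, N50 = {n50(toy_lengths)} bp")
```

Because N50 weights long contigs by the bases they contribute, it is typically larger than the mean length, which is consistent with the reported 1752 bp versus 1071.7 bp.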


2022 ◽  
Author(s):  
Sagnik Banerjee ◽  
Carson Andorf

Advances in technology have enabled sequencing machines to produce vast amounts of genetic data, causing an increase in storage demands. Most genomic software utilizes read alignments for several purposes, including transcriptome assembly and gene count estimation. Herein we present ABRIDGE, a state-of-the-art compressor for SAM alignment files offering users both lossless and lossy compression options. This reference-based file compressor achieves the best compression ratio among all compression software, ensuring lower space demand and faster file transmission. Central to the software is a novel algorithm that retains non-redundant information. This new approach has allowed ABRIDGE to achieve a compression 16% higher than the second-best compressor for RNA-Seq reads and over 35% for DNA-Seq reads. ABRIDGE also offers users the option to randomly access locations without having to decompress the entire file. ABRIDGE is distributed under the MIT license and can be obtained from GitHub and Docker Hub. We anticipate that the user community will adopt ABRIDGE within their existing pipelines, encouraging further research in this domain.


Author(s):  
Masanao Sato ◽  
Masahide Seki ◽  
Yutaka Suzuki ◽  
Shoko Ueki

Heterosigma akashiwo is a eukaryotic, cosmopolitan, and unicellular alga (class: Raphidophyceae) that produces fish-killing blooms. There is substantial scientific and practical interest in the ecophysiological characteristics that determine its bloom dynamics and its adaptation to broad climate zones. Well-annotated genomic and genetic sequence information enables researchers to characterize organisms using modern molecular technology. As of December 2021, the chloroplast and mitochondrial genome sequences and transcriptome sequence assembly (TSA) datasets of limited size are available for H. akashiwo in the NCBI nucleotide database; there is no doubt that more genetic information will greatly enhance the biological characterization of the species. Here, we conducted H. akashiwo RNA sequencing, a de novo transcriptome assembly (NCBI TSA ICRV01) of a large number of high-quality short-read sequences, and the functional annotation of predicted genes. Based on our transcriptome, we confirmed that the organism possesses genes that were predicted to function in phagocytosis, supporting the earlier observations of H. akashiwo bacterivory. Along with its capability for photosynthesis, the mixotrophy of H. akashiwo may partially explain its high adaptability to various environmental conditions. Our study will provide an important toehold for deciphering H. akashiwo ecophysiology at the molecular level.


2022 ◽  
Vol 12 ◽  
Author(s):  
Sang-Ho Kang ◽  
Woo-Haeng Lee ◽  
Joon-Soo Sim ◽  
Niha Thaku ◽  
Saemin Chang ◽  
...  

Senna occidentalis is an annual leguminous herb that is rich in anthraquinones, which have various pharmacological activities. However, little is known about the genetics of S. occidentalis, particularly its anthraquinone biosynthesis pathway. To broaden our understanding of the key genes and regulatory mechanisms involved in the anthraquinone biosynthesis pathway, we used short-read RNA sequencing (RNA-Seq) and long-read isoform sequencing (Iso-Seq) to perform a spatial and temporal transcriptomic analysis of S. occidentalis. This generated 121,592 RNA-Seq unigenes and 38,440 Iso-Seq unigenes. Comprehensive functional annotation and classification of these datasets using public databases identified unigene sequences related to major secondary metabolite biosynthesis pathways and critical transcription factor families (bHLH, WRKY, MYB, and bZIP). A tissue-specific differential expression analysis of S. occidentalis and measurement of anthraquinone content revealed that anthraquinone accumulation was related to gene expression levels in the different tissues. In addition, the amounts and types of anthraquinones produced differ between S. occidentalis and S. tora. In conclusion, these results provide a broader understanding of the anthraquinone metabolic pathway in S. occidentalis.


2022 ◽  
Author(s):  
Karl Johan Westrin ◽  
Warren W Kretzschmar ◽  
Olof Emanuelsson

Motivation: Transcriptome assembly from RNA sequencing data in species without a reliable reference genome has to be performed de novo, but studies have shown that de novo methods often reconstruct transcript isoforms inadequately. This impedes the study of alternative splicing, in particular for lowly expressed isoforms. Result: We present the de novo transcript isoform assembler ClusTrast, which clusters a set of guiding contigs by similarity, aligns short reads to the guiding contigs, and assembles each clustered set of short reads individually. We tested ClusTrast on datasets from six eukaryotic species and showed that it reconstructed more expressed known isoforms than any of the other tested de novo assemblers, at a moderate reduction in precision. An appreciable fraction of these isoforms were reconstructed to at least 95% of their length. We suggest that ClusTrast will be useful for studying alternative splicing in the absence of a reference genome. Availability and implementation: The code and usage instructions are available at https://github.com/karljohanw/clustrast.


2021 ◽  
Author(s):  
Thiago Britto-Borges ◽  
Volker Boehm ◽  
Niels H Gehring ◽  
Christoph Dieterich

Alternative splicing is a tightly regulated co- and post-transcriptional process contributing to the transcriptome diversity observed in eukaryotes. Several methods for detecting differential junction usage (DJU) from RNA sequencing (RNA-seq) datasets exist. Yet, efforts to integrate the results from DJU methods are lacking. Here, we present Baltica, a framework that provides workflows for quality control, de novo transcriptome assembly with StringTie2, and currently 4 DJU methods: rMATS, JunctionSeq, Majiq, and LeafCutter. Baltica puts the results from different DJU methods into context by integrating the results at the junction level. We present Baltica using 2 datasets, one containing known artificial transcripts (SIRVs) and the second dataset of paired Illumina and Oxford Nanopore Technologies RNA-seq. The data integration allows the user to compare the performance of the tools and reveals that JunctionSeq outperforms the other methods, in terms of F1 score, for both datasets. Finally, we demonstrate for the first time that meta-classifiers trained on scores of multiple methods outperform classifiers trained on scores of a single method, emphasizing the application of our data integration approach for differential splicing identification. Baltica is available at https://github.com/dieterich-lab/Baltica under MIT license.


2021 ◽  
Author(s):  
Sindhu Agastikumar ◽  
Maheswari Patturaj ◽  
Aghila Samji ◽  
Balasubramanian Aiyer ◽  
Aiswarya Munnusamy ◽  
...  

The endemic and precious timber tree Pterocarpus santalinus L. f. (red sanders) is a drought-hardy species prioritized for conservation in peninsular India due to its high risk of illegal timber harvest. It is found only in the Eastern Ghats of India and has become threatened owing to overexploitation of its valuable timber. The development of genomic resources, particularly simple sequence repeat (SSR) markers, is essential for strict implementation of in situ conservation measures and for DNA-based management of red sanders genetic resources. However, a lack of genomic data and efficient molecular markers limits the study of its spatial and temporal population genetic structure, the identification of diversity hotspots, and tree improvement. The current study aims at a comprehensive molecular characterization of red sanders through somatic chromosome counts, flow cytometry and EST-SSR analyses. The results revealed, for the first time in this species, that red sanders is diploid with 2n = 20 and has a 2C genome size of 0.7872 ± 0.0561 pg. A total of 3128 EST-SSRs were detected in 25,854 de novo assembled unigenes from transcriptome data, and primer sets were designed for 1953 SSRs. Fifty-nine EST-SSR markers were evaluated for polymorphism in natural populations of red sanders, and 13 were found to be suitable for genetic analysis. Two major transcription factor families, bHLH and ERF, responsible for abiotic stress response and secondary metabolite synthesis, were analysed, which would provide the foundation for further research on the production of medicinally important biocompounds.


2021 ◽  
Vol 5 (3) ◽  
pp. e202101207
Author(s):  
Julien Prunier ◽  
Alexandra Carrier ◽  
Isabelle Gilbert ◽  
William Poisson ◽  
Vicky Albert ◽  
...  

Rangifer tarandus has experienced recent drastic population size reductions throughout its circumpolar distribution, and preserving the species implies conserving its genetic diversity. To facilitate genomic studies of the species' populations, we improved the genome assembly by combining long reads and linked reads and obtained a new highly accurate and contiguous genome assembly made of 13,994 scaffolds (L90 = 131 scaffolds). Using de novo transcriptome assembly of RNA-sequencing reads and similarity with annotated human gene sequences, 17,394 robust gene models were identified. As copy number variations (CNVs) likely play a role in adaptation, we additionally investigated these variations among 20 genomes representing three caribou ecotypes (migratory, boreal and mountain). A total of 1,698 large CNVs (length > 1 kb) were identified, with a genome distribution that includes hotspots. Forty-three large CNVs were particularly distinctive of the migratory and sedentary ecotypes and included genes annotated for functions likely related to the expected adaptations. This work includes the first publicly available annotation of the caribou genome and the first assembly allowing genome architecture analyses, including the likely adaptive CNVs reported here.


Author(s):  
Meltem Kuruş ◽  
Soheil Akbari ◽  
Doğa Eskier ◽  
Ahmet Bursalı ◽  
Kemal Ergin ◽  
...  

The generation and use of induced pluripotent stem cells (iPSCs) in order to obtain all differentiated adult cell morphologies without requiring embryonic stem cells is one of the most important discoveries in molecular biology. Among the uses of iPSCs is the generation of neuron cells and organoids to study the biological cues underlying neuronal and brain development, in addition to neurological diseases. These iPSC-derived neuronal differentiation models allow us to examine the gene regulatory factors involved in such processes. Among these regulatory factors are long non-coding RNAs (lncRNAs), genes that are transcribed from the genome and have key biological functions in establishing phenotypes, but are frequently not included in studies focusing on protein coding genes. Here, we provide a comprehensive analysis and overview of the coding and non-coding transcriptome during multiple stages of the iPSC-derived neuronal differentiation process using RNA-seq. We identify previously unannotated lncRNAs via genome-guided de novo transcriptome assembly, and the distinct characteristics of the transcriptome during each stage, including differentially expressed and stage specific genes. We further identify key genes of the human neuronal differentiation network, representing novel candidates likely to have critical roles in neurogenesis using coexpression network analysis. Our findings provide a valuable resource for future studies on neuronal differentiation.

