Construction of a reference transcriptome for the analysis of male sterility in sugi (Cryptomeria japonica D. Don) focusing on MALE STERILITY 1 (MS1)

PLoS ONE ◽  
2021 ◽  
Vol 16 (2) ◽  
pp. e0247180
Author(s):  
Fu-Jin Wei ◽  
Saneyoshi Ueno ◽  
Tokuko Ujino-Ihara ◽  
Maki Saito ◽  
Yoshihiko Tsumura ◽  
...  

Sugi (Cryptomeria japonica D. Don) is an important conifer used for afforestation in Japan. As the genome of this species is 11 Gbp, it is too large to assemble within a short timeframe. Transcriptomics is one approach that can address this deficiency. Here we designed a three-stage workflow to assemble the transcriptome de novo using Oases and Trinity. The three stages were independent assembly, automatic and semi-manual integration, and refinement by filtering out potential contamination. We identified a set of 49,795 cDNAs and an equal number of translated proteins. According to the BUSCO benchmark, 87.01% of the cDNAs identified were complete genes, and 78.47% were complete and single-copy genes. Compared to other full-length cDNA resources collected by Sanger and PacBio sequencers, the extent of the coverage in our dataset was the highest, indicating that these data can be safely used for further studies. When two tissue-specific libraries were compared, there were significant expression differences between male strobili and leaf and bark sets. Moreover, subtle expression differences between male-fertile and male-sterile libraries were detected. Orthologous genes from other model plants and conifer species were identified. We demonstrated that our transcriptome assembly output (CJ3006NRE) can serve as a reference transcriptome for future functional genomics and evolutionary biology studies.

Author(s):  
Fu-Jin Wei ◽  
Saneyoshi Ueno ◽  
Tokuko Ujino-Ihara ◽  
Maki Saito ◽  
Yoshihiko Tsumura ◽  
...  

Abstract. Sugi (Cryptomeria japonica D. Don) is an important conifer used for afforestation in Japan. The field of functional genomics is developing rapidly, and the genomics of this gymnosperm species is currently being studied. However, its genome is 11 Gbp, which is still too large to assemble well within a short period of time. Transcriptomics is another approach to address this, and it is also a necessary step in obtaining complete genomic data. Here we designed a three-stage assembly workflow using the de novo transcriptome assembly tools Oases and Trinity. The three stages are independent assembly, automatic and semi-automatic integration, and refinement by filtering out potential contamination. We found a set of 49,795 cDNAs and an equal number of translated proteins (CJ3006NRE). According to the BUSCO benchmark, 87.01% were complete genes, including a high proportion (78.47%) of "complete and single-copy" genes. Compared to other full-length cDNA resources, the extent of the coverage in CJ3006NRE suggests that it may be used as the standard for further studies. When two tissue-specific libraries were compared, principal component analysis (PCA) showed that there were significant differences between male strobili and leaf and bark sets. The three most highly upregulated transcription factors stood out as orthologs of angiosperm genes. The signature-like domains identified in these transcription factors demonstrated the accuracy of the assembly. Based on the evaluation of different resources, we demonstrate that our transcriptome assembly output is valuable and useful for further studies in functional genomics and evolutionary biology.


2020 ◽  
Vol 13 (1) ◽  
Author(s):  
J. Fibla ◽  
N. Oromi ◽  
M. Pascual-Pons ◽  
J. L. Royo ◽  
A. Palau ◽  
...  

Abstract. Objectives: The brown trout is a salmonid species with a high commercial value in Europe. Life history and spawning behaviour include resident (Salmo trutta m. fario) and migratory (Salmo trutta m. trutta) ecotypes. The main objective is to apply RNA-seq technology to obtain a reference transcriptome of two key tissues, brain and muscle, of the riverine trout Salmo trutta m. fario. Having a reference transcriptome of the resident form will complement the genomic resources of salmonid species. Data description: We generated two cDNA libraries from pooled RNA samples, isolated from muscle and brain tissues of adult individuals of Salmo trutta m. fario, which were sequenced by Illumina technology. Raw reads were subjected to de novo transcriptome assembly using Trinity, and coding regions were predicted by TransDecoder. A final set of 35,049 non-redundant ORF unigenes was annotated. Differential expression between tissues was evaluated with Cuffdiff. A False Discovery Rate (FDR) ≤ 0.01 was used as the threshold for significant differential expression, allowing key differentially expressed unigenes to be identified. Finally, we identified SNP variants that will be useful tools for population genomic studies.
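The FDR threshold mentioned above can be illustrated with a minimal sketch. It assumes a Benjamini-Hochberg style adjustment of per-gene p-values; the function name `benjamini_hochberg` and the toy p-values are hypothetical and are not taken from the study, and the sketch does not reproduce Cuffdiff's internal procedure.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum
    # of p * n / rank over all ranks at or above the current one.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical p-values for four unigenes (illustrative only).
toy_p_values = [0.0001, 0.004, 0.03, 0.2]
toy_q_values = benjamini_hochberg(toy_p_values)

# Apply the FDR <= 0.01 cut-off described in the abstract.
significant = [q <= 0.01 for q in toy_q_values]
print(significant)
```

With these toy values only the first two unigenes pass the 0.01 cut-off; real analyses would apply the same idea to thousands of tests.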


2017 ◽  
Author(s):  
Mariana B. Grizante ◽  
Marc Tollis ◽  
Juan J. Rodriguez ◽  
Ofir Levy ◽  
Michael J. Angilletta ◽  
...  

Abstract. Background: The eastern fence lizard (Sceloporus undulatus) has been a model species for ecological and evolutionary research. Genomic and transcriptomic resources for this species would promote investigation of genetic mechanisms that underpin plastic responses to environmental stress, such as climate warming. Moreover, such resources would aid comparative studies of complex traits at the molecular level, such as the transition from oviparous to viviparous reproduction, which happened at least four times within Sceloporus. Findings: A de novo transcriptome assembly for Sceloporus undulatus, Sund_v1.0, was generated using over 179 million Illumina reads obtained from three tissues (whole brain, skeletal muscle, and embryo) as well as previously reported liver sequences. The Sund_v1.0 assembly had an average contig length of 782 nucleotides and an E90N50 statistic of 2,550 nucleotides. Comparing S. undulatus transcripts with the benchmarking universal single-copy orthologs (BUSCO) for tetrapod species yielded 97.2% gene representation. A total of 13,422 protein-coding orthologs were identified in comparison to the genome of the green anole lizard, Anolis carolinensis, which is the closest related species with genomic data available. Conclusions: The multi-tissue transcriptome of S. undulatus is the first for a member of the family Phrynosomatidae, offering an important resource to advance studies of adaptation in this species and genomic research in reptiles.


PeerJ ◽  
2016 ◽  
Vol 4 ◽  
pp. e1616 ◽  
Author(s):  
David A. Anderson ◽  
Marcus E. Walz ◽  
Ernesto Weil ◽  
Peter Tonellato ◽  
Matthew C. Smith

Climate change-driven coral disease outbreaks have led to widespread declines in coral populations. Early work on coral genomics established that corals have a complex innate immune system, and whole-transcriptome gene expression studies have revealed mechanisms by which the coral immune system responds to stress and disease. The present investigation expands bioinformatic data available to study coral molecular physiology through the assembly and annotation of a reference transcriptome of the Caribbean reef-building coral, Orbicella faveolata. Samples were collected during a warm water thermal anomaly, coral bleaching event and Caribbean yellow band disease outbreak in 2010 in Puerto Rico. Multiplex sequencing of RNA on the Illumina GAIIx platform and de novo transcriptome assembly by Trinity produced 70,745,177 raw short-sequence reads and 32,463 O. faveolata transcripts, respectively. The reference transcriptome was annotated with gene ontologies, mapped to KEGG pathways, and a predicted proteome of 20,488 sequences was generated. Protein families and signaling pathways that are essential in the regulation of innate immunity across phyla were investigated in depth. Results were used to develop models of evolutionarily conserved Wnt, Notch, Rig-like receptor, Nod-like receptor, and Dicer signaling. O. faveolata is a coral species that has been studied widely under climate-driven stress and disease, and the present investigation provides new data on the genes that putatively regulate its immune system.


Plants ◽  
2019 ◽  
Vol 8 (11) ◽  
pp. 438
Author(s):  
Yong-Bi Fu ◽  
Pingchuan Li ◽  
Bill Biligetu

Chloroplast (cp) genomics will play an important role in the characterization of crop wild relative germplasm conserved in worldwide gene banks, thanks to the advances in genome sequencing. We applied a multiplexed shotgun sequencing procedure to sequence the cp genomes of 25 Avena species with variable ploidy levels. Bioinformatics analysis of the acquired sequences generated 25 de novo genome assemblies ranging from 135,557 to 136,006 bp. The gene annotations revealed 130 genes and their duplications, along with four to six pseudogenes, for each genome. Few differences in genome structure and gene arrangement were observed across the 25 species. Polymorphism analyses identified 1313 polymorphic sites and revealed an average of 277 microsatellites per genome. Greater nucleotide diversity was observed in the short single-copy region. Genome-wide scanning of selection signals suggested that six cp genes were under positive selection on some amino acids. These research outputs allow for a better understanding of oat cp genomes and evolution, and they form an essential set of cp genomic resources for the studies of oat evolutionary biology and for oat wild relative germplasm characterization.


2020 ◽  
Author(s):  
C. Molitor ◽  
T.J. Kurowski ◽  
P.M. Fidalgo de Almeida ◽  
P. Eerolla ◽  
D.J. Spindlow ◽  
...  

Abstract. Solanum sitiens is a self-incompatible wild relative of tomato, characterised by salt and drought resistance traits, with the potential to contribute to crop improvement in cultivated tomato. This species has a distinct morphology, classification and ecotype compared to other stress resistant wild tomato relatives such as S. pennellii and S. chilense. Therefore, the availability of a high-quality reference genome for S. sitiens will facilitate the genetic and molecular understanding of salt and drought resistance. Here, we present a de novo genome and transcriptome assembly for S. sitiens (Accession LA1974). A hybrid assembly strategy was followed using Illumina short reads (∼159X coverage) and PacBio long reads (∼44X coverage), generating a total of ∼262 Gbp of DNA sequence; in addition, ∼2,670 Gbp of BioNano data was obtained. A reference genome of 1,245 Mbp, arranged in 1,481 scaffolds with an N50 of 1,826 Mbp was generated. Genome completeness was estimated at 95% using the Benchmarking Universal Single-Copy Orthologs (BUSCO) and the K-mer Analysis Tool (KAT); this is within the range of current high-quality reference genomes for other tomato wild relatives. Additionally, we identified three large inversions compared to S. lycopersicum, containing several drought resistance related genes, such as beta-amylase 1 and YUCCA7. In addition, ∼63 Gbp of RNA-Seq were generated to support the prediction of 31,164 genes from the assembly, and to perform a de novo transcriptome assembly. Some of the protein clusters unique to S. sitiens were associated with genes involved in drought and salt resistance, including GLO1 and FQR1. This first reference genome for S. sitiens will provide a valuable resource to progress QTL studies to the gene level, and will assist molecular breeding to improve crop production in water-limited environments.
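For readers unfamiliar with the N50 statistic quoted for assemblies such as this one, the following is a minimal sketch of how it is commonly computed: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are hypothetical and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    # Consider the longest sequences first.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembly size is the N50.
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (arbitrary units, illustrative only).
toy_scaffolds = [2, 3, 4, 5, 6]
print(n50(toy_scaffolds))
```

In this toy case the total is 20, and the two longest scaffolds (6 and 5) already cover more than half of it, so the N50 is 5.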


2019 ◽  
Author(s):  
Haley C. Glass ◽  
Amanda D. Melin ◽  
Steven M. Vamosi

Abstract. Background: Tetrodotoxin (TTX) is a potent neurotoxin used in anti-predator defense by several aquatic species, including the rough-skinned newt, Taricha granulosa. While several possible biological sources of newt TTX have been investigated, mounting evidence suggests a genetic, endogenous origin. We present here a de novo transcriptome assembly and annotation of dorsal skin samples from the tetrodotoxin-bearing species T. granulosa, to facilitate the study of putative genetic mechanisms of TTX expression. Findings: Approximately 211 million read pairs were assembled into 245,734 transcripts using the Trinity de novo assembly method. Of the assembled transcripts, we were able to annotate 34% by comparing them to databases of sequences with known functions, suggesting that many transcripts are unique to the rough-skinned newt. Our assembly has near-complete sequence information for an estimated 83% of genes based on Benchmarking Universal Single Copy Orthologs. We also utilized other comparative methods to assess the quality of our assembly. The T. granulosa assembly was compared with that of the Japanese fire-belly newt, Cynops pyrrhogaster, and they were found to share a total of 30,556 orthologous sequences (12.9% gene set). Conclusions: We provide a reference assembly for Taricha granulosa that will enable downstream differential expression and comparative transcriptomics analyses. This publicly available transcriptome assembly and annotation dataset will facilitate the investigation of a wide range of questions concerning amphibian adaptive radiation, and the elucidation of mechanisms of tetrodotoxin defense in Taricha granulosa and other TTX-bearing species.


2016 ◽  
Author(s):  
Jared Mamrot ◽  
Roxane Legaie ◽  
Stacey J Ellery ◽  
Trevor Wilson ◽  
David K. Gardner ◽  
...  

Abstract. Background: Spiny mice of the genus Acomys are small desert-dwelling rodents that display physiological characteristics not typically found in rodents. Recent investigations have reported a menstrual cycle and scar-free wound healing in this species; characteristics that are exceedingly rare in mammals, and of considerable interest to the scientific community. These unique physiological traits, and the potential for spiny mice to accurately model human diseases, are driving increased use of this genus in biomedical research. However, little genetic information is currently available for Acomys, limiting the application of some modern investigative techniques. This project aimed to generate a reference transcriptome assembly for the common spiny mouse (Acomys cahirinus). Results: Illumina RNA sequencing of male and female spiny mice produced 451 million 150 bp paired-end reads from 15 organ types. An extensive survey of de novo transcriptome assembly approaches was conducted on high-quality reads using Trinity, SOAPdenovo-Trans, and Velvet/Oases at multiple kmer lengths, generating 49 single-kmer assemblies from this dataset, with and without in silico normalization and probabilistic error correction. Merging transcripts from the 49 individual single-kmer assemblies into a single meta-assembly of non-redundant transcripts using the EvidentialGene ‘tr2aacds’ pipeline produced the highest quality transcriptome assembly, comprising 880,080 contigs, of which 189,925 transcripts were annotated using the SwissProt/Uniprot database. Conclusions: This study provides the first detailed characterization of the spiny mouse transcriptome. It validates the application of the EvidentialGene ‘tr2aacds’ pipeline to generate a high-quality reference transcriptome assembly in a mammalian species, and provides a valuable scientific resource for further investigation into the unique physiological characteristics inherent in the genus Acomys.


2020 ◽  
Author(s):  
Marisaldi Luca ◽  
Basili Danilo ◽  
Gioacchini Giorgia ◽  
Carnevali Oliana

Abstract. Over the last two decades, many efforts have been invested in attempting to close the life cycle of the iconic Atlantic bluefin tuna (Thunnus thynnus) and develop a true aquaculture-based market. However, the limited molecular resources currently available represent a clear obstacle to the domestication of this species. To fill this knowledge gap, we assembled and characterized a de novo larval transcriptome by taking advantage of publicly available databases, with the final goal of better understanding larval development. The assembled transcriptome comprised 37,117 protein-coding transcripts, of which 13,633 were full-length (>80% coverage), with an Ex90N50 of 3,061 bp and 76% of complete and single-copy core vertebrate gene orthologues. Of these transcripts, 34,980 had a hit against the EggNOG database and 14,983 with the KAAS annotation server. By comparing our data with a set of representative fish species proteomes, it was found that 78.4% of the tuna transcripts were successfully included in orthologous groups. Codon usage bias was identified for processes such as translation, peptide biosynthesis, muscle development and ion transport, supporting the idea of mechanisms at play in regulating stability and translation efficiency of transcripts belonging to key biological processes during larval growth. The information generated by this study on the Atlantic bluefin tuna represents a relevant improvement of the transcriptomic resources available to the scientific community and lays the foundation for future work aimed at exploring in greater detail physiological responses at the molecular level in different larval stages.

