Siberian sturgeon multi-tissue reference transcriptome database

Database ◽  
2020 ◽  
Vol 2020 ◽  
Author(s):  
Christophe Klopp ◽  
Cédric Cabau ◽  
Gonzalo Greif ◽  
André Lasalle ◽  
Santiago Di Landro ◽  
...  

Abstract Motivation: Siberian sturgeon is a long-lived, late-maturing fish farmed for caviar production in 50 countries. Functional genomics makes it possible to find genes of interest for fish farming. In the absence of a reference genome, a reference transcriptome is very useful for sequencing-based functional studies. Results: We present here a high-quality transcriptome assembly database built using RNA-seq reads from brain, pituitary, gonadal, liver, stomach, kidney, anterior kidney, heart, embryonic and pre-larval tissues. It will facilitate crucial research on topics such as puberty, reproduction, growth, food intake and immunology. This database represents a major contribution to the publicly available sturgeon transcriptome reference datasets. Availability: The database is publicly available at http://siberiansturgeontissuedb.sigenae.org Supplementary information: Supplementary data are available at Database online.

2021 ◽  
Author(s):  
Wenbin Guo ◽  
Max Coulter ◽  
Robbie Waugh ◽  
Runxuan Zhang

High-quality transcriptome assembly using short reads from RNA-seq data still relies heavily upon reference-based approaches, of which the primary step is to align RNA-seq reads to a single reference genome of haploid sequence. However, it is increasingly apparent that while different genotypes within a species share core genes, they also contain variable numbers of specific genes that are only present in a subset of individuals. Using a common reference may thus lead to a loss of genotype-specific information in the assembled transcript dataset and the generation of erroneous, incomplete or misleading transcriptomics analysis results. With the recent development of pan-genome information in many species, it is important that we understand the limitations of single-genotype references for transcriptomics analysis. In this study, we quantitatively evaluated the advantages of using genotype-specific reference genomes for transcriptome assembly and analysis using cultivated barley as a model. We mapped barley cultivar Barke RNA-seq reads to the Barke genome and to the cultivar Morex genome (the common barley genome reference) to construct a genotype-specific Reference Transcript Dataset (sRTD) and a common Reference Transcript Dataset (cRTD), respectively. We compared the two RTDs according to their transcript diversity, transcript sequence and structure similarity, and the accuracy they provided for transcript quantification and differential expression analysis. Our evaluation shows that the sRTD has a significantly higher diversity of transcripts and alternative splicing events. Despite using a high-quality reference genome for assembly of the cRTD, we miss ca. 40% of the transcripts present in the sRTD, and the cRTD contains only ca. 70% true assemblies. We found that the sRTD is more accurate for transcript quantification as well as for differential expression and differential alternative splicing analysis. However, gene-level quantification and comparative expression analysis are less affected by the source RTD, which indicates that analysing transcriptomic data at the gene level may be a reasonable compromise when a high-quality genotype-specific reference is not available.
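The gene-level compromise mentioned in this abstract can be illustrated with a minimal sketch. It assumes transcript abundances are already available as TPM values and that a transcript-to-gene mapping exists; the function name `gene_level_tpm` and the toy identifiers are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def gene_level_tpm(transcript_tpm, transcript_to_gene):
    """Sum transcript-level TPM values into one value per gene.

    Both arguments are plain dictionaries; the names are illustrative only.
    """
    totals = defaultdict(float)
    for transcript, value in transcript_tpm.items():
        totals[transcript_to_gene[transcript]] += value
    return dict(totals)


# Toy data: two references split gene g1 into isoforms differently,
# but the summed gene-level abundance is the same in both cases.
mapping = {"g1.t1": "g1", "g1.t2": "g1", "g2.t1": "g2"}
with_two_isoforms = gene_level_tpm({"g1.t1": 10.0, "g1.t2": 5.0, "g2.t1": 3.0}, mapping)
with_one_isoform = gene_level_tpm({"g1.t1": 15.0, "g2.t1": 3.0}, mapping)
```

In this toy case the isoform structure differs between the two inputs while the per-gene totals agree, which is one intuition for why gene-level results may be less sensitive to the choice of reference transcript dataset.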


2021 ◽  
Vol 10 (21) ◽  
Author(s):  
Jason E. Stajich ◽  
Andrea L. Vu ◽  
Howard S. Judelson ◽  
Gregory M. Vogel ◽  
Michael A. Gore ◽  
...  

The oomycete Phytophthora capsici is a destructive pathogen of a wide range of vegetable hosts, especially peppers and cucurbits. A 94.17-Mb genome assembly was constructed using PacBio and Illumina data and annotated with support from transcriptome sequencing (RNA-Seq) reads.


2019 ◽  
Author(s):  
Jing Bing ◽  
Yunhe Ling ◽  
Peipei An ◽  
Enshi Xiao ◽  
Chunlian Li ◽  
...  

Abstract Background Silverleaf sunflower, Helianthus argophyllus, is one of the most important wild species used for the improvement of cultivated sunflower. Although a reference genome is now available for the cultivated species, H. annuus, its value in helping to understand the mechanisms underlying the traits of H. argophyllus is limited by the substantial genomic variance between these two species. Results In this study, we generated a high-quality reference transcriptome of H. argophyllus using an Iso-seq strategy. This assembly contains 50,153 unique genes covering more than 91% of the whole genes. Among them, we find 205 genes that are absent in the cultivated species and 475 fusion genes containing components of coding or non-coding sequences from the genome of H. annuus. Interestingly, in line with the strong disease resistance observed for H. argophyllus, these H. argophyllus-specific genes are predominantly related to resistance functions. We have also profiled gene expression in leaf and root under normal or salt-stressed conditions and, as a result, find distinct transcriptomic responses to salt stress in leaf and root. In particular, genes involved in several critical processes, including the synthesis and metabolism of glutamate and carbohydrate transport, are reversely regulated in leaf and root. Conclusions Overall, this study provides insights into the genomic mechanisms underlying the disease resistance and salt tolerance of silverleaf sunflower, and the transcriptome assembly and the genes identified in this study can serve as a complementary data resource for future research and breeding programs of sunflowers.


2011 ◽  
Vol 29 (7) ◽  
pp. 644-652 ◽  
Author(s):  
Manfred G Grabherr ◽  
Brian J Haas ◽  
Moran Yassour ◽  
Joshua Z Levin ◽  
Dawn A Thompson ◽  
...  

2020 ◽  
Author(s):  
C. Molitor ◽  
T.J. Kurowski ◽  
P.M. Fidalgo de Almeida ◽  
P. Eerolla ◽  
D.J. Spindlow ◽  
...  

Abstract Solanum sitiens is a self-incompatible wild relative of tomato, characterised by salt and drought resistance traits, with the potential to contribute to crop improvement in cultivated tomato. This species has a distinct morphology, classification and ecotype compared to other stress-resistant wild tomato relatives such as S. pennellii and S. chilense. Therefore, the availability of a high-quality reference genome for S. sitiens will facilitate the genetic and molecular understanding of salt and drought resistance. Here, we present a de novo genome and transcriptome assembly for S. sitiens (Accession LA1974). A hybrid assembly strategy was followed using Illumina short reads (∼159X coverage) and PacBio long reads (∼44X coverage), generating a total of ∼262 Gbp of DNA sequence; in addition, ∼2,670 Gbp of BioNano data was obtained. A reference genome of 1,245 Mbp, arranged in 1,481 scaffolds with a N50 of 1,826 Mbp was generated. Genome completeness was estimated at 95% using the Benchmarking Universal Single-Copy Orthologs (BUSCO) and the K-mer Analysis Tool (KAT); this is within the range of current high-quality reference genomes for other tomato wild relatives. Additionally, we identified three large inversions compared to S. lycopersicum, containing several drought-resistance-related genes, such as beta-amylase 1 and YUCCA7. In addition, ∼63 Gbp of RNA-Seq data were generated to support the prediction of 31,164 genes from the assembly and to perform a de novo transcriptome assembly. Some of the protein clusters unique to S. sitiens were associated with genes involved in drought and salt resistance, including GLO1 and FQR1. This first reference genome for S. sitiens will provide a valuable resource to progress QTL studies to the gene level, and will assist molecular breeding to improve crop production in water-limited environments.


2016 ◽  
Author(s):  
Jared Mamrot ◽  
Roxane Legaie ◽  
Stacey J Ellery ◽  
Trevor Wilson ◽  
David K. Gardner ◽  
...  

Abstract Background: Spiny mice of the genus Acomys are small desert-dwelling rodents that display physiological characteristics not typically found in rodents. Recent investigations have reported a menstrual cycle and scar-free wound healing in this species, characteristics that are exceedingly rare in mammals and of considerable interest to the scientific community. These unique physiological traits, and the potential for spiny mice to accurately model human diseases, are driving increased use of this genus in biomedical research. However, little genetic information is currently available for Acomys, limiting the application of some modern investigative techniques. This project aimed to generate a reference transcriptome assembly for the common spiny mouse (Acomys cahirinus). Results: Illumina RNA sequencing of male and female spiny mice produced 451 million 150 bp paired-end reads from 15 organ types. An extensive survey of de novo transcriptome assembly approaches was conducted on the high-quality reads using Trinity, SOAPdenovo-Trans, and Velvet/Oases at multiple kmer lengths, with 49 single-kmer assemblies generated from this dataset, with and without in silico normalization and probabilistic error correction. Merging transcripts from the 49 individual single-kmer assemblies into a single meta-assembly of non-redundant transcripts using the EvidentialGene ‘tr2aacds’ pipeline produced the highest quality transcriptome assembly, comprising 880,080 contigs, of which 189,925 transcripts were annotated using the SwissProt/Uniprot database. Conclusions: This study provides the first detailed characterization of the spiny mouse transcriptome. It validates the application of the EvidentialGene ‘tr2aacds’ pipeline to generate a high-quality reference transcriptome assembly in a mammalian species, and provides a valuable scientific resource for further investigation into the unique physiological characteristics inherent in the genus Acomys.


2018 ◽  
Author(s):  
Jesse Kerkvliet ◽  
Arthur de Fouchier ◽  
Michiel van Wijk ◽  
Astrid T. Groot

Abstract Transcriptome quality control is an important step in RNA-seq experiments. However, the quality of de novo assembled transcriptomes is difficult to assess, due to the lack of a reference genome to compare the assembly to. We developed a method to assess and improve the quality of de novo assembled transcriptomes by focusing on the removal of chimeric sequences. These chimeric sequences can be the result of faulty assembled contigs, in which two transcripts are merged into one. The developed method is incorporated into a pipeline that we named Bellerophon, which is broadly applicable and easy to use. Bellerophon first uses the quality-assessment tool TransRate to indicate the quality, after which it uses a Transcripts Per Million (TPM) filter to remove lowly expressed contigs and CD-HIT-EST to remove highly identical contigs. To validate the quality of this method, we performed three benchmark experiments: 1) a computational creation of chimeras, 2) identification of chimeric contigs in a transcriptome assembly, 3) a simulated RNA-seq experiment using a known reference transcriptome. Overall, the Bellerophon pipeline was able to remove between 40 and 91.9% of the chimeras in transcriptome assemblies and removed more chimeric than non-chimeric contigs. Thus, the Bellerophon sequence of filtration steps is a broadly applicable solution to improve transcriptome assemblies.
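The TPM filtering step described in this abstract can be sketched as follows. This is not Bellerophon's code: the function names, the input dictionaries and the `min_tpm=1.0` threshold are assumptions made for illustration, and the abstract does not state which cutoff the pipeline uses. Only the standard TPM definition is relied on (read count divided by contig length, rescaled so that all values sum to one million).

```python
def compute_tpm(read_counts, contig_lengths):
    """Return Transcripts Per Million for each contig.

    read_counts and contig_lengths are dictionaries keyed by contig name.
    """
    rates = {name: read_counts[name] / contig_lengths[name] for name in read_counts}
    total_rate = sum(rates.values())
    if total_rate == 0:
        return {name: 0.0 for name in rates}
    return {name: rate / total_rate * 1e6 for name, rate in rates.items()}


def keep_expressed(read_counts, contig_lengths, min_tpm=1.0):
    """Keep contigs whose TPM is at least min_tpm (hypothetical threshold)."""
    tpm = compute_tpm(read_counts, contig_lengths)
    return [name for name, value in tpm.items() if value >= min_tpm]


# Toy example: contig "c" has no reads and is filtered out.
counts = {"a": 1000, "b": 1000, "c": 0}
lengths = {"a": 1000, "b": 2000, "c": 500}
tpm_values = compute_tpm(counts, lengths)
kept = keep_expressed(counts, lengths)
```

Contigs "a" and "b" receive the same number of reads, but "a" is half as long, so its TPM is twice as high. Length normalisation of this kind is the reason a TPM cutoff is usually preferred over a raw read-count cutoff.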


GigaScience ◽  
2019 ◽  
Vol 8 (9) ◽  
Author(s):  
Yongxin Li ◽  
Yandong Ren ◽  
Dongru Zhang ◽  
Hui Jiang ◽  
Zhongkai Wang ◽  
...  

Abstract Background The mustache toad, Vibrissaphora ailaonica, is endemic to China and belongs to the Megophryidae family. Like other mustache toad species, V. ailaonica males temporarily develop keratinized nuptial spines on their upper jaw during each breeding season, which fall off at the end of the breeding season. This feature is likely the result of the reversal of sexual dimorphism in body size, with males being larger than females. A high-quality reference genome for the mustache toad would be invaluable to investigate the genetic mechanism underlying these repeatedly developing keratinized spines. Findings To construct the mustache toad genome, we generated 225 Gb of short reads and 277 Gb of long reads using Illumina and Pacific Biosciences (PacBio) sequencing technologies, respectively. Sequencing data were assembled into a 3.53-Gb genome assembly, with a contig N50 length of 821 kb. We also used high-throughput chromosome conformation capture (Hi-C) technology to identify contacts between contigs, then assembled contigs into scaffolds and assembled a genome with 13 chromosomes and a scaffold N50 length of 412.42 Mb. Based on the 26,227 protein-coding genes annotated in the genome, we analyzed phylogenetic relationships between the mustache toad and other chordate species. The mustache toad has a relatively high evolutionary rate and separated from a common ancestor of the marine toad, bullfrog, and Tibetan frog 206.1 million years ago. Furthermore, we identified 201 expanded gene families in the mustache toad, which were mainly enriched in immune pathway, keratin filament, and metabolic processes. Conclusions Using Illumina, PacBio, and Hi-C technologies, we constructed the first high-quality chromosome-level mustache toad genome. This work not only offers a valuable reference genome for functional studies of mustache toad traits but also provides important chromosomal information for wider genome comparisons.
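Contig and scaffold N50 values, as quoted in this abstract, follow a standard definition: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The sketch below is a minimal illustration of that definition; the function name `n50` and the toy lengths are hypothetical and unrelated to the mustache toad data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy assembly of 300 units in total: the two longest contigs (100 + 80)
# already cover at least half of it, so the N50 is 80.
toy_lengths = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_lengths)
```

Joining contigs into longer scaffolds, for example with Hi-C contacts, reduces the number of sequences needed to reach half of the assembly, which is why a scaffold N50 can be far larger than the contig N50 of the same assembly.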


2021 ◽  
Author(s):  
Guangcai Liang ◽  
Jia Chang ◽  
Tung On Yau ◽  
Xin Li ◽  
Bingjun He ◽  
...  

In the present study, we performed precise annotation of the Drosophila melanogaster, D. simulans, D. grimshawi and Bactrocera oleae mitochondrial (mt) genomes by pan RNA-seq analysis. Our new annotations corrected or modified some of the previous annotations, and two important findings were reported for the first time: the discovery of conserved polyA(+) and polyA(-) motifs in the control regions (CRs) of insect mt genomes, and the addition of CCAs to the 3' ends of two antisense tRNAs in the D. melanogaster mt genome. Using PacBio cDNA-seq data from D. simulans, we precisely annotated the Transcription Initiation Sites (TISs) of the mt Heavy and Light strands in Drosophila mt genomes and reported that the polyA(+) and polyA(-) motifs in the CRs are associated with TISs. The discovery of the conserved polyA(+) and polyA(-) motifs provides insights into the many polyA and polyT sequences in the CRs of insect mt genomes, helping to reveal mt transcription and its regulation in invertebrates. In addition, we provided a high-quality, well-curated and precisely annotated D. simulans mt genome (GenBank: MN611461), which should be included in the NCBI RefSeq database to replace the current reference genome NC_005781.

