Reannotation of the cultivated strawberry genome and establishment of a strawberry genome database

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Tianjia Liu ◽  
Muzi Li ◽  
Zhongchi Liu ◽  
Xiaoyan Ai ◽  
Yongping Li

Cultivated strawberry (Fragaria × ananassa) is an important fruit crop species whose fruits are enjoyed by many worldwide. An octoploid of hybrid origin, the complex genome of this species was recently sequenced, serving as a key reference genome for cultivated strawberry and related species of the Rosaceae family. The current annotation of the F. ananassa genome mainly relies on ab initio predictions and, to a lesser extent, transcriptome data. Here, we present the structural and functional reannotation of the F. ananassa genome based on one PacBio full-length RNA library and ninety-two Illumina RNA-Seq libraries. This improved annotation of the F. ananassa genome, v1.0.a2, comprises a total of 108,447 gene models, with 97.85% complete BUSCOs. The models of 19,174 genes were modified, 360 new genes were identified, and 11,044 genes were found to have alternatively spliced isoforms. Additionally, we constructed a strawberry genome database (SGD) for strawberry gene homolog searching and annotation downloading. Finally, the transcriptomes of the receptacles and achenes of F. ananassa at four developmental stages were reanalyzed and qualified, and the expression profiles of all the genes in this annotation are also provided. Together, this study provides an updated annotation of the F. ananassa genome, which will facilitate genomic analyses across the Rosaceae family and gene functional studies in cultivated strawberry.

Forests ◽  
2021 ◽  
Vol 12 (3) ◽  
pp. 315
Author(s):  
Hailin Liu ◽  
Xin Han ◽  
Jue Ruan ◽  
Lian Xu ◽  
Bing He

The final size of plant leaves is strictly controlled by environmental and genetic factors, which coordinate cell expansion and cell cycle activity in space and time; however, the regulatory mechanisms of leaf growth are still poorly understood. Ginkgo biloba is a dioecious species native to China with medicinally and phylogenetically important characteristics, and its fan-shaped leaves are unique among gymnosperms, yet the mechanism of G. biloba leaf development remains unclear. In this study, we analyzed the transcriptome of G. biloba leaves at three developmental stages using high-throughput RNA-seq technology. Approximately 4167 differentially expressed genes (DEGs) were obtained, the structures of 12,137 genes were optimized, and 732 new genes were identified. More than 50 growth-related factors and gene modules were identified based on DEG analysis and Weighted Gene Co-expression Network Analysis. These results could substantially expand the existing transcriptome resources of G. biloba and provide a reference for subsequent analysis of ginkgo leaf development.


eLife ◽  
2017 ◽  
Vol 6 ◽  
Author(s):  
Richard J White ◽  
John E Collins ◽  
Ian M Sealy ◽  
Neha Wali ◽  
Christopher M Dooley ◽  
...  

We have produced an mRNA expression time course of zebrafish development across 18 time points from 1 cell to 5 days post-fertilisation, sampling individual embryos and pools of embryos. Using poly(A) pulldown stranded RNA-seq and a 3′ end transcript counting method, we characterise temporal expression profiles of 23,642 genes. We identify temporal and functional transcript co-variance that associates 5024 unnamed genes with distinct developmental time points. Specifically, a class of over 100 previously uncharacterised zinc finger domain containing genes, located on the long arm of chromosome 4, is expressed in a sharp peak during zygotic genome activation. In addition, the data reveal new genes and transcripts, differential use of exons and previously unidentified 3′ ends across development, new primary microRNAs and temporal divergence of gene paralogues generated in the teleost genome duplication. To make this dataset a useful baseline reference, the data can be browsed and downloaded at Expression Atlas and Ensembl.


Author(s):  
Noriyuki Satoh ◽  
Hitoshi Tominaga ◽  
Masato Kiyomoto ◽  
Kanako Hisata ◽  
Jun Inoue ◽  
...  

Among chordate taxa, the cephalochordates diverged earlier than urochordates and vertebrates; thus, they retain unique, primitive developmental features. In particular, the amphioxus notochord has muscle-like properties, a feature not seen in urochordates or vertebrates. Amphioxus contains two Brachyury genes, Bra1 and Bra2. Bra2 is reportedly expressed in the blastopore, notochord, somites, and tail bud, in contrast to a low level of Bra1 expression only in notochord. To distinguish the expression profiles of the two Brachyury genes at the single-cell level, we carried out single-cell RNA-seq (scRNA-seq) analysis using the amphioxus, Branchiostoma japonicum. This scRNA-seq analysis classified B. japonicum embryonic cells into 15 clusters at developmental stages from midgastrula to early swimming larva. Brachyury was expressed in cells of clusters 4, 5, 8, and 9. We first confirmed that cluster 8 comprises cells that form somites since this cluster specifically expresses four myogenic factor genes. Cluster 9 contains a larger number of cells with high levels of Bra2 expression and a smaller number of cells with Bra1 expression. Simultaneous expression in cluster 9 of tool-kit genes, including FoxA, Goosecoid, and hedgehog, showed that this cluster comprises cells that form the notochord. Expression of Bra2, but not Bra1, in cells of clusters 4 and 5 at the gastrula stage together with expression of Wnt1 and Caudal indicates that clusters 4 and 5 comprise cells of the blastopore, which contiguously form the tail bud. In addition, Hox1, Hox3, and Hox4 were highly expressed in Bra2-expressing clusters 4, 5, 8, and 9 in a temporally coordinated manner, suggesting roles of anterior Hox genes in specification of mesodermal organs, including somites, notochord, and tail bud. This scRNA-seq analysis therefore highlights differences between the two Brachyury genes in relation to embryonic regions in which they are expressed and their levels of expression. 
Bra2 is the ancestral Brachyury in amphioxus, since expression in the blastopore is shared with other deuterostomes. On the other hand, Bra1 is a duplicate copy and likely evolved a supplementary function in notochord and somite formation in the Branchiostoma lineage.


2017 ◽  
Vol 69 (1) ◽  
pp. 181-190 ◽  
Author(s):  
Yong Peng ◽  
Huiqin Ma ◽  
Shangwu Chen

Lycium ruthenicum Murr., which belongs to the family Solanaceae, is a resource plant for Chinese traditional medicine and nutraceutical foods. In this study, RNA sequencing was applied to obtain raw reads of L. ruthenicum fruit at different stages of ripening, and a de novo assembly of its sequence was performed. Approximately 52.45 million 100-bp paired-end raw reads were generated from the samples by deep RNA-seq analysis. These short reads were assembled to obtain 164,814 contigs, and the contigs were assembled into 84,968 non-redundant unigenes using the Trinity method. Assembled sequences were annotated with gene descriptions, gene ontology, clusters of orthologous groups and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway terms. Digital gene expression analysis was applied to compare gene-expression patterns at different fruit developmental stages. These results contribute to existing sequence resources for Lycium spp. during the fruit-ripening stages, which is valuable for further functional studies of genes involved in L. ruthenicum fruit nutraceutical quality.


2019 ◽  
Author(s):  
Michal Levin ◽  
Harel Zalts ◽  
Natalia Mostov ◽  
Tamar Hashimshony ◽  
Itai Yanai

Alternative polyadenylation (APA) leads to multiple transcripts from the same gene, yet their distinct functional attributes remain largely unknown. Here, we introduce APA-seq to detect the expression levels of APA isoforms from 3’-end RNA-Seq data by exploiting both paired-end reads for gene isoform identification and quantification. Applying APA-seq, we detected the expression levels of APA isoforms from RNA-Seq data of single C. elegans embryos, and studied the patterns of 3’ UTR isoform expression throughout embryogenesis. We found that global changes in APA usage demarcate developmental stages, suggesting a requirement for distinct 3’ UTR isoforms throughout embryogenesis. We distinguished two classes of genes, depending upon the correlation between the temporal profiles of their isoforms: those with highly correlated isoforms (HCI) and those with lowly correlated isoforms (LCI) across time. This led us to hypothesize that variants produced with similar expression profiles may be the product of biological noise, while the LCI variants may be under tighter selection and consequently their distinct 3’ UTR isoforms are more likely to have functional consequences. Supporting this notion, we found that LCI genes have significantly more miRNA binding sites, more correlated expression profiles with those of their targeting miRNAs and a relative lack of correspondence between their transcription and protein abundances. Collectively, our results suggest that a lack of coherence among the regulation of 3’ UTR isoforms is a proxy for selective pressures acting upon APA usage and consequently for their functional relevance.


2020 ◽  
Author(s):  
Maria G. Ivanchenko ◽  
Olivia R. Ozguc ◽  
Stephanie R. Bollmann ◽  
Valerie N. Fraser ◽  
Molly Megraw

Cyclophilin A/DIAGEOTROPICA (DGT) has been linked to auxin-regulated development in tomato and appears to affect multiple developmental pathways. Loss of DGT function results in a pleiotropic phenotype that is strongest in the roots, including shortened roots with no lateral branching. Here, we present an RNA-Seq dataset comparing the gene expression profiles of wildtype (‘Ailsa Craig’) and dgt tissues from three spatially separated developmental stages of the tomato root tip, with three replicates for each tissue and genotype. We also identify differentially expressed genes, provide an initial comparison of genes affected in each genotype and tissue, and provide the pipeline used to analyze the data. Further analysis of this dataset can be used to gain insight into the effects of DGT on various root developmental pathways in tomato.


Genes ◽  
2019 ◽  
Vol 11 (1) ◽  
pp. 1 ◽  
Author(s):  
Yuxuan Fan ◽  
Wei Yang ◽  
Qingxia Yan ◽  
Chunrui Chen ◽  
Jinhua Li

The protease inhibitors (PIs) in plants are involved primarily in defense against pathogens and pests and in response to abiotic stresses. However, information about the PI gene families in tomato (Solanum lycopersicum), one of the most important model plants for crop species, is limited. In this study, in silico analysis identified 55 PI genes, and their conserved domains, phylogenetic relationships, and chromosome locations were characterized. According to genetic structure and evolutionary relationships, the PI genes were divided into seven families. Genome-wide microarray transcription analysis indicated that the expression of SlPI genes can be induced by abiotic (heat, drought, and salt) and biotic (Botrytis cinerea and tomato spotted wilt virus (TSWV)) stresses. In addition, expression analysis using RNA-seq in various tissues and developmental stages revealed that some SlPI genes were highly or preferentially expressed, showing tissue- and developmental stage-specific expression profiles. The expression of four representative SlPI genes in response to abscisic acid (ABA), salicylic acid (SA), ethylene (Eth), gibberellic acid (GA), and methyl viologen (MV) was determined. Our findings indicated that PI genes may mediate the response of tomato plants to environmental stresses to balance hormone signals. The data obtained here will improve the understanding of the potential function of PI genes and lay a foundation for tomato breeding and transgenic resistance to stresses.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Junjie Shao ◽  
Liqiang Wang ◽  
Xinyue Liu ◽  
Meng Yang ◽  
Haimei Chen ◽  
...  

Circular RNAs (circRNAs) play important roles in animals, plants, and fungi. However, no circRNAs have been reported in Ganoderma lucidum. Here, we carried out a genome-wide identification of the circRNAs in G. lucidum using RNA-Seq data, and analyzed their features. In total, 250 and 2193 circRNAs were identified from strand-specific RNA-seq data generated from the polyA(−) and polyA(−)/RNase R-treated libraries, respectively. Six of 131 (4.58%) predicted circRNAs were experimentally confirmed. Across three developmental stages, 731 exonic circRNAs (back-spliced read counts ≥ 5) and their parent genes were further analyzed. CircRNAs preferentially originated from exons with flanking introns, and the flanking introns were longer than the control introns. A total of 200 circRNAs were differentially expressed across the three developmental stages of G. lucidum. The expression profiles of 119 (16.3%) exonic circRNAs and their parent genes showed significant positive correlations (r ≥ 0.9, q < 0.01), whereas 226 (30.9%) exonic circRNAs and their parent genes exhibited significant negative correlations (r ≤ −0.9, q < 0.01); of the latter, 53 parent genes are potentially involved in transcriptional regulation, polysaccharide biosynthesis, and other processes. Our results indicated that circRNAs are present in G. lucidum, with potentially important regulatory roles.


2018 ◽  
Vol 19 (10) ◽  
pp. 3071 ◽  
Author(s):  
Li Wang ◽  
Chengjiang Ruan ◽  
Lingyue Liu ◽  
Wei Du ◽  
Aomin Bao

Yellow horn (Xanthoceras sorbifolium Bunge) is an endemic oil-rich shrub that has been widely cultivated in northern China for bioactive oil production. However, little is known regarding the molecular mechanisms that contribute to oil content in yellow horn. Herein, we measured the oil contents of high- and low-oil yellow horn embryo tissues at four developmental stages and investigated the global gene expression profiles through RNA-seq. We found that at 40, 54, 68, and 81 days after anthesis, a total of 762, 664, 599, and 124 genes, respectively, were significantly differentially expressed between the high- and low-oil lines. Gene ontology (GO) enrichment analysis revealed some critical GO terms related to oil accumulation, including acyl-[acyl-carrier-protein] desaturase activity, pyruvate kinase activity, acetyl-CoA carboxylase activity, and seed oil body biogenesis. The identified differentially expressed genes also included several transcription factors, such as AP2-EREBP family members, B3 domain proteins and C2C2-Dof proteins. Several genes involved in fatty acid (FA) biosynthesis, glycolysis/gluconeogenesis, and pyruvate metabolism were also up-regulated in the high-oil line at different developmental stages. Our findings indicate that the higher oil accumulation in high-oil yellow horn could be mostly driven by increased FA biosynthesis and carbon supply, i.e. a source effect.


2018 ◽  
Author(s):  
Stephen J. Bush ◽  
Charity Muriuki ◽  
Mary E. B. McCulloch ◽  
Iseabail L. Farquhar ◽  
Emily L. Clark ◽  
...  

mRNA-like long non-coding RNAs (lncRNA) are a significant component of mammalian transcriptomes, although most are expressed only at low levels, with high tissue-specificity and/or at specific developmental stages. In many cases, therefore, lncRNA detection by RNA-sequencing (RNA-seq) is compromised by stochastic sampling. To account for this and create a catalogue of ruminant lncRNA, we compared de novo assembled lncRNA derived from large RNA-seq datasets in transcriptional atlas projects for sheep and goats with previous lncRNA assembled in cattle and human. Few lncRNA could be reproducibly assembled from a single dataset, even with deep sequencing of the same tissues from multiple animals. Furthermore, there was little sequence overlap between lncRNA assembled from pooled RNA-seq data. We combined positional conservation (synteny) with cross-species mapping of candidate lncRNA to identify a consensus set of ruminant lncRNA and then used the RNA-seq data to demonstrate detectable and reproducible expression in each species. The majority of lncRNA were encoded by single exons, and expressed at < 1 TPM. In sheep, 20-30% of lncRNA had expression profiles significantly correlated with neighbouring protein-coding genes, suggesting association with enhancers. Alongside substantially expanding the ruminant lncRNA repertoire, the outcomes of our analysis demonstrate that stochastic sampling can be partly overcome by combining RNA-seq datasets from related species. This has practical implications for the future discovery of lncRNA in other species.

