High-quality de novo assembly of the Eucommia ulmoides haploid genome provides new insights into evolution and rubber biosynthesis

2020
Vol 7 (1)
Author(s):  
Yun Li ◽  
Hairong Wei ◽  
Jun Yang ◽  
Kang Du ◽  
Jiang Li ◽  
...  

Abstract We report the acquisition of a high-quality haploid chromosome-scale genome assembly for the first time in a tree species, Eucommia ulmoides, which is known for its rubber biosynthesis and medicinal applications. The assembly was obtained by applying PacBio and Hi-C technologies to a haploid that we specifically generated. Compared to the initial genome release, this one has significantly improved assembly quality. The scaffold N50 (53.15 Mb) increased 28-fold, and the repetitive sequence content (520 Mb) increased by 158.24 Mb, whereas the number of gaps decreased from 104,772 to 128. A total of 92.87% of the 26,001 predicted protein-coding genes identified with multiple strategies were anchored to the 17 chromosomes. A new whole-genome duplication event was superimposed on the earlier γ paleohexaploidization event, and the expansion of long terminal repeats contributed greatly to the evolution of the genome. The more primitive rubber biosynthesis of this species, as opposed to that in Hevea brasiliensis, relies on the methylerythritol-phosphate (MEP) pathway rather than the mevalonate pathway to synthesize isoprenyl diphosphate, as the MEP pathway operates predominantly in trans-polyisoprene-containing leaves and central peels. Chlorogenic acid biosynthesis pathway enzymes were preferentially expressed in leaves rather than in bark. This assembly with higher sequence contiguity can foster not only studies on genome structure and evolution, gene mapping, epigenetic analysis and functional genomics but also efforts to improve E. ulmoides for industrial and medical uses through genetic engineering.

2021
Author(s):  
Victoria L Sork ◽  
Shawn Cokus ◽  
Sorel T. Fitz-Gibbon ◽  
Alexey V. Zimin ◽  
Daniela Puiu ◽  
...  

The genus Quercus, which emerged ~55 million years ago during globally warm temperatures, diversified into ~450 species. We present a high-quality de novo genome assembly of a California endemic oak, Quercus lobata, revealing features consistent with oak evolutionary success. Effective population size remained large throughout history despite declining since the early Miocene. Analysis of 39,373 mapped protein-coding genes outlined copious duplications consistent with genetic and phenotypic diversity, both by retention of genes created during the ancient γ whole genome hexaploid duplication event and by tandem duplication within families, including the numerous resistance genes and also unexpected candidate genes for an incompatibility system involving multiple non-self-recognition genes. An additional surprising finding is that subcontext-specific patterns of DNA methylation associated with transposable elements reveal broadly-distributed heterochromatin in intergenic regions, similar to grasses (another highly successful taxon). Collectively, these features promote genetic and phenotypic variation that would facilitate adaptability to changing environments.


2018
Author(s):  
Christine M. Gault ◽  
Karl A. Kremling ◽  
Edward S. Buckler

Abstract Plant genomes reduce in size following a whole genome duplication event, and one gene in a duplicate gene pair can lose function in the absence of selective pressure to maintain duplicate gene copies. Maize and its sister genus, Tripsacum, share a genome duplication event that occurred 5 to 26 million years ago. Because few genomic resources for Tripsacum exist, it is unknown whether Tripsacum grasses and maize have maintained a similar set of genes under purifying selection. Here we present high-quality de novo transcriptome assemblies for two species: Tripsacum dactyloides and Tripsacum floridanum. Genes with experimental protein evidence in maize were good candidates for genes under purifying selection in both genera because pseudogenes by definition do not produce protein. We tested whether 15,160 maize genes with protein evidence are resisting gene loss and whether their Tripsacum homologs are also resisting gene loss. Protein-encoding maize transcripts and their Tripsacum homologs have higher GC content, higher gene expression levels, and more conserved expression levels than putatively untranslated maize transcripts and their Tripsacum homologs. These results indicate that gene loss is occurring in a similar fashion in both genera after a shared ancient polyploidy event. The Tripsacum transcriptome assemblies provide a high-quality genomic resource that can provide insight into the evolution of maize, a highly valuable crop worldwide.

Core ideas: Maize genes with protein evidence have higher expression and GC content. Tripsacum homologs of maize genes exhibit the same trends as in maize. Maize proteome genes have more highly correlated gene expression with Tripsacum. Expression dominance for homeologs occurs similarly between maize and Tripsacum. A similar set of genes may be decaying into pseudogenes in maize and Tripsacum.


DNA Research
2021
Vol 28 (5)
Author(s):  
Fengqi Zang ◽  
Yan Ma ◽  
Xiaolong Tu ◽  
Ping Huang ◽  
Qichao Wu ◽  
...  

Abstract Rosa rugosa is an important shrub with economic, ecological, and pharmaceutical value. A high-quality chromosome-scale genome sequence for R. rugosa was assembled using PacBio and Hi-C technologies. The final assembled genome size was about 407.1 Mb, the contig N50 size was 2.85 Mb, and the scaffold N50 size was 56.6 Mb. More than 98% of the assembled genome sequences were anchored to seven pseudochromosomes (402.9 Mb). The genome contained 37,512 protein-coding genes, of which 37,016 (98.68%) were functionally annotated, and 206.67 Mb (50.76%) of the assembled sequences were repetitive. Phylogenetic analyses indicated that R. rugosa diverged from Rosa chinensis ∼6.6 million years ago, and no lineage-specific whole-genome duplication event occurred after divergence from R. chinensis. Chromosome synteny analysis demonstrated highly conserved synteny between R. rugosa and R. chinensis, as well as between R. rugosa and Prunus persica. Comparative genome and transcriptome analysis revealed genes related to colour, scent, and environmental adaptation. The chromosome-level reference genome provides important genomic resources for molecular-assisted breeding and horticultural comparative genomics research.


BMC Genomics
2019
Vol 20 (1)
Author(s):  
Rashmi Jain ◽  
Jerry Jenkins ◽  
Shengqiang Shu ◽  
Mawsheng Chern ◽  
Joel A. Martin ◽  
...  

Abstract Background The availability of thousands of complete rice genome sequences from diverse varieties and accessions has laid the foundation for in-depth exploration of the rice genome. One drawback to these collections is that most of these rice varieties have long life cycles, and/or low transformation efficiencies, which limits their usefulness as model organisms for functional genomics studies. In contrast, the rice variety Kitaake has a rapid life cycle (9 weeks seed to seed) and is easy to transform and propagate. For these reasons, Kitaake has emerged as a model for studies of diverse monocotyledonous species. Results Here, we report the de novo genome sequencing and analysis of Oryza sativa ssp. japonica variety KitaakeX, a Kitaake plant carrying the rice XA21 immune receptor. Our KitaakeX sequence assembly contains 377.6 Mb, consisting of 33 scaffolds (476 contigs) with a contig N50 of 1.4 Mb. Complementing the assembly are detailed gene annotations of 35,594 protein coding genes. We identified 331,335 genomic variations between KitaakeX and Nipponbare (ssp. japonica), and 2,785,991 variations between KitaakeX and Zhenshan97 (ssp. indica). We also compared Kitaake resequencing reads to the KitaakeX assembly and identified 219 small variations. The high-quality genome of the model rice plant KitaakeX will accelerate rice functional genomics. Conclusions The high quality, de novo assembly of the KitaakeX genome will serve as a useful reference genome for rice and will accelerate functional genomics studies of rice and other species.


2019
Vol 12 (1)
pp. 3635-3646
Author(s):  
Arnab Ghosh ◽  
Matthew G Johnson ◽  
Austin B Osmanski ◽  
Swarnali Louha ◽  
Natalia J Bayona-Vásquez ◽  
...  

Abstract Crocodilians are an economically, culturally, and biologically important group. To improve researchers’ ability to study genome structure, evolution, and gene regulation in the clade, we generated a high-quality de novo genome assembly of the saltwater crocodile, Crocodylus porosus, from Illumina short read data from genomic libraries and in vitro proximity-ligation libraries. The assembled genome is 2,123.5 Mb, with N50 scaffold size of 17.7 Mb and N90 scaffold size of 3.8 Mb. We then annotated this new assembly, increasing the number of annotated genes by 74%. In total, 96% of 23,242 annotated genes were associated with a functional protein domain. Furthermore, multiple noncoding functional regions and mappable genetic markers were identified. By overlapping the results of branch length estimation and site selection tests for detecting potential selection, we found 16 putative genes under positive selection in crocodilians, 10 in C. porosus and 6 in Alligator mississippiensis. The annotated C. porosus genome will serve as an important platform for osmoregulatory, physiological, and sex determination studies, as well as an important reference in investigating the phylogenetic relationships of crocodilians, birds, and other tetrapods.
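
For readers unfamiliar with the contiguity statistics quoted in these abstracts, the following is a minimal sketch of how N50 and N90 are conventionally computed from a list of scaffold lengths. The function name `nx` and the toy lengths are illustrative assumptions and are not taken from any of the studies listed here.

```python
def nx(lengths, x=50):
    """Return the Nx statistic for a collection of sequence lengths.

    Nx is the length L such that sequences of length >= L together
    cover at least x percent of the total assembly length.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding issues.
        if running * 100 >= total * x:
            return length


# Hypothetical scaffold lengths (arbitrary units), total = 100.
toy_lengths = [40, 30, 20, 5, 3, 2]
n50 = nx(toy_lengths, 50)
n90 = nx(toy_lengths, 90)
```

Because N90 requires covering more of the assembly, it can never exceed N50, which matches the ordering of the two values reported for the crocodile assembly.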


2017
Author(s):  
Alex B. Brohammer ◽  
Thomas JY. Kono ◽  
Nathan M. Springer ◽  
Suzanne E. McGaugh ◽  
Candice N. Hirsch

SUMMARY Maize is a diverse paleotetraploid species with widespread presence/absence variation and copy number variation. One mechanism through which presence/absence variation can arise is differential fractionation. Fractionation refers to the loss of duplicate gene pairs from one of the maize subgenomes during diploidization and differential fractionation refers to non-shared gene loss events between individuals. We investigated the prevalence of presence/absence variation resulting from differential fractionation in the syntenic portion of the genome using two whole genome de novo assemblies of the inbred lines B73 and PH207. Between these two genomes, syntenic genes were highly conserved with less than 1% of syntenic genes being subject to differential fractionation. The few variable syntenic genes that were identified are unlikely to contribute to functional phenotypic variation, as there is a significant depletion of these genes in annotated gene sets. In further comparisons of 60 diverse inbred lines, non-syntenic genes were six times more likely to be variable compared to syntenic genes, suggesting that comparisons among additional genome assemblies are not likely to result in the discovery of large-scale presence/absence variation among syntenic genes.

SIGNIFICANCE STATEMENT There is a large amount of presence/absence variation for gene content in maize. One mechanism that has been hypothesized to contribute to this variation is differential fractionation between individuals following the maize whole genome duplication event. Using comparative genomics, with sorghum and rice representing the ancestral state, we observed little evidence of differential fractionation among elite inbred lines and the few differentially fractionated genes identified did not appear to confer functional significance.


2019
Author(s):  
Rashmi Jain ◽  
Jerry Jenkins ◽  
Shengqiang Shu ◽  
Mawsheng Chern ◽  
Joel A. Martin ◽  
...  

Abstract Here, we report the de novo genome sequencing and analysis of Oryza sativa ssp. japonica variety KitaakeX, a Kitaake plant carrying the rice XA21 immune receptor. Our KitaakeX sequence assembly contains 377.6 Mb, consisting of 33 scaffolds (476 contigs) with a contig N50 of 1.4 Mb. Complementing the assembly are detailed gene annotations of 35,594 protein coding genes. We identified 331,335 genomic variations between KitaakeX and Nipponbare (ssp. japonica), and 2,785,991 variations between KitaakeX and Zhenshan97 (ssp. indica). We also compared Kitaake resequencing reads to the KitaakeX assembly and identified 219 small variations. The high-quality genome of the model rice plant KitaakeX will accelerate rice functional genomics.


GigaScience
2021
Vol 10 (1)
Author(s):  
Shubo Jin ◽  
Chao Bian ◽  
Sufei Jiang ◽  
Kai Han ◽  
Yiwei Xiong ◽  
...  

Abstract Background The oriental river prawn, Macrobrachium nipponense, is an economically important shrimp in China. Male prawns have higher commercial value than females because the former grow faster and reach larger sizes. It is therefore important to reveal sex-differentiation and development mechanisms of the oriental river prawn to enable genetic improvement. Results We sequenced 293.3 Gb of raw Illumina short reads and 405.7 Gb of Pacific Biosciences long reads. The final whole-genome assembly of the oriental river prawn was ∼4.5 Gb in size, with predictions of 44,086 protein-coding genes. A total of 49 chromosomes were determined, with an anchor ratio of 94.7% and a scaffold N50 of 86.8 Mb. A whole-genome duplication event was deduced to have happened 109.8 million years ago. By integration of genome and transcriptome data, 21 genes were predicted as sex-related candidate genes. Conclusion The first high-quality chromosome-level genome assembly of the oriental river prawn was obtained. These genomic data, along with transcriptome sequences, are essential for understanding sex-differentiation and development mechanisms in the oriental river prawn, as well as providing genetic resources for in-depth studies on developmental and evolutionary biology in arthropods.


Gigabyte
2021
Vol 2021
pp. 1-15
Author(s):  
Julia Voelker ◽  
Mervyn Shepherd ◽  
Ramil Mauleon

The economically important Melaleuca alternifolia (tea tree) is the source of a terpene-rich essential oil with therapeutic and cosmetic uses around the world. Tea tree has been cultivated and bred in Australia since the 1990s. It has been extensively studied for the genetics and biochemistry of terpene biosynthesis. Here, we report a high-quality de novo genome assembly using Pacific Biosciences and Illumina sequencing. The genome was assembled into 3128 scaffolds with a total length of 362 Mb (N50 = 1.9 Mb), with significantly higher contiguity than a previous assembly (N50 = 8.7 kb). Using a homology-based, RNA-seq evidence-based and ab initio prediction approach, 37,226 protein-coding genes were predicted. Genome assembly and annotation exhibited high completeness scores of 98.1% and 89.4%, respectively. Sequence contiguity was sufficient to reveal extensive gene order conservation and chromosomal rearrangements in alignments with Eucalyptus grandis and Corymbia citriodora genomes. This new genome advances currently available resources to investigate the genome structure and gene family evolution of M. alternifolia. It will enable further comparative genomic studies in Myrtaceae to elucidate the genetic foundations of economically valuable traits in this crop.


2020
Author(s):  
Zeyuan Chen ◽  
Özgül Doğan ◽  
Nadège Guiglielmoni ◽  
Anne Guichard ◽  
Michael Schrödl

Abstract Background The “Spanish” slug, Arion vulgaris Moquin-Tandon, 1855, is considered to be among the 100 worst pest species in Europe. It is common and invasive to at least northern and eastern parts of Europe, probably benefitting from climate change and the modern human lifestyle. The origin and expansion of this species, the mechanisms behind its outstanding adaptive success and its ability to outcompete other land slugs are worth exploring at the genomic level. However, a high-quality chromosome-level genome is still lacking. Findings The final assembly of A. vulgaris was obtained by combining short reads, linked reads, Nanopore long reads, and Hi-C data. The genome assembly size is 1.54 Gb with a contig N50 length of 8.6 Mb. We found a recent expansion of transposable elements (TEs), which results in repetitive sequences accounting for more than 75% of the A. vulgaris genome, the highest proportion among all known gastropod species. We identified 32,518 protein-coding genes, and 2,763 species-specific genes were functionally enriched in response to stimuli, nervous system and reproduction. With 1,237 single-copy orthologs from A. vulgaris and other related mollusks with whole-genome data available, we reconstructed the phylogenetic relationships of gastropods, estimated the divergence time of stylommatophoran land snails (Achatina) and Arion slugs at around 126 million years ago, and confirmed the whole genome duplication event shared by them. Conclusions To our knowledge, the A. vulgaris genome is the first land slug genome assembly published to date. The high-quality genomic data will provide valuable genetic resources for further phylogeographic studies of A. vulgaris origin and expansion, invasiveness, as well as molluscan aquatic-land transition and shell formation.

