Genomic insights into longan evolution from a chromosome-level genome assembly and population analysis of longan accessions

2021 ◽  
Author(s):  
Jing Wang ◽  
Jianguang Li ◽  
Zaiyuan Li ◽  
Bo Liu ◽  
Lili Zhang ◽  
...  

Longan (Dimocarpus longan) is a subtropical tree best known for its nutritious fruit and has been regarded as a precious tonic and traditional medicine since ancient times. A high-quality chromosome-scale genome assembly is valuable for functional genomic study and genetic improvement of longan. Here, we report a chromosome-level reference genome sequence for longan cultivar JDB with an assembled genome of 455.5 Mb in size anchored to fifteen chromosomes, representing a significant improvement in contiguity (contig N50 = 12.1 Mb, scaffold N50 = 29.5 Mb) over a previous draft assembly. A total of 40,420 protein-coding genes were predicted in the D. longan genome. Synteny analysis suggests that longan shares the widespread gamma event with core eudicots but has undergone no other whole-genome duplications. Comparative genomics showed that the D. longan genome experienced significant expansions of gene families related to phenylpropanoid biosynthesis and UDP-glucosyltransferase. Deep genome sequencing analysis of 87 longan accessions identified longan biogeography as a major contributing factor for genetic diversity and revealed clear population admixture and introgression among cultivars of different geographic origins, suggesting a likely migration trajectory of longan that is broadly confirmed by existing historical records. The chromosome-level reference genome assembly, annotation and population genetic resource for D. longan will facilitate molecular studies and the breeding of desirable longan cultivars in the future.

2021 ◽  
Vol 12 ◽  
Author(s):  
Jielong Zhou ◽  
Peifu Wu ◽  
Zhongping Xiong ◽  
Naiyong Liu ◽  
Ning Zhao ◽  
...  

A high-quality genome is of significant value when seeking to control forest pests such as Dendrolimus kikuchii, a destructive member of the order Lepidoptera that is widespread in China. Herein, a high-quality, chromosome-level reference genome for D. kikuchii based on Nanopore, PacBio HiFi sequencing and the Hi-C capture system is presented. Overall, a final genome assembly of 705.51 Mb with contig and scaffold N50 values of 20.89 and 24.73 Mb, respectively, was obtained. Of these contigs, 95.89% had unique locations on 29 chromosomes. In silico analysis revealed that the genome contained 15,323 protein-coding genes and 63.44% repetitive sequences. Phylogenetic analyses indicated that D. kikuchii may have diverged from the common ancestor of Thaumetopoea pityocampa, Thaumetopoea ni, Heliothis virescens, Hyphantria armigera, Spodoptera frugiperda, and Spodoptera litura approximately 122.05 million years ago. Many gene families were expanded in the D. kikuchii genome, particularly those of the Toll and IMD signaling pathways, which included 10 genes in peptidoglycan recognition protein, 19 genes in MODSP, and 11 genes in Toll. The findings from this study will help to elucidate the mechanisms that protect D. kikuchii against foreign substances and pathogens, and may highlight a potential avenue for controlling this pest.


2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Qingzhen Wei ◽  
Jinglei Wang ◽  
Wuhong Wang ◽  
Tianhua Hu ◽  
Haijiao Hu ◽  
...  

Abstract Eggplant (Solanum melongena L.) is an economically important vegetable crop in the Solanaceae family, with extensive diversity among landraces and close relatives. Here, we report a high-quality reference genome for the eggplant inbred line HQ-1315 (S. melongena-HQ) using a combination of Illumina, Nanopore and 10X genomics sequencing technologies and Hi-C technology for genome assembly. The assembled genome has a total size of ~1.17 Gb and 12 chromosomes, with a contig N50 of 5.26 Mb, and contains 36,582 protein-coding genes. Repetitive sequences comprise 70.09% (811.14 Mb) of the eggplant genome, most of which are long terminal repeat (LTR) retrotransposons (65.80%), followed by long interspersed nuclear elements (LINEs, 1.54%) and DNA transposons (0.85%). The S. melongena-HQ eggplant genome carries a total of 563 accession-specific gene families containing 1009 genes. In total, 73 expanded gene families (892 genes) and 34 contracted gene families (114 genes) were functionally annotated. Comparative analysis of different eggplant genomes identified three types of variation: single-nucleotide polymorphisms (SNPs), insertions/deletions (indels) and structural variants (SVs). Asymmetric SV accumulation was found in potential regulatory regions of protein-coding genes among the different eggplant genomes. Furthermore, we performed QTL-seq for eggplant fruit length using the S. melongena-HQ reference genome and detected a QTL interval of 71.29–78.26 Mb on chromosome E03. The gene Smechr0301963, which belongs to the SUN gene family, is predicted to be a key candidate gene for eggplant fruit length regulation. Moreover, we anchored a total of 210 linkage markers associated with 71 traits to the eggplant chromosomes and finally obtained 26 QTL hotspots. The eggplant HQ-1315 genome assembly can be accessed at http://eggplant-hq.cn.
In conclusion, the eggplant genome presented herein provides a global view of genomic divergence at the whole-genome level and powerful tools for the identification of candidate genes for important traits in eggplant.


GigaScience ◽  
2020 ◽  
Vol 9 (4) ◽  
Author(s):  
Yunfeng Li ◽  
Lei Gao ◽  
Yongjia Pan ◽  
Meilin Tian ◽  
Yulong Li ◽  
...  

Abstract Background: Jellyfish belong to the phylum Cnidaria, which occupies an important phylogenetic location in the early-branching Metazoa lineages. The jellyfish Rhopilema esculentum is an important fishery resource in China. However, the genome resource of R. esculentum has not been reported to date. Findings: In this study, we constructed a chromosome-level genome assembly of R. esculentum using Pacific Biosciences, Illumina, and Hi-C sequencing technologies. The final genome assembly was ∼275.42 Mb, with a contig N50 length of 1.13 Mb. Using Hi-C technology to identify the contacts among contigs, 260.17 Mb (94.46%) of the assembled genome were anchored onto 21 pseudochromosomes with a scaffold N50 of 12.97 Mb. We identified 17,219 protein-coding genes, with an average CDS length of 1,575 bp. The genome-wide phylogenetic analysis indicated that R. esculentum might have evolved more slowly than the other scyphozoan species used in this study. In addition, 127 toxin-like genes were identified, and 1 toxin-related “hub” was found by a genomic survey. Conclusions: We have generated a chromosome-level genome assembly of R. esculentum that could provide a valuable genomic background for studying the biology and pharmacology of jellyfish, as well as the evolutionary history of Cnidaria.


Author(s):  
Mingcheng Wang ◽  
Lei Zhang ◽  
Zhiqiang Wang

Abstract Jacaranda mimosifolia D. Don is a deciduous tree widely cultivated in the tropics and subtropics of the world. It is famous for its beautiful blue flowers and pinnate compound leaves. In addition, this tree has great potential in environmental monitoring, soil quality improvement, and medicinal applications. However, a genome resource for J. mimosifolia has not been reported to date. In this study, we constructed a chromosome-level genome assembly of J. mimosifolia using PacBio sequencing, Illumina sequencing, and Hi-C technology. The final genome assembly was ∼707.32 Mb in size, 688.76 Mb (97.36%) of which could be grouped into 18 pseudochromosomes, with contig and scaffold N50 values of 16.77 and 39.98 Mb, respectively. A total of 30,507 protein-coding genes were predicted, 95.17% of which could be functionally annotated. Phylogenetic analysis among 12 plant species confirmed the close genetic relationship between J. mimosifolia and Handroanthus impetiginosus. Gene family clustering revealed 481 unique, 103 significantly expanded, and 16 significantly contracted gene families in the J. mimosifolia genome. This chromosome-level genome assembly of J. mimosifolia will provide a valuable genomic resource for elucidating the genetic bases of the morphological characteristics, adaptive evolution, and active compound biosynthesis of J. mimosifolia.


GigaScience ◽  
2020 ◽  
Vol 9 (3) ◽  
Author(s):  
Fanming Meng ◽  
Zhuoying Liu ◽  
Han Han ◽  
Dmitrijs Finkelbergs ◽  
Yangshuai Jiang ◽  
...  

Abstract Background: Blowflies (Diptera: Calliphoridae) are the most commonly found entomological evidence in forensic investigations. Distinguished from other blowflies, Aldrichina grahami has some unique biological characteristics and is a species of forensic importance. Its development rate, pattern, and life cycle can provide valuable information for the estimation of the minimum postmortem interval. Findings: Herein we provide a chromosome-level genome assembly of A. grahami that was generated using the Pacific Biosciences sequencing platform and chromosome conformation capture (Hi-C) technology. A total of 50.15 Gb clean reads of the A. grahami genome were generated. FALCON and Wtdbg were used to construct the genome of A. grahami, resulting in an assembly of 600 Mb and 1,604 contigs with an N50 size of 1.93 Mb. We predicted 12,823 protein-coding genes, 99.8% of which were functionally annotated on the basis of the de novo genome (SRA: PRJNA513084) and transcriptome (SRA: SRX5207346) of A. grahami. Clustering and phylogenetic reconstruction of gene families were performed in a co-analysis with 11 other insect species. Using Hi-C sequencing, a chromosome-level assembly of 6 chromosomes was generated with scaffold N50 of 104.7 Mb. Of these scaffolds, 96.4% were anchored to the total A. grahami genome contig bases. Conclusions: The present study provides a robust genome reference for A. grahami that supplements vital genetic information for nonhuman forensic genomics and facilitates the future research of A. grahami and other necrophagous blowfly species used in forensic medicine.


Author(s):  
Qun-Jie Zhang ◽  
Wei Li ◽  
Kui Li ◽  
Hong Nan ◽  
Cong Shi ◽  
...  

Abstract Tea is the oldest and most popular nonalcoholic beverage consumed in the world. It provides abundant secondary metabolites that account for its diverse flavors and health benefits. Here we present the first high-quality chromosome-length reference genome of C. sinensis var. sinensis using long-read single-molecule real-time (SMRT) sequencing and Hi-C technologies to anchor the ∼2.85-Gb genome assembly into 15 pseudo-chromosomes with a scaffold N50 length of ∼195.68 Mb. We annotated at least 2.17 Gb (∼74.13%) of repetitive sequences and predicted 40,812 high-confidence protein-coding genes in the ∼2.92-Gb genome assembly. This accurately assembled genome allows us to comprehensively annotate functionally important gene families such as those involved in the biosynthesis of catechins, theanine and caffeine. The contiguous genome assembly provides the first view of the repetitive landscape, allowing us to accurately characterize retrotransposon diversity. The large tea tree genome is dominated by a handful of Ty3-gypsy long terminal repeat (LTR) retrotransposon families that recently expanded to high copy numbers. We uncover the latest bursts of numerous non-autonomous LTR retrotransposons that may interfere with the propagation of autonomous retroelements. This reference genome sequence will greatly facilitate the improvement of agronomically important traits relevant to tea quality and production.


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Xing Wang ◽  
Yi Zhang ◽  
Yufeng Zhang ◽  
Mingming Kang ◽  
Yuanbo Li ◽  
...  

Abstract Earthworms (Annelida: Crassiclitellata) are widely distributed around the world due to their ancient origin as well as adaptation and invasion after introduction into new habitats over the past few centuries. Herein, we report a 1.2 Gb complete genome assembly of the earthworm Amynthas corticis based on a strategy combining third-generation long-read sequencing and Hi-C mapping. A total of 29,256 protein-coding genes are annotated in this genome. Analysis of resequencing data indicates that this earthworm is a triploid species. Furthermore, gene family evolution analysis shows that comprehensive expansion of gene families in the Amynthas corticis genome has produced more defensive functions compared with other species in Annelida. Quantitative proteomic iTRAQ analysis shows that expression of 147 proteins changed in the body of Amynthas corticis, and 16S rDNA sequencing shows that the abundance of 28 microorganisms changed in the gut of Amynthas corticis, when the earthworm was incubated with pathogenic Escherichia coli O157:H7. Our genome assembly provides abundant and valuable resources for the earthworm research community, serving as a first step toward uncovering the mysteries of this species, and may provide molecular-level indicators of its powerful defensive functions, adaptation to complex environments and invasion ability.


2019 ◽  
Author(s):  
Ryan Bracewell ◽  
Anita Tran ◽  
Kamalakar Chatla ◽  
Doris Bachtrog

Abstract The Drosophila obscura species group is one of the most studied clades of Drosophila and harbors multiple distinct karyotypes. Here we present a de novo genome assembly and annotation of D. bifasciata, a species which represents an important subgroup for which no high-quality chromosome-level genome assembly currently exists. We combined long-read sequencing (Nanopore) and Hi-C scaffolding to achieve a highly contiguous genome assembly approximately 193 Mb in size, with repetitive elements constituting 30.1% of the total length. Drosophila bifasciata harbors four large metacentric chromosomes and the small dot, and our assembly contains each chromosome in a single scaffold, including the highly repetitive pericentromeres, which were largely composed of Jockey and Gypsy transposable elements. We annotated a total of 12,821 protein-coding genes, and comparisons of synteny with D. athabasca orthologs show that the large metacentric pericentromeric regions of multiple chromosomes are conserved between these species. Importantly, Muller A (X chromosome) was found to be metacentric in D. bifasciata, and the pericentromeric region appears homologous to the pericentromeric region of the fused Muller A-AD (XL and XR) of pseudoobscura/affinis subgroup species. Our finding suggests that a metacentric ancestral X fused to a telocentric Muller D, creating the large neo-X (Muller A-AD) chromosome ∼15 MYA. We also confirm the fusion of Muller C and D in D. bifasciata and show that it likely involved a centromere-centromere fusion.


2020 ◽  
Vol 33 (7) ◽  
pp. 880-883
Author(s):  
Stefan Kusch ◽  
Heba M. M. Ibrahim ◽  
Catherine Zanchetta ◽  
Celine Lopez-Roques ◽  
Cecile Donnadieu ◽  
...  

The fungus Myriosclerotinia sulcatula is a close relative of the notorious polyphagous plant pathogens Botrytis cinerea and Sclerotinia sclerotiorum but exhibits a host range restricted to plants from the Carex genus (Cyperaceae family). To date, there are no genomic resources available for fungi in the Myriosclerotinia genus. Here, we present a chromosome-scale reference genome assembly for M. sulcatula. The assembly contains 24 contigs with a total length of 43.53 Mbp, with scaffold N50 of 2,649.7 kbp and N90 of 1,133.1 kbp. BRAKER-predicted gene models were manually curated using WebApollo, resulting in 11,275 protein-coding genes that we functionally annotated. We provide a high-quality reference genome assembly and annotation for M. sulcatula as a resource for studying evolution and pathogenicity in fungi from the Sclerotiniaceae family.
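The contig and scaffold N50 (and N90) values quoted in the abstracts above follow the usual definition: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches 50% (or 90%) of the assembly. The sketch below illustrates that definition only; the function name `nx` and the toy lengths are hypothetical and are not taken from any of the listed studies or their pipelines.

```python
def nx(lengths, percent=50):
    """Return the Nx statistic for a list of contig or scaffold lengths.

    Nx is the length L such that sequences of length >= L together
    cover at least `percent` percent of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the boundary.
        if running * 100 >= total * percent:
            return length


# Hypothetical toy assembly (arbitrary units), total length 300.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50 = nx(toy_lengths, 50)
n90 = nx(toy_lengths, 90)
print(n50, n90)
```

A higher N50 for the same total length means that fewer, longer sequences make up the assembly, which is why the abstracts report it as a measure of contiguity.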

