A high-quality chromosome-level genome assembly reveals genetics for important traits in eggplant

2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Qingzhen Wei ◽  
Jinglei Wang ◽  
Wuhong Wang ◽  
Tianhua Hu ◽  
Haijiao Hu ◽  
...  

Abstract Eggplant (Solanum melongena L.) is an economically important vegetable crop in the Solanaceae family, with extensive diversity among landraces and close relatives. Here, we report a high-quality reference genome for the eggplant inbred line HQ-1315 (S. melongena-HQ) using a combination of Illumina, Nanopore and 10X Genomics sequencing technologies and Hi-C technology for genome assembly. The assembled genome has a total size of ~1.17 Gb and 12 chromosomes, with a contig N50 of 5.26 Mb, consisting of 36,582 protein-coding genes. Repetitive sequences comprise 70.09% (811.14 Mb) of the eggplant genome, most of which are long terminal repeat (LTR) retrotransposons (65.80%), followed by long interspersed nuclear elements (LINEs, 1.54%) and DNA transposons (0.85%). The S. melongena-HQ eggplant genome carries a total of 563 accession-specific gene families containing 1009 genes. In total, 73 expanded gene families (892 genes) and 34 contracted gene families (114 genes) were functionally annotated. Comparative analysis of different eggplant genomes identified three types of variations, including single-nucleotide polymorphisms (SNPs), insertions/deletions (indels) and structural variants (SVs). Asymmetric SV accumulation was found in potential regulatory regions of protein-coding genes among the different eggplant genomes. Furthermore, we performed QTL-seq for eggplant fruit length using the S. melongena-HQ reference genome and detected a QTL interval of 71.29–78.26 Mb on chromosome E03. The gene Smechr0301963, which belongs to the SUN gene family, is predicted to be a key candidate gene for eggplant fruit length regulation. Moreover, we anchored a total of 210 linkage markers associated with 71 traits to the eggplant chromosomes and finally obtained 26 QTL hotspots. The eggplant HQ-1315 genome assembly can be accessed at http://eggplant-hq.cn. In conclusion, the eggplant genome presented herein provides a global view of genomic divergence at the whole-genome level and powerful tools for the identification of candidate genes for important traits in eggplant.

2020 ◽  
Vol 33 (7) ◽  
pp. 880-883
Author(s):  
Stefan Kusch ◽  
Heba M. M. Ibrahim ◽  
Catherine Zanchetta ◽  
Celine Lopez-Roques ◽  
Cecile Donnadieu ◽  
...  

The fungus Myriosclerotinia sulcatula is a close relative of the notorious polyphagous plant pathogens Botrytis cinerea and Sclerotinia sclerotiorum but exhibits a host range restricted to plants from the Carex genus (Cyperaceae family). To date, there are no genomic resources available for fungi in the Myriosclerotinia genus. Here, we present a chromosome-scale reference genome assembly for M. sulcatula. The assembly contains 24 contigs with a total length of 43.53 Mbp, with scaffold N50 of 2,649.7 kbp and N90 of 1,133.1 kbp. BRAKER-predicted gene models were manually curated using WebApollo, resulting in 11,275 protein-coding genes that we functionally annotated. We provide a high-quality reference genome assembly and annotation for M. sulcatula as a resource for studying evolution and pathogenicity in fungi from the Sclerotiniaceae family.
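The N50 and N90 figures quoted in these abstracts are length-weighted contiguity statistics. As a minimal sketch of how such a value is commonly computed (the function name `nx` and the toy lengths are illustrative assumptions, not taken from any of the listed papers or their pipelines):

```python
def nx(lengths, percent):
    """Return the Nx statistic for a list of sequence lengths.

    Nx is the length L such that sequences of length >= L together
    cover at least `percent` percent of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length


# Toy contig lengths (hypothetical, in arbitrary units).
toy_lengths = [8, 6, 4, 2]
n50 = nx(toy_lengths, 50)
n90 = nx(toy_lengths, 90)
```

With the toy lengths above, half of the total (10 of 20) is reached at the second-longest sequence, so N50 is 6; N90 is smaller because more, shorter sequences are needed to cover 90% of the total.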


2021 ◽  
Vol 12 ◽  
Author(s):  
Jielong Zhou ◽  
Peifu Wu ◽  
Zhongping Xiong ◽  
Naiyong Liu ◽  
Ning Zhao ◽  
...  

A high-quality genome is of significant value when seeking to control forest pests such as Dendrolimus kikuchii, a destructive member of the order Lepidoptera that is widespread in China. Herein, a high-quality, chromosome-level reference genome for D. kikuchii based on Nanopore, PacBio HiFi sequencing and the Hi-C capture system is presented. Overall, a final genome assembly of 705.51 Mb with contig and scaffold N50 values of 20.89 and 24.73 Mb, respectively, was obtained. Of these contigs, 95.89% had unique locations on 29 chromosomes. In silico analysis revealed that the genome contained 15,323 protein-coding genes and 63.44% repetitive sequences. Phylogenetic analyses indicated that D. kikuchii may have diverged from the common ancestor of Thaumetopoea pityocampa, Thaumetopoea ni, Heliothis virescens, Hyphantria armigera, Spodoptera frugiperda, and Spodoptera litura approximately 122.05 million years ago. Many gene families were expanded in the D. kikuchii genome, particularly those of the Toll and IMD signaling pathway, which included 10 genes in peptidoglycan recognition protein, 19 genes in MODSP, and 11 genes in Toll. The findings from this study will help to elucidate the mechanisms involved in protection of D. kikuchii against foreign substances and pathogens, and may highlight a potential channel to control this pest.


Author(s):  
Qiang Yan ◽  
Qiong Wang ◽  
Cheng Xuzhen ◽  
Lixia Wang ◽  
Prakit Somta ◽  
...  

Mungbean (Vigna radiata [L.]) is an important economic crop grown in South and East Asia. The low contiguity of the current assembly of the V. radiata genome has limited its application. Here, we report a high-quality chromosome-scale assembled genome of V. radiata to facilitate the investigation of its genome characteristics and evolution. By combining Nanopore long reads, Illumina short reads and Hi-C data, we generated a high-quality genome assembly of V. radiata, with 473.67 megabases assembled into 11 chromosomes with contig N50 and scaffold N50 of 11.3 and 42.4 megabases, respectively. A total of 52.8% of the genome was annotated as repetitive sequences, among which LTRs (long terminal repeats) were predominant (33.9%). The genome of V. radiata was predicted to contain 33,924 genes, 32,470 (95.7%) of which could be functionally annotated. Evolutionary analysis revealed an estimated divergence time of V. radiata from its close relative V. angularis of ~11.66 million years ago. In addition, 277 V. radiata-specific gene families and 18 positively selected genes were detected and functionally annotated. This high-quality mungbean genome will provide valuable resources for further genetic analysis and crop improvement of mungbean and other legume species.


2021 ◽  
Author(s):  
Nicholas C Carleson ◽  
Caroline M Press ◽  
Niklaus J Grunwald

Phytophthora ramorum is the causal agent of sudden oak death in West Coast forests and currently two clonal lineages, NA1 and EU1, cause epidemics in Oregon forests. Here, we report on two high-quality genomes of individuals belonging to the NA1 and EU1 clonal lineages respectively, using PacBio long-read sequencing. The NA1 strain Pr102, originally isolated from coast live oak in California, is the current reference genome and was previously sequenced independently using either Sanger (P. ramorum v1) or PacBio (P. ramorum v2) technology. The EU1 strain PR-15-019 was obtained from tanoak in Oregon. These new genomes have a total size of 57.5 Mb, with a contig N50 length of ~3.5-3.6 Mb and encode ~15,300 predicted protein-coding genes. Genomes were assembled into 27 and 28 scaffolds with 95% BUSCO scores and are considerably improved relative to the current JGI reference genome with 2,575 or the PacBio genomes with 1,512 scaffolds. These high-quality genomes provide a valuable resource for studying the genetics, evolution, and adaptation of these two clonal lineages.


Author(s):  
Qun-Jie Zhang ◽  
Wei Li ◽  
Kui Li ◽  
Hong Nan ◽  
Cong Shi ◽  
...  

Abstract Tea is the oldest and most popular nonalcoholic beverage consumed in the world. It provides abundant secondary metabolites that account for its diverse flavors and health benefits. Here we present the first high-quality chromosome-length reference genome of C. sinensis var. sinensis using long-read single-molecule real-time (SMRT) sequencing and Hi-C technologies to anchor the ∼2.85-Gb genome assembly into 15 pseudo-chromosomes with a scaffold N50 length of ∼195.68 Mb. We annotated at least 2.17 Gb (∼74.13%) of repetitive sequences and high-confidence prediction of 40,812 protein-coding genes in the ∼2.92-Gb genome assembly. This accurately assembled genome allows us to comprehensively annotate functionally important gene families such as those involved in the biosynthesis of catechins, theanine and caffeine. The contiguous genome assembly provides the first view of the repetitive landscape, allowing us to accurately characterize retrotransposon diversity. The large tea tree genome is dominated by a handful of Ty3-gypsy long terminal repeat (LTR) retrotransposon families that recently expanded to high copy numbers. We uncover the latest bursts of numerous non-autonomous LTR retrotransposons that may interfere with the propagation of autonomous retroelements. This reference genome sequence will greatly facilitate the improvement of agronomically important traits relevant to tea quality and production.


2019 ◽  
Author(s):  
Qiuju Xia ◽  
Ru Zhang ◽  
Xuemei Ni ◽  
Lei Pan ◽  
Yangzi Wang ◽  
...  

Abstract Asparagus bean (Vigna unguiculata ssp. sesquipedalis), known for its very long and tender green pods, is an important vegetable crop broadly grown in developing countries. Despite its agricultural and economic value, asparagus bean does not have a high-quality genome assembly for breeding novel agronomic traits. In this study, we report a high-quality 632.8 Mb assembly of asparagus bean based on the whole genome shotgun sequencing strategy. We also generated a high-density linkage map for asparagus bean, which helped anchor 94.42% of the scaffolds into 11 pseudo-chromosomes. A total of 42,609 protein-coding genes and 3,579 non-protein-coding genes were predicted from the assembly. Taken together, these genomic resources of asparagus bean will facilitate the investigation of economically valuable traits in a variety of legume species, so that the cultivation of these plants can help combat protein and energy malnutrition in the developing world.


Author(s):  
Alaina Shumate ◽  
Aleksey V. Zimin ◽  
Rachel M. Sherman ◽  
Daniela Puiu ◽  
Justin M. Wagner ◽  
...  

Abstract Here we describe the assembly and annotation of the genome of an Ashkenazi individual and the creation of a new, population-specific human reference genome. This genome is more contiguous and more complete than GRCh38, the latest version of the human reference genome, and is annotated with highly similar gene content. The Ashkenazi reference genome, Ash1, contains 2,973,118,650 nucleotides as compared to 2,937,639,212 in GRCh38. Annotation identified 20,157 protein-coding genes, of which 19,563 are >99% identical to their counterparts on GRCh38. Most of the remaining genes have small differences. 40 of the protein-coding genes in GRCh38 are missing from Ash1; however, all of these genes are members of multi-gene families for which Ash1 contains other copies. 11 genes appear on different chromosomes from their homologs in GRCh38. Alignment of DNA sequences from an unrelated Ashkenazi individual to Ash1 identified ~1 million fewer homozygous SNPs than alignment of those same sequences to the more-distant GRCh38 genome, illustrating one of the benefits of population-specific reference genomes.


GigaScience ◽  
2019 ◽  
Vol 8 (8) ◽  
Author(s):  
Lu Wang ◽  
Jinwei Wu ◽  
Xiaomei Liu ◽  
Dandan Di ◽  
Yuhong Liang ◽  
...  

Abstract Background The golden snub-nosed monkey (Rhinopithecus roxellana) is an endangered colobine species endemic to China, which has several distinct traits including a unique social structure. Although a genome assembly for R. roxellana is available, it is incomplete and fragmented because it was constructed using short-read sequencing technology. Thus, important information such as genome structural variation and repeat sequences may be absent. Findings To obtain a high-quality chromosomal assembly for R. roxellana qinlingensis, we used 5 methods: Pacific Biosciences single-molecule real-time sequencing, Illumina paired-end sequencing, BioNano optical maps, 10X Genomics linked-reads, and high-throughput chromosome conformation capture. The assembled genome was ∼3.04 Gb, with a contig N50 of 5.72 Mb and a scaffold N50 of 144.56 Mb. This represented a 100-fold improvement over the previously published genome. In the new genome, 22,497 protein-coding genes were predicted, of which 22,053 were functionally annotated. Gene family analysis showed that 993 and 2,745 gene families were expanded and contracted, respectively. The reconstructed phylogeny recovered a close relationship between R. roxellana and Macaca mulatta, and these 2 species diverged ∼13.4 million years ago. Conclusion We constructed a high-quality genome assembly of the Qinling golden snub-nosed monkey; it had superior continuity and accuracy, which might be useful for future genetic studies in this species and as a new standard reference genome for colobine primates. In addition, the updated genome assembly might improve our understanding of this species and could assist conservation efforts.


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Haolin Wu ◽  
Tao Ma ◽  
Minghui Kang ◽  
Fandi Ai ◽  
Junlin Zhang ◽  
...  

Abstract Actinidia chinensis (kiwifruit) is a perennial horticultural crop species of the Actinidiaceae family with high nutritional and economic value. Two versions of the A. chinensis genomes have been previously assembled, based mainly on relatively short reads. Here, we report an improved chromosome-level reference genome of A. chinensis (v3.0), based mainly on PacBio long reads and Hi-C data. The high-quality assembled genome is 653 Mb long, with 0.76% heterozygosity. At least 43% of the genome consists of repetitive sequences, and the most abundant long terminal repeats were further identified and account for 23.38% of our novel genome. It has clear improvements in contiguity, accuracy, and gene annotation over the two previous versions and contains 40,464 annotated protein-coding genes, of which 94.41% are functionally annotated. Moreover, further analyses of genetic collinearity revealed that the kiwifruit genome has undergone two whole-genome duplications: one affecting all Ericales families near the K-T extinction event and a recent genus-specific duplication. The reference genome presented here will be highly useful for further molecular elucidation of diverse traits and for the breeding of this horticultural crop, as well as evolutionary studies with related taxa.


Author(s):  
Yun-Xia Luan ◽  
Yingying Cui ◽  
Wan-Jun Chen ◽  
Jianfeng Jin ◽  
Ai-Min Liu ◽  
...  

The collembolan Folsomia candida Willem, 1902, is an important representative soil arthropod that is widely distributed throughout the world and has been frequently used as a test organism in soil ecology and ecotoxicology studies. However, its suitability as an ideal “standard” has been questioned because of differences in reproductive modes and cryptic genetic diversity between strains from various geographical origins. In this study, we present two high-quality chromosome-level genomes of F. candida, for the parthenogenetic Danish strain (FCDK, 219.08 Mb, N50 of 38.47 Mb, 25,139 protein-coding genes) and the sexual Shanghai strain (FCSH, 153.09 Mb, N50 of 25.75 Mb, 21,609 protein-coding genes). The seven chromosomes of FCDK are each 25–54% larger than the corresponding chromosomes of FCSH, showing obvious repetitive element expansions and large-scale inversions and translocations but no whole-genome duplication. The strain-specific genes, expanded gene families and genes in nonsyntenic chromosomal regions identified in FCDK are highly related to its broader environmental adaptation. In addition, the overall sequence identity of the two mitogenomes is only 78.2%, and FCDK has fewer strain-specific microRNAs than FCSH. In conclusion, FCDK and FCSH have accumulated independent genetic changes and evolved into distinct species since diverging 10 Mya. Our work shows that F. candida represents a good model of rapid cryptic speciation. Moreover, it provides important genomic resources for studying the mechanisms of species differentiation, soil arthropod adaptation to soil ecosystems, and Wolbachia-induced parthenogenesis as well as the evolution of Collembola, a pivotal phylogenetic clade between Crustacea and Insecta.

