The genome sequence of the outbreeding globe artichoke constructed de novo incorporating a phase-aware low-pass sequencing strategy of F1 progeny

2016 ◽  
Vol 6 (1) ◽  
Author(s):  
Davide Scaglione ◽  
Sebastian Reyes-Chin-Wo ◽  
Alberto Acquadro ◽  
Lutz Froenicke ◽  
Ezio Portis ◽  
...  

Abstract Globe artichoke (Cynara cardunculus var. scolymus) is an out-crossing, perennial, multi-use crop species that is grown worldwide and belongs to the Compositae, one of the most successful Angiosperm families. We describe the first genome sequence of globe artichoke. The assembly, comprising 13,588 scaffolds covering 725 of the 1,084 Mb genome, was generated using ~133-fold Illumina sequencing data and encodes 26,889 predicted genes. Re-sequencing (30×) of globe artichoke and cultivated cardoon (C. cardunculus var. altilis) parental genotypes and low-coverage (0.5 to 1×) genotyping-by-sequencing of 163 F1 individuals resulted in 73% of the assembled genome being anchored in 2,178 genetic bins ordered along 17 chromosomal pseudomolecules. This was achieved using a novel pipeline, SOILoCo (Scaffold Ordering by Imputation with Low Coverage), to detect heterozygous regions and assign parental haplotypes with low sequencing read depth and of unknown phase. SOILoCo provides a powerful tool for de novo genome analysis of outcrossing species. Our data will enable genome-scale analyses of evolutionary processes among crops, weeds and wild species within and beyond the Compositae and will facilitate the identification of economically important genes from related species.

2021 ◽  
Author(s):  
Myung-Shin Kim ◽  
Taeyoung Lee ◽  
Jeonghun Baek ◽  
Ji Hong Kim ◽  
Changhoon Kim ◽  
...  

Abstract Massive resequencing efforts have been undertaken to catalog allelic variants in major crop species including soybean, but the scope of the information for genetic variation often depends on short sequence reads mapped to the extant reference genome. Additional de novo assembled genome sequences provide a unique opportunity to explore a dispensable genome fraction in the pan-genome of a species. Here, we report the de novo assembly and annotation of Hwangkeum, a popular soybean cultivar in Korea. The assembly was constructed using PromethION nanopore sequencing data and two genetic maps, and was then error-corrected using Illumina short reads and PacBio SMRT reads. The 933.12 Mb assembly was annotated with 79,870 transcripts for 58,550 genes using RNA-Seq data and the public soybean annotation set. Comparison of the Hwangkeum assembly with the Williams 82 soybean reference genome sequence revealed 1.8 million single-nucleotide polymorphisms, 0.5 million indels, and 25 thousand putative structural variants. However, there was no natural megabase-scale chromosomal rearrangement. Incidentally, by adding two novel groups, we found that soybean contains four clearly separated groups of centromeric satellite repeats. Analyses of satellite repeats and gene content suggested that the Hwangkeum assembly is a high-quality assembly. This was further supported by comparison of the marker arrangement of anthocyanin biosynthesis genes and of gene arrangement at the Rsv3 locus. Therefore, the results indicate that the de novo assembly of Hwangkeum is a valuable additional reference genome resource for characterizing traits for the improvement of this important crop species.


2020 ◽  
Vol 10 (10) ◽  
pp. 3557-3564
Author(s):  
Alberto Acquadro ◽  
Ezio Portis ◽  
Danila Valentino ◽  
Lorenzo Barchi ◽  
Sergio Lanteri

Globe artichoke (Cynara cardunculus var. scolymus; 2n=2x=34) is cropped largely in the Mediterranean region, with Italy the leading world producer; however, over time, its cultivation has spread to the Americas and China. In 2016, we released the first (v1.0) globe artichoke genome sequence (http://www.artichokegenome.unito.it/). Its assembly was generated using ∼133-fold Illumina sequencing data, covering 725 of the 1,084 Mb genome, of which 526 Mb (73%) were anchored to 17 chromosomal pseudomolecules. Based on v1.0 sequencing data, we generated a new genome assembly (v2.0), obtained from a Hi-C (Dovetail) genomic library, which improves the scaffold N50 from 126 kb to 44.8 Mb (∼356-fold increase) and N90 from 29 kb to 17.8 Mb (∼685-fold increase). While the L90 of the v1.0 sequence included 6,123 scaffolds, that of the new v2.0 includes just 15 super-scaffolds, a number close to the haploid chromosome number of the species. The newly generated super-scaffolds were assigned to pseudomolecules using reciprocal blast procedures. The cumulative size of unplaced scaffolds in v2.0 was reduced by 165 Mb, increasing the anchored genome sequence to 94%. The marked improvement is mainly attributable to the ability of the proximity ligation-based approach to deal with both heterochromatic (e.g., peri-centromeric) and euchromatic regions during the assembly procedure, which made it possible to physically locate low-recombination regions. The new high-quality reference genome enhances the taxonomic breadth of the data available for comparative plant genomics and led to a new, accurate gene prediction (28,632 genes), thus promoting the map-based cloning of economically important genes.
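
The N50, N90 and L90 statistics quoted in the abstract above follow the usual assembly-metric definitions: Nx is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least x% of the assembly, and Lx is the number of scaffolds in that set. The Python sketch below illustrates those definitions only. The function name `nx_lx` and the toy scaffold lengths are assumptions made for illustration; they are not taken from the artichoke assemblies or from the authors' pipeline.

```python
def nx_lx(lengths, fraction):
    """Return (Nx, Lx) for a list of scaffold lengths.

    `fraction` is the share of the total assembly length to cover,
    e.g. 0.5 for N50/L50 or 0.9 for N90/L90.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Walk the scaffolds from longest to shortest, accumulating length
    # until the requested share of the assembly is covered.
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count


# Hypothetical toy assembly (arbitrary length units), not real data.
toy_scaffolds = [100, 80, 60, 40, 20]
n50, l50 = nx_lx(toy_scaffolds, 0.5)
n90, l90 = nx_lx(toy_scaffolds, 0.9)
print(f"N50={n50} (L50={l50}), N90={n90} (L90={l90})")
```

Under these definitions, a drop in L90 from thousands of scaffolds to a number near the haploid chromosome count means that 90% of the assembled sequence sits in roughly chromosome-sized pieces.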


Genetics ◽  
2014 ◽  
Vol 197 (1) ◽  
pp. 401-404 ◽  
Author(s):  
B. Emma Huang ◽  
Chitra Raghavan ◽  
Ramil Mauleon ◽  
Karl W. Broman ◽  
Hei Leung

2019 ◽  
Author(s):  
Antonis Kioukis ◽  
Vassiliki A. Michalopoulou ◽  
Laura Briers ◽  
Stergios Pirintsos ◽  
David J. Studholme ◽  
...  

Abstract Crop wild relatives contain great levels of genetic diversity, representing an invaluable resource for crop improvement. Many of their traits have the potential to help crops become more resistant and resilient, and adapt to the new conditions that they will experience due to climate change. An impressive global effort is under way to conserve various crop wild relatives and to facilitate their use in crop breeding for food security. The genus Brassica is listed in Annex I of the International Treaty on Plant Genetic Resources for Food and Agriculture. Brassica oleracea (or wild cabbage) is a species native to coastal southern and western Europe that has become established as an important human food crop plant because of its large reserves stored over the winter in its leaves. Brassica cretica Lam. is a crop wild relative in the brassica group, and B. cretica subsp. nivea has been suggested as a separate subspecies. The species B. cretica has been proposed as a potential gene donor to a number of crops in the brassica group, including broccoli, Brussels sprout, cabbage, cauliflower, kale, swede, turnip and oilseed rape. Here, we present the draft de novo genome assemblies of four B. cretica individuals, including two B. cretica subsp. nivea and two B. cretica. De novo assembly of Illumina MiSeq genomic shotgun sequencing data yielded 243,461 contigs totalling 412.5 Mb in length, corresponding to 122% of the estimated genome size of B. cretica (339 Mb). According to synteny mapping and phylogenetic analysis of conserved genes, the B. cretica genome, based on our sequence data, encodes approximately 30,360 proteins. Furthermore, our demographic analysis based on whole-genome data suggests that distinct populations of B. cretica are not isolated. Our findings suggest that the classification of B. cretica into distinct subspecies is not supported by the genome sequence data we analyzed.


2019 ◽  
Vol 9 (10) ◽  
pp. 3079-3085 ◽  
Author(s):  
Joshua A. Udall ◽  
Evan Long ◽  
Chris Hanson ◽  
Daojun Yuan ◽  
Thiruvarangan Ramaraj ◽  
...  

Cotton is an agriculturally important crop. Because of its importance, a genome sequence of a diploid cotton species (Gossypium raimondii, D-genome) was first assembled using Sanger sequencing data in 2012. Advances in DNA sequencing technology have since improved the accuracy and correctness of assembled genome sequences. Here we report a new de novo genome assembly of G. raimondii and its close relative G. turneri. The two genomes were assembled to a chromosome level using PacBio long-read technology, HiC, and Bionano optical mapping. This report corrects some minor assembly errors found in the Sanger assembly of G. raimondii. We also compare the genome sequences of these two species for gene composition, repetitive element composition, and collinearity. Most of the identified structural rearrangements between these two species are due to intra-chromosomal inversions. More inversions were found in the G. turneri genome sequence than in the G. raimondii genome sequence. These findings and updates to the D-genome sequence will improve accuracy and translation of genomics to cotton breeding and genetics.


2019 ◽  
Vol 10 (2) ◽  
pp. 455-466
Author(s):  
Hainan Wu ◽  
Dan Yao ◽  
Yuhua Chen ◽  
Wenguo Yang ◽  
Wei Zhao ◽  
...  

Populus simonii is an important tree in the genus Populus, widely distributed in the Northern Hemisphere and having a long cultivation history. Although this species is ecologically and economically important, its genome sequence is currently not available, hindering the development of new varieties with wider adaptive and commercial traits. Here, we report a chromosome-level genome assembly of P. simonii using PacBio long-read sequencing data aided by Illumina paired-end reads and related genetic linkage maps. The assembly is 441.38 Mb in length and contains 686 contigs with a contig N50 of 1.94 Mb. With the linkage maps, 336 contigs were successfully anchored into 19 pseudochromosomes, accounting for 90.2% of the assembled genome size. Genomic integrity assessment showed that 1,347 (97.9%) of the 1,375 genes conserved among all embryophytes can be found in the P. simonii assembly. Genomic repeat analysis revealed that 41.47% of the P. simonii genome is composed of repetitive elements, of which 40.17% contained interspersed repeats. A total of 45,459 genes were predicted from the P. simonii genome sequence and 39,833 (87.6%) of the genes were annotated with one or more related functions. Phylogenetic analysis indicated that P. simonii and Populus trichocarpa should be placed in different sections, contrary to the previous classification according to morphology. The genome assembly not only provides an important genetic resource for the comparative and functional genomics of different Populus species, but also furnishes one of the closest reference sequences for identifying genomic variants in an F1 hybrid population derived by crossing P. simonii with other Populus species.


Genes ◽  
2018 ◽  
Vol 9 (8) ◽  
pp. 383 ◽  
Author(s):  
Hyun-Oh Lee ◽  
Ji-Weon Choi ◽  
Jeong-Ho Baek ◽  
Jae-Hyeon Oh ◽  
Sang-Choon Lee ◽  
...  

Platycodon grandiflorus (balloon flower) and Codonopsis lanceolata (bonnet bellflower) are important herbs used in Asian traditional medicine, and both belong to the botanical family Campanulaceae. In this study, we designed and implemented a de novo DNA sequencing and assembly strategy to map the complete mitochondrial genomes of the first two members of the Campanulaceae using low-coverage Illumina DNA sequencing data. We produced a total of 28.9 Gb of paired-end sequencing data from the genomic DNA of P. grandiflorus (20.9 Gb) and C. lanceolata (8.0 Gb). The assembled mitochondrial genome of P. grandiflorus was found to consist of two circular chromosomes; the master circle contains 56 genes, and the minor circle contains 42 genes. The C. lanceolata mitochondrial genome consists of a single circle harboring 54 genes. Using a comparative genome structure and a pattern of repeated sequences, we show that the P. grandiflorus minor circle resulted from a recombination event involving the direct repeats of the master circle. Our dataset will be useful for comparative genomics and for evolutionary studies, and will facilitate further biological and phylogenetic characterization of species in the Campanulaceae.

