Insights from the first genome assembly of onion (Allium cepa)
Onion is an important vegetable crop with an estimated genome size of 16 Gb. We describe the de novo assembly and ab initio annotation of the genome of the doubled haploid onion line DHCU066619, which resulted in a final assembly of 14.9 Gb with an N50 of 461 Kb. Of this assembly, 2.2 Gb was ordered into 8 pseudomolecules using five genetic linkage maps. The remainder of the genome is available in 89.8 K scaffolds. Analysis of this genome shows that at least 72.4% of it is repetitive and consists, to a large extent, of (retro)transposons. Many of these (retro)transposons are old and have accumulated many mutations, which facilitated their assembly but hampered their identification. The draft ab initio gene prediction indicated 540,925 putative gene models, which is far more than expected, possibly due to the presence of pseudogenes. Of these, 86,073 models showed similarity to published proteins (UniProt). No gene-rich regions were found; genes are uniformly distributed over the genome. Analysis of synteny with A. sativum (garlic) showed collinearity but also major rearrangements between the two species. Nevertheless, this assembly is the first high-quality draft genome sequence available for the study of onion and will be a valuable resource for further research.
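
For readers unfamiliar with the contiguity statistic quoted above, the following is a minimal sketch of how an N50 is conventionally computed: it is the length of the sequence at which the cumulative length, summed from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy scaffold lengths are illustrative assumptions and are not taken from the onion assembly. The final two lines only restate arithmetic on the figures given in the abstract (2.2 Gb anchored, 14.9 Gb assembled, 16 Gb estimated).

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Hypothetical helper for illustration; not the authors' pipeline.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)   # longest sequence first
    half_total = sum(ordered) / 2             # half of the total assembly length
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # first length that crosses the halfway point
            return length


# Toy scaffold lengths (made up for this example).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))

# Arithmetic on the figures reported in the abstract.
anchored_pct = round(100 * 2.2 / 14.9, 1)     # share of the assembly in pseudomolecules
assembled_pct = round(100 * 14.9 / 16, 1)     # share of the estimated genome size assembled
print(anchored_pct, assembled_pct)
```

On this reading, roughly 15% of the assembled sequence is anchored to pseudomolecules, and the assembly covers a little over 93% of the estimated genome size.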