One Major Challenge of Sequencing Large Plant Genomes Is to Know How Big They Really Are

2018 ◽  
Vol 19 (11) ◽  
pp. 3554 ◽  
Author(s):  
Jaroslav Doležel ◽  
Jana Čížková ◽  
Hana Šimková ◽  
Jan Bartoš

Any project seeking to deliver a plant or animal reference genome sequence must address the question of the completeness of the assembly. Given the complexity introduced particularly by the presence of sequence redundancy, a problem which is especially acute in polyploid genomes, this question is not an easy one to answer. One approach is to use the sequence data, along with the appropriate computational tools; the other is to compare the estimate of genome size with an experimentally measured mass of nuclear DNA. The latter requires a reference standard in order to provide a robust relationship between the two independent measurements of genome size. Here, the proposal is to choose the human male leucocyte genome for this standard: its 1C DNA amount (the amount of DNA contained within an unreplicated haploid chromosome set) of 3.50 pg is equivalent to a genome length of 3.423 Gbp, a size which is just 5% longer than predicted by the most current human genome assembly. Adopting this standard, this paper assesses the completeness of the reference genome assemblies of the leading cereal crop species wheat, barley and rye.
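The relationship between DNA mass and sequence length quoted in this abstract can be illustrated with a short sketch. It assumes the conversion factor 1 pg ≈ 0.978 Gbp, which is consistent with the 3.50 pg ≈ 3.423 Gbp figure above; the function names and the example assembly length are hypothetical and are not taken from the paper.

```python
# Assumed conversion factor: 1 pg of DNA corresponds to about 0.978 Gbp.
# This matches the abstract's figures (3.50 pg ~ 3.423 Gbp).
PG_TO_GBP = 0.978


def pg_to_gbp(c_value_pg: float) -> float:
    """Convert a 1C DNA amount in picograms to a genome length in Gbp."""
    return c_value_pg * PG_TO_GBP


def assembly_completeness(assembly_gbp: float, c_value_pg: float) -> float:
    """Fraction of the measured genome size represented by an assembly."""
    return assembly_gbp / pg_to_gbp(c_value_pg)


if __name__ == "__main__":
    # Human male standard proposed in the abstract.
    print(f"{pg_to_gbp(3.50):.3f} Gbp")
    # Hypothetical assembly length, for illustration only.
    print(f"{assembly_completeness(3.0, 3.50):.1%} of the measured genome size")
```

A ratio below 1 would suggest that part of the genome is missing from the assembly, subject to the accuracy of both the C-value measurement and the reference standard.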

2020 ◽  
Vol 15 ◽  
Author(s):  
Liaofu Luo ◽  
Lirong Zhang

Aims: The discontinuous pattern of genome size variation in angiosperms is an unsolved problem related to genome evolution. We introduce a genome evolution operator and solve the related eigenvalue equation to deduce the discontinuous pattern. Background: The genome is a well-defined system for studying the evolution of species. One of the basic problems is genome size evolution. The DNA amounts of angiosperm species are highly variable, differing over 1000-fold. One big surprise is the discovery of the discontinuous distribution of nuclear DNA amounts in many angiosperm genera. Objective: The discontinuous distribution of nuclear DNA amounts has a certain regularity, much like a group of quantum states in atomic physics. The quantum pattern has not been explained by any evolutionary theory so far, and we shall interpret it through the quantum simulation of genome evolution. Methods: We have introduced a genome evolution operator H to deduce the distribution of DNA amount. The nuclear DNA amount in angiosperms is studied from the eigenvalue equation of the genome evolution operator H. The operator H is introduced by physical simulation and is defined as a function of the genome size N and the derivative with respect to the size. Results: The discontinuity of DNA size distribution and its synergetic occurrence in related angiosperm species are successfully deduced from the solution of the equation. The results agree well with the existing experimental data of Aloe, Clarkia, Nicotiana, Lathyrus, Allium and other genera. Conclusion: The success of our approach may imply the existence of a set of genomic evolutionary equations satisfying classical-quantum duality. The classical phase of evolution means that it obeys classical deterministic law, while the quantum phase means that it obeys quantum stochastic law. The discontinuity of DNA size distribution provides fresh evidence on the quantum evolution of angiosperms.
People realize that the discontinuous pattern is due to the existence of some unknown evolutionary constraints. However, our study indicates that these constraints on the angiosperm genome are essentially of quantum origin.


2020 ◽  
Vol 9 (37) ◽  
Author(s):  
Samuel O’Donnell ◽  
Frederic Chaux ◽  
Gilles Fischer

The current Chlamydomonas reinhardtii reference genome remains fragmented due to gaps stemming from large repetitive regions. To overcome the vast majority of these gaps, publicly available Oxford Nanopore Technology data were used to create a new reference-quality de novo genome assembly containing only 21 contigs, 30/34 telomeric ends, and a genome size of 111 Mb.


2019 ◽  
Vol 20 (10) ◽  
pp. 2483 ◽  
Author(s):  
Veronika Kapustová ◽  
Zuzana Tulpová ◽  
Helena Toegelová ◽  
Petr Novák ◽  
Jiří Macas ◽  
...  

Reference genomes of important cereals, including barley, emmer wheat and bread wheat, were released recently. Their comparison with genome size estimates obtained by flow cytometry indicated that the assemblies represent not more than 88–98% of the complete genome. This work is aimed at identifying the missing parts in two cereal genomes and proposing techniques to make the assemblies more complete. We focused on tandemly organised repetitive sequences, known to be underrepresented in genome assemblies generated from short-read sequence data. Our study found arrays of three tandem repeats with unit sizes of 1242 to 2726 bp present in the bread wheat reference genome generated from short reads. However, this and another wheat genome assembly employing long PacBio reads failed to integrate the 2726-bp repeat correctly in the pseudomolecule context. This suggests that tandem repeats of this size, frequently incorporated in unassigned scaffolds, may contribute to shrinking of pseudomolecules without reducing the size of the entire assembly. We demonstrate how this missing information may be added to the pseudomolecules with the aid of nanopore sequencing of individual BAC clones and optical mapping. Using the latter technique, we identified and localised a 470-kb long array of 45S ribosomal DNA absent from the reference genome of barley.


Genome ◽  
2007 ◽  
Vol 50 (11) ◽  
pp. 1029-1037 ◽  
Author(s):  
T. Eilam ◽  
Y. Anikster ◽  
E. Millet ◽  
J. Manisterski ◽  
O. Sagi-Assif ◽  
...  

One of the intriguing issues concerning the dynamics of plant genomes is the occurrence of intraspecific variation in nuclear DNA amount. The aim of this work was to assess the ranges of intraspecific, interspecific, and intergeneric variation in nuclear DNA content of diploid species of the tribe Triticeae (Poaceae) and to examine the relation between life form or habitat and genome size. Altogether, 438 plants representing 272 lines that belong to 22 species were analyzed. Nuclear DNA content was estimated by flow cytometry. Very small intraspecific variation in DNA amount was found between lines of Triticeae diploid species collected from different habitats or between different morphs. In contrast to the constancy in nuclear DNA amount at the intraspecific level, there are significant differences in genome size between the various diploid species. Within the genus Aegilops, the 1C DNA amount ranged from 4.84 pg in A. caudata to 7.52 pg in A. sharonensis; among genera, the 1C DNA amount ranged from 4.18 pg in Heteranthelium piliferum to 9.45 pg in Secale montanum. No evidence was found for a smaller genome size in annual, self-pollinating species relative to perennial, cross-pollinating ones. Diploids that grow in the southern part of the group’s distribution have larger genomes than those growing in other parts of the distribution. The contrast between the low variation at the intraspecific level and the high variation at the interspecific one suggests that changes in genome size originated in close temporal proximity to the speciation event, i.e., before, during, or immediately after it. The possible effects of sudden changes in genome size on speciation processes are discussed.


Genome ◽  
2018 ◽  
Vol 61 (8) ◽  
pp. 567-574 ◽  
Author(s):  
Wen Zhou ◽  
Bin Li ◽  
Lin Li ◽  
Wen Ma ◽  
Yuanchu Liu ◽  
...  

Dioscorea zingiberensis (Dioscoreaceae) is the main plant source of diosgenin (steroidal sapogenins), the precursor for the production of steroid hormones in the pharmaceutical industry. Despite its large economic value, genomic information of the genus Dioscorea is currently unavailable. Here, we present an initial survey of the D. zingiberensis genome performed by next-generation sequencing technology together with a genome size investigation inferred by flow cytometry. The whole genome survey of D. zingiberensis generated 31.48 Gb of sequence data with approximately 78.70× coverage. The estimated genome size is 800 Mb, with a high level of heterozygosity based on K-mer analysis. These reads were assembled into 334 288 contigs with an N50 length of 1079 bp, which were further assembled into 92 163 scaffolds with a total length of 173.46 Mb. A total of 4935 genes, 81 tRNAs, 69 rRNAs, and 661 miRNAs were predicted by the genome analysis, and 263 484 repeated sequences were obtained with 419 372 simple sequence repeats (SSRs). Among these SSRs, the mononucleotide repeat type was the most abundant (up to 54.60% of the total SSRs), followed by the dinucleotide (29.60%), trinucleotide (11.37%), tetranucleotide (3.53%), pentanucleotide (0.65%), and hexanucleotide (0.25%) repeat types. The 1C-value of D. zingiberensis was calibrated against Salvia miltiorrhiza and calculated as 0.87 pg (851 Mb) by flow cytometry, which was very close to the result of the genome survey. This is the first report of genome-wide characterization within this taxon.
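Calibrating a C-value against a reference standard, as described in this abstract, is commonly done by scaling the standard's known DNA amount by the ratio of fluorescence peak positions. The sketch below shows that ratio calculation under this assumption; the peak values and the standard's C-value are hypothetical placeholders, not measurements from the study, and the 978 Mbp per pg factor is an assumption that happens to be consistent with the abstract's 0.87 pg ≈ 851 Mb.

```python
# Assumed conversion factor: 1 pg of DNA corresponds to about 978 Mbp.
PG_TO_MBP = 978.0


def sample_c_value_pg(sample_peak: float, standard_peak: float,
                      standard_c_value_pg: float) -> float:
    """Estimate a sample's 1C value from relative fluorescence peak means.

    Assumes fluorescence is proportional to DNA content and that sample
    and standard nuclei were stained and measured together.
    """
    return standard_c_value_pg * sample_peak / standard_peak


def pg_to_mbp(c_value_pg: float) -> float:
    """Convert a 1C DNA amount in picograms to megabase pairs."""
    return c_value_pg * PG_TO_MBP


if __name__ == "__main__":
    # Hypothetical peak means and standard C-value, for illustration only.
    estimate = sample_c_value_pg(sample_peak=87.0, standard_peak=100.0,
                                 standard_c_value_pg=1.0)
    print(f"{estimate:.2f} pg = {pg_to_mbp(estimate):.0f} Mb")
```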


Genome ◽  
2008 ◽  
Vol 51 (8) ◽  
pp. 616-627 ◽  
Author(s):  
T. Eilam ◽  
Y. Anikster ◽  
E. Millet ◽  
J. Manisterski ◽  
M. Feldman

Recent molecular studies in the genera Aegilops and Triticum showed that allopolyploidization (interspecific or intergeneric hybridization followed by chromosome doubling) generated rapid elimination of low-copy or high-copy, non-coding and coding DNA sequences. The aims of this work were to determine the amount of nuclear DNA in allopolyploid species of the group and to see to what extent elimination of DNA sequences affected genome size. Nuclear DNA amount was determined by the flow cytometry method in 27 natural allopolyploid species (most of which were represented by several lines and each line by several plants) as well as 14 newly synthesized allopolyploids (each represented by several plants) and their parental plants. Very small intraspecific variation in DNA amount was found between lines of allopolyploid species collected from different habitats or between wild and domesticated forms of allopolyploid wheat. In contrast to the constancy in nuclear DNA amount at the intraspecific level, there are significant differences in genome size between the various allopolyploid species, at both the tetraploid and hexaploid levels. In most allopolyploids nuclear DNA amount was significantly less than the sum of DNA amounts of the parental species. Newly synthesized allopolyploids exhibited a similar decrease in nuclear DNA amount in the first generation, indicating that genome downsizing occurs during and (or) immediately after the formation of the allopolyploids and that there are no further changes in genome size during the life of the allopolyploids. Phylogenetic considerations of the origin of the B genome of allopolyploid wheat, based on nuclear DNA amount, are discussed.


Genome ◽  
2009 ◽  
Vol 52 (3) ◽  
pp. 275-285 ◽  
Author(s):  
T. Eilam ◽  
Y. Anikster ◽  
E. Millet ◽  
J. Manisterski ◽  
M. Feldman

Nuclear DNA amount (1C) was determined by flow cytometry in the autotetraploid cytotype of Hordeum bulbosum, in the cytologically diploidized autotetraploid cytotypes of Elymus elongatus, Hordeum murinum subsp. murinum and Hordeum murinum subsp. leporinum, in Hordeum marinum subsp. gussoneanum, in their progenitor diploid cytotypes, and in a newly synthesized autotetraploid line of E. elongatus. Several lines collected from different regions of the distribution area of every taxon, each represented by a number of plants, were analyzed in each taxon. The intracytotype variation in nuclear DNA amount of every diploid and autotetraploid cytotype was very small, indicating that no significant changes have occurred in DNA amount either after speciation or after autopolyploid formation. The autotetraploid cytotypes of H. bulbosum and the cytologically diploidized H. marinum subsp. gussoneanum had the expected additive amount of their diploid cytotypes. On the other hand, the cytologically diploidized autotetraploid cytotypes of E. elongatus and H. murinum subsp. murinum and H. murinum subsp. leporinum had considerably less nuclear DNA (10%–23%) than the expected additive value. Also, the newly synthesized autotetraploid line of E. elongatus showed a similar reduction in DNA to its natural counterpart, indicating that the reduction in genome size occurred in the natural cytotype during autopolyploidization. It is suggested that the diploid-like meiotic behavior of these cytologically diploidized autotetraploids is caused by the instantaneous elimination of a large number of DNA sequences, different sequences from different homologous pairs, leading to differentiation of the constituent genomes. The eliminated sequences are likely to include those that participate in homologous recognition and initiation of meiotic pairing.
A gene system determining exclusive bivalent pairing by utilizing the differentiation between the two groups of homologues has been presumably superimposed on the DNA reduction process.


2010 ◽  
Vol 2010 ◽  
pp. 1-12 ◽  
Author(s):  
T. Eilam ◽  
Y. Anikster ◽  
E. Millet ◽  
J. Manisterski ◽  
M. Feldman

Nuclear DNA amount, determined by the flow cytometry method, in diploids, natural and synthetic allopolyploids, and natural and synthetic autopolyploids of the tribe Triticeae (Poaceae) is reviewed here and discussed. In contrast to the very small and nonsignificant variation in nuclear DNA amount that was found at the intraspecific level, the variation at the interspecific level is very large. Evidently changes in genome size are either the cause or the result of speciation. Typical autopolyploids had the expected additive DNA amount of their diploid parents, whereas natural and synthetic cytologically diploidized autopolyploids and natural and synthetic allopolyploids had significantly less DNA than the sum of their parents. Thus, genome downsizing, occurring during or immediately after the formation of these polyploids, provides the physical basis for their cytological diploidization, that is, diploid-like meiotic behavior. Possible mechanisms that are involved in genome downsizing and the biological significance of this phenomenon are discussed.
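The comparison between a polyploid's DNA amount and the additive expectation from its parents can be expressed as a relative deviation. The following is a minimal sketch of that arithmetic; the function name and the numeric values are hypothetical and are not data from the review.

```python
from typing import Sequence


def additive_deviation(polyploid_pg: float, parent_pgs: Sequence[float]) -> float:
    """Relative deviation of a polyploid's 1C value from the sum of its parents.

    A negative result indicates genome downsizing relative to additivity;
    a value near zero indicates the expected additive amount.
    """
    expected = sum(parent_pgs)
    return (polyploid_pg - expected) / expected


if __name__ == "__main__":
    # Hypothetical 1C values in pg, for illustration only.
    deviation = additive_deviation(polyploid_pg=10.0, parent_pgs=[5.0, 6.0])
    print(f"{deviation:+.1%} relative to the additive expectation")
```

For an autotetraploid, the same function applies with the diploid progenitor's value listed twice in `parent_pgs`.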


2020 ◽  
Vol 2 (3) ◽  
Author(s):  
Phuc-Loi Luu ◽  
Phuc-Thinh Ong ◽  
Thanh-Phuoc Dinh ◽  
Susan J Clark

As reference genome assemblies are updated there is a need to convert epigenome sequence data from older genome assemblies to newer versions, to facilitate data integration and visualization on the same coordinate system. Conversion can be done by re-alignment of the original sequence data to the new assembly or by converting the coordinates of the data between assemblies using a mapping file, an approach referred to as ‘liftover’. Compared to re-alignment approaches, liftover is a more rapid and cost-effective solution. Here, we benchmark six liftover tools commonly used for conversion between genome assemblies by coordinates, including UCSC liftOver, rtracklayer::liftOver, CrossMap, NCBI Remap, flo and segment_liftover to determine how they performed for whole genome bisulphite sequencing (WGBS) and ChIP-seq data. Our results show high correlation between the six tools for conversion of 43 WGBS paired samples. For the chromatin sequencing data we found from interval conversion of 366 ChIP-Seq datasets, segment_liftover generates more reliable results than UCSC liftOver. However, we found some regions do not always remain the same after liftover. To further increase the accuracy of liftover and avoid misleading results, we developed a three-step guideline that removes aberrant regions to ensure more robust genome conversion between reference assemblies.
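The idea of converting coordinates through a mapping file can be illustrated with a toy example. This sketch is not the algorithm of any of the benchmarked tools: it assumes a simplified mapping made of ungapped, same-strand blocks (the `Block` type and `lift_interval` function are hypothetical) and treats any interval that is not fully contained in a single block as unmappable.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Block(NamedTuple):
    """One aligned block: old coordinates [old_start, old_end) map to new_start."""
    old_start: int
    old_end: int
    new_start: int


def lift_interval(start: int, end: int,
                  blocks: Sequence[Block]) -> Optional[Tuple[int, int]]:
    """Lift a half-open interval to the new assembly, or return None.

    Only intervals lying entirely within one block are converted; an
    interval spanning a block boundary is reported as unmapped.
    """
    for block in blocks:
        if block.old_start <= start and end <= block.old_end:
            offset = block.new_start - block.old_start
            return (start + offset, end + offset)
    return None


if __name__ == "__main__":
    # Hypothetical mapping: the second block is shifted by 500 bp.
    mapping = [Block(0, 1000, 0), Block(1000, 2000, 1500)]
    print(lift_interval(1100, 1200, mapping))  # contained in the second block
    print(lift_interval(900, 1100, mapping))   # spans a block boundary
```

Intervals that span block boundaries are one simple way in which a region might fail to convert cleanly; real tools handle such cases with their own, more elaborate rules.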


Genome ◽  
2005 ◽  
Vol 48 (3) ◽  
pp. 511-520 ◽  
Author(s):  
A Ricroch ◽  
R Yockteng ◽  
S C Brown ◽  
S Nadot

Allium L. (Alliaceae), a genus of major economic importance, exhibits a great diversity in various morphological characters and particularly in life form, with bulbs and rhizomes. Allium species show variation in several cytogenetic characters such as basic chromosome number, ploidy level, and genome size. The purpose of the present investigation was to study the evolution of nuclear DNA amount, GC content, and life form. A phylogenetic approach was used on a sample of 30 Allium species, including major vegetable crops and their wild allies, belonging to the 3 major subgenera Allium, Amerallium, and Rhizirideum and 14 sections. A phylogeny was constructed using internal transcribed spacer (ITS) sequences of 43 accessions representing 30 species, and the nuclear DNA amount and the GC content of 24 Allium species were investigated by flow cytometry. For the first time, the nuclear DNA content of Allium cyaneum and Allium vavilovii was measured, and the GC content of 16 species was measured. We addressed the following questions: (i) Is the variation in nuclear DNA amount and GC content linked to the evolutionary history of these edible Allium species and their wild relatives? (ii) How did life form (rhizome or bulb) evolve in edible Allium? Our results revealed significant interspecific variation in the nuclear DNA amount as well as in the GC content. No correlation was found between the GC content and the nuclear DNA amount. The reconstruction of nuclear DNA amount on the phylogeny showed a tendency towards a decrease in genome size within the genus. The reconstruction of life form history showed that rhizomes evolved in the subgenus Rhizirideum from an ancestral bulbous life form and were subsequently lost at least twice independently in this subgenus.
Key words: Allium, nuclear DNA amount, GC content, flow cytometry, internal transcribed spacer (ITS), phylogeny, life form.

