Genome survey sequencing of Dioscorea zingiberensis

Dioscorea zingiberensis (Dioscoreaceae) is the main plant source of diosgenin (a steroidal sapogenin), the precursor for the production of steroid hormones in the pharmaceutical industry. Despite its large economic value, genomic information for the genus Dioscorea is currently unavailable. Here, we present an initial survey of the D. zingiberensis genome performed by next-generation sequencing technology together with a genome size investigation inferred by flow cytometry. The whole genome survey of D. zingiberensis generated 31.48 Gb of sequence data with approximately 78.70× coverage. The estimated genome size is 800 Mb, with a high level of heterozygosity based on K-mer analysis. These reads were assembled into 334 288 contigs with an N50 length of 1079 bp, which were further assembled into 92 163 scaffolds with a total length of 173.46 Mb. A total of 4935 genes, 81 tRNAs, 69 rRNAs, and 661 miRNAs were predicted by the genome analysis, and 263 484 repeated sequences were obtained with 419 372 simple sequence repeats (SSRs). Among these SSRs, the mononucleotide repeat type was the most abundant (up to 54.60% of the total SSRs), followed by the dinucleotide (29.60%), trinucleotide (11.37%), tetranucleotide (3.53%), pentanucleotide (0.65%), and hexanucleotide (0.25%) repeat types. The 1C-value of D. zingiberensis was calibrated against Salvia miltiorrhiza and calculated as 0.87 pg (851 Mb) by flow cytometry, which was very close to the result of the genome survey. This is the first report of genome-wide characterization within this taxon.
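
As a rough check on the flow cytometry figure, the 1C-value in picograms can be converted to base pairs. The sketch below assumes the widely used factor of 978 Mbp per pg; the abstract does not say which factor the authors applied, and the function and variable names are illustrative only.

```python
# Assumption: 1 pg of DNA corresponds to about 978 Mbp (a commonly used
# conversion factor; the abstract does not state the factor the authors used).
MBP_PER_PG = 978.0


def pg_to_mbp(picograms: float, mbp_per_pg: float = MBP_PER_PG) -> float:
    """Convert a DNA mass in picograms to a length in megabase pairs."""
    return picograms * mbp_per_pg


flow_mbp = pg_to_mbp(0.87)   # 1C-value reported from flow cytometry
survey_mbp = 800.0           # K-mer based estimate from the genome survey

# Relative gap between the two estimates, expressed against the flow value.
relative_gap = (flow_mbp - survey_mbp) / flow_mbp

print(f"flow cytometry: {flow_mbp:.0f} Mb")
print(f"gap to survey estimate: {relative_gap:.1%}")
```

Under that assumption the conversion reproduces the reported 851 Mb, which sits roughly 6% above the K-mer estimate of 800 Mb.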

Genome Survey of Male and Female Spotted Scat (Scatophagus argus)

Animals ◽

10.3390/ani9121117 ◽

2019 ◽

Vol 9 (12) ◽

pp. 1117 ◽

Cited By ~ 1

Author(s):

Yuanqing Huang ◽

Dongneng Jiang ◽

Ming Li ◽

Umar Farouk Mustapha ◽

Changxu Tian ◽

...

Keyword(s):

Sequence Data ◽

Economic Value ◽

Gc Content ◽

Dinucleotide Repeats ◽

Dominant Form ◽

Scatophagus Argus ◽

Genome Survey ◽

Male Genome ◽

A Genome ◽

High Quality Sequence

The spotted scat, Scatophagus argus, is a species of fish that is widely propagated within the Chinese aquaculture industry and therefore has significant economic value. Despite this, studies of its genome are severely lacking. In the present study, a genomic survey of S. argus was conducted using next-generation sequencing (NGS). In total, 55.699 GB (female) and 51.047 GB (male) of high-quality sequence data were obtained. Genome sizes were estimated to be 598.73 (female) and 597.60 (male) Mbp. The sequence repeat ratios were calculated to be 27.06% (female) and 26.99% (male). Heterozygosity ratios were 0.37% for females and 0.38% for males. Reads were assembled into 444,961 (female) and 453,459 (male) contigs with N50 lengths of 5,747 and 5,745 bp for females and males, respectively. The average guanine-cytosine (GC) content of the female genome was 41.78%, and 41.82% for the male. A total of 42,869 (female) and 43,283 (male) genes were annotated to the non-redundant (NR) and SwissProt databases. The female and male genomes contained 66.6% and 67.8% BUSCO core genes, respectively. Dinucleotide repeats were the dominant form of simple sequence repeats (SSR) observed in females (68.69%) and males (68.56%). Additionally, gene fragments of Dmrt1 were only observed in the male genome. This is the first report of a genome-wide characterization of S. argus.

Measuring genome sizes using read-depth, k-mers, and flow cytometry: methodological comparisons in beetles (Coleoptera)

10.1101/761304 ◽

2019 ◽

Cited By ~ 2

Author(s):

James M. Pflug ◽

Valerie Renee Holmes ◽

Crystal Burrus ◽

J. Spencer Johnston ◽

David R. Maddison

Keyword(s):

Flow Cytometry ◽

Genome Size ◽

Genomic Sequence ◽

Sequence Data ◽

Repetitive Sequences ◽

Single Copy ◽

Read Depth ◽

Test System ◽

Estimation Methods ◽

Next Generation

Measuring genome size across different species can yield important insights into evolution of the genome and allow for more informed decisions when designing next-generation genomic sequencing projects. New techniques for estimating genome size using shallow genomic sequence data have emerged which have the potential to augment our knowledge of genome sizes, yet these methods have only been used in a limited number of empirical studies. In this project, we compare estimation methods using next-generation sequencing (k-mer methods and average read depth of single-copy genes) to measurements from flow cytometry, the gold standard for genome size measures, using ground beetles (Carabidae) and other members of the beetle suborder Adephaga as our test system. We also present a new protocol for using read-depth of single-copy genes to estimate genome size. Additionally, we report flow cytometry measurements for five previously unmeasured carabid species, as well as 21 new draft genomes and six new draft transcriptomes across eight species of adephagan beetles. No single sequence-based method performed well on all species, and all tended to underestimate the genome sizes, although only slightly in most samples. For one species, Bembidion haplogonum, most sequence-based methods yielded estimates half the size suggested by flow cytometry. This discrepancy for k-mer methods can be explained by a large number of repetitive sequences, but we have no explanation for why read-depth methods yielded results that were also strikingly low.
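
The read-depth idea can be sketched as dividing the total number of sequenced bases by the average depth at which single-copy genes are covered. This is a minimal illustration with made-up numbers, not the authors' protocol; `read_depth_genome_size` and its inputs are hypothetical.

```python
from statistics import mean


def read_depth_genome_size(total_bases: int, single_copy_depths: list[float]) -> float:
    """Estimate genome size as total sequenced bases / mean single-copy depth."""
    if not single_copy_depths:
        raise ValueError("need at least one single-copy gene depth")
    return total_bases / mean(single_copy_depths)


# Made-up example: 6 Gb of reads, three single-copy genes covered about 20x.
estimate = read_depth_genome_size(6_000_000_000, [19.0, 21.0, 20.0])
print(f"estimated genome size: {estimate / 1e6:.0f} Mb")
```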

Intraspecific Genome Size Variation in Pumpkin (Cucurbita pepo subsp. pepo)

HortScience ◽

10.21273/hortsci.43.3.949 ◽

2008 ◽

Vol 43 (3) ◽

pp. 949-951 ◽

Cited By ~ 2

Author(s):

A. Lane Rayburn ◽

Mosbah M. Kushad ◽

Wanisari Wannarat

Keyword(s):

Flow Cytometry ◽

Genome Size ◽

Cucurbita Pepo ◽

Fruit Size ◽

Size Variation ◽

Fruit Morphology ◽

Genome Size Variation ◽

A Genome ◽

Size Variability

Genome size has recently been reported to vary 16% in pumpkins (Cucurbita spp.). The majority of this variation can be attributed to genome size differences in pumpkins of various taxonomical classes. The purpose of this study was to determine if intraspecific genome size variability could be detected by flow cytometry in Cucurbita pepo subsp. pepo pumpkin cultivars with similar fruit morphology. The pie pumpkins group was chosen for this study because of their similar fruit size, shape, and color. Genome sizes ranged from 1.109 pg in Spooktacular to 1.064 pg in Small Sugar. Spooktacular had a genome size larger than Small Sugar in all three experiments. Therefore, intraspecific genome size variation does exist in C. pepo subsp. pepo among pumpkin cultivars of similar fruit morphology.

A First Insight into a Draft Genome of Silver Sillago (Sillago sihama) via Genome Survey Sequencing

Animals ◽

10.3390/ani9100756 ◽

2019 ◽

Vol 9 (10) ◽

pp. 756 ◽

Cited By ~ 2

Author(s):

Li ◽

Tian ◽

Huang ◽

Lin ◽

Wang ◽

...

Keyword(s):

Sequence Data ◽

Economic Value ◽

Draft Genome ◽

Gc Content ◽

Dinucleotide Repeats ◽

Dominant Form ◽

Genome Survey ◽

High Quality Sequence ◽

Sillago Sihama ◽

Next Generation Sequencing Ngs

Sillago sihama has high economic value and is one of the most attractive aquaculture species in China. Despite its economic importance, its genome has barely been studied. In this study, we conducted a first genomic survey of S. sihama using next-generation sequencing (NGS). In total, 45.063 Gb of high-quality sequence data were obtained. Based on the 17-mer frequency distribution, the genome size was estimated to be 508.50 Mb. The sequence repeat ratio was calculated to be 21.25%, and the heterozygosity ratio was 0.92%. Reads were assembled into 1,009,363 contigs, with an N50 length of 1362 bp, and then into 814,219 scaffolds, with an N50 length of 2173 bp. The average guanine and cytosine (GC) content was 45.04%. Dinucleotide repeats (56.55%) were the dominant form of simple sequence repeats (SSR).
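
A K-mer based size estimate such as the 17-mer figure above is commonly obtained by dividing the total number of K-mers by the depth at the main peak of the K-mer frequency histogram. The sketch below uses a toy histogram; the abstract does not report the peak depth, and the function name and the `min_depth` cutoff for discarding likely error K-mers are assumptions.

```python
def kmer_genome_size(histogram: dict[int, int], min_depth: int = 3) -> float:
    """Estimate genome size from a k-mer histogram {depth: distinct k-mer count}.

    K-mers seen fewer than `min_depth` times are treated as sequencing errors
    and ignored (a simplifying assumption).
    """
    kept = {d: n for d, n in histogram.items() if d >= min_depth}
    if not kept:
        raise ValueError("no k-mers at or above min_depth")
    # Depth at which the largest number of distinct k-mers is observed.
    peak_depth = max(kept, key=kept.get)
    # Total k-mer occurrences among the retained k-mers.
    total_kmers = sum(d * n for d, n in kept.items())
    return total_kmers / peak_depth


# Toy histogram: an error tail at depths 1-2 and a main peak at depth 10.
toy_histogram = {1: 1000, 2: 200, 9: 50, 10: 100, 11: 50}
print(kmer_genome_size(toy_histogram))
```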

Genome Size, Complexity, and Ploidy of the Pathogenic Fungus Histoplasma capsulatum

Journal of Bacteriology ◽

10.1128/jb.180.24.6697-6703.1998 ◽

1998 ◽

Vol 180 (24) ◽

pp. 6697-6703 ◽

Cited By ~ 34

Author(s):

Jeanne Carr ◽

Glenmore Shearer

Keyword(s):

Flow Cytometry ◽

Genome Size ◽

Repetitive Dna ◽

Histoplasma Capsulatum ◽

Nuclear Dna ◽

Pathogenic Fungus ◽

Repetitive Sequences ◽

Dna Index ◽

A Genome ◽

Β Tubulin

The genome size, complexity, and ploidy of the dimorphic pathogenic fungus Histoplasma capsulatum were determined by using DNA renaturation kinetics, genomic reconstruction, and flow cytometry. Nuclear DNA was isolated from two strains, G186AS and Downs, and analyzed by renaturation kinetics and genomic reconstruction with three putative single-copy genes (calmodulin, α-tubulin, and β-tubulin). G186AS was found to have a genome of approximately 2.3 × 10⁷ bp with less than 0.5% repetitive sequences. The Downs strain, however, was found to have a genome approximately 40% larger with more than 16 times more repetitive DNA. The Downs genome was determined to be 3.2 × 10⁷ bp with approximately 8% repetitive DNA. To determine ploidy, the DNA mass per cell measured by flow cytometry was compared with the 1n genome estimate to yield a DNA index (DNA per cell/1n genome size). Strain G186AS was found to have a DNA index of 0.96, and Downs had a DNA index of 0.94, indicating that both strains are haploid. Genomic reconstruction and Southern blot data obtained with α- and β-tubulin probes indicated that some genetic duplication has occurred in the Downs strain, which may be aneuploid or partially diploid.
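
The ploidy test described above reduces to one ratio. The sketch below applies it with the genome sizes from the abstract; the per-cell DNA amounts are not reported there, so the values used are back-calculated from the stated DNA indices and serve only as illustration. All names are hypothetical.

```python
def dna_index(dna_per_cell_bp: float, haploid_genome_bp: float) -> float:
    """DNA index = DNA per cell / 1n genome size."""
    return dna_per_cell_bp / haploid_genome_bp


genome_bp = {"G186AS": 2.3e7, "Downs": 3.2e7}      # sizes given in the abstract
reported_index = {"G186AS": 0.96, "Downs": 0.94}   # DNA indices given in the abstract

for strain, size in genome_bp.items():
    # Illustrative per-cell DNA amount implied by the reported index.
    implied_dna_per_cell = reported_index[strain] * size
    index = dna_index(implied_dna_per_cell, size)
    # An index near 1 points to one genome copy per cell (haploid); near 2, diploid.
    print(f"{strain}: {implied_dna_per_cell / 1e6:.2f} Mb per cell, index {index:.2f}")

size_ratio = genome_bp["Downs"] / genome_bp["G186AS"]
print(f"Downs genome is {size_ratio - 1:.0%} larger than G186AS")
```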

Determination of DNA content and relative 2C genome sizes of some promising commercial varieties of sugarcane using flow cytometer

Current Botany ◽

10.19071/cb.2017.v8.3207 ◽

2017 ◽

Vol 8 ◽

Author(s):

A. Mondal S.K. Ghosal ◽

T. Pal Kalyan Kumar De

Keyword(s):

Flow Cytometry ◽

Genome Size ◽

Dna Content ◽

Nuclear Dna ◽

Nuclear Dna Content ◽

Cane Yield ◽

Sugarcane Varieties ◽

Yield Information ◽

A Genome ◽

2C Dna Content

In the present study, the 2C nuclear DNA content and genome sizes (in picograms, pg, and megabase pairs, Mbp, respectively) of 19 promising commercial varieties of sugarcane, derived from man-made interspecific hybrids between cultivated and wild species, were analyzed using flow cytometry. Knowing the 2C nuclear DNA content, the unknown chromosome numbers of the varieties could be predicted. Large differences (65% variation) in 2C DNA content were detected among the 19 varieties, ranging from 3.80 pg to 10.96 pg, which corresponds to genome sizes from 3724.00 Mbp to 10740.80 Mbp. The differences are due to variation in ploidy level, and these genomes are considered the most complex among crop plants. However, the relationship between chromosome number and genome size was highly significant (P < 0.001). In the present study, internode diameter, sugar juice content, and cane yield/ha were also positively correlated with DNA content. The estimated genome sizes would also yield information critical for sugarcane breeding and genome sequencing programs. Keywords: Genome size, Sugarcane varieties, Flow cytometry, DNA content.
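
The reported picogram and Mbp ranges let a reader recover the conversion factor and the quoted variation. The following check uses only the four endpoint values from the abstract; reading the 65% figure as the range relative to the largest value is an assumption, and the variable names are illustrative.

```python
min_pg, max_pg = 3.80, 10.96            # 2C DNA content range (pg)
min_mbp, max_mbp = 3724.00, 10740.80    # corresponding genome sizes (Mbp)

# Conversion factor implied by each endpoint (Mbp per pg).
factor_low = min_mbp / min_pg
factor_high = max_mbp / max_pg

# Range of 2C values relative to the largest value (assumed definition).
variation = (max_pg - min_pg) / max_pg

print(f"implied factor: {factor_low:.0f} and {factor_high:.0f} Mbp/pg")
print(f"variation: {variation:.0%}")
```

Both endpoints imply a factor of about 980 Mbp per pg, and the range relative to the largest value comes to about 65%.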

Quantitative and qualitative genomic characterization of cultivated Ilex L. species

Plant Genetic Resources ◽

10.1017/s1479262114000756 ◽

2014 ◽

Vol 13 (2) ◽

pp. 142-152 ◽

Cited By ~ 4

Author(s):

Alexandra Marina Gottlieb ◽

Lidia Poggio

Keyword(s):

Genome Size ◽

Repetitive Dna ◽

Dna Sequences ◽

Sequence Data ◽

Repetitive Sequences ◽

Representational Difference Analysis ◽

Ilex Paraguariensis ◽

Repetitive Dna Sequences ◽

A Genome

The development of modern approaches to the genetic improvement of the tree crops Ilex paraguariensis (‘yerba mate’) and Ilex dumosa (‘yerba señorita’) is halted by the scarcity of basic genetic information. In this study, we characterized the implementation of low-cost methodologies such as representational difference analysis (RDA), single-strand conformation polymorphisms (SSCP), and reverse and direct dot-blot filter hybridization assays coupled with thorough bioinformatic characterization of sequence data for both species. Also, we estimated the genome size of each species using flow cytometry. This study contributes to the better understanding of the genetic differences between two cultivated species, by generating new quantitative and qualitative genome-level data. Using the RDA technique, we isolated a group of non-coding repetitive sequences, tentatively considered as Ilex-specific, which were 1.21- to 39.62-fold more abundant in the genome of I. paraguariensis. Another group of repetitive DNA sequences involved retrotransposons, which appeared 1.41- to 35.77-fold more abundantly in the genome of I. dumosa. The genomic DNA of each species showed different performances in filter hybridizations: while I. paraguariensis showed a high intraspecific affinity, I. dumosa exhibited a higher affinity for the genome of the former species (i.e. interspecific). These differences could be attributed to the occurrence of homologous but slightly divergent repetitive DNA sequences, highly amplified in the genome of I. paraguariensis but not in the genome of I. dumosa. Additionally, our hybridization outcomes suggest that the genomes of both species have less than 80% similarity. Moreover, for the first time, we report herein a genome size estimate of 1670 Mbp for I. paraguariensis and that of 1848 Mbp for I. dumosa.

Measuring Genome Sizes Using Read-Depth, k-mers, and Flow Cytometry: Methodological Comparisons in Beetles (Coleoptera)

G3 Genes|Genome|Genetics ◽

10.1534/g3.120.401028 ◽

2020 ◽

Vol 10 (9) ◽

pp. 3047-3060 ◽

Cited By ~ 3

Author(s):

James M. Pflug ◽

Valerie Renee Holmes ◽

Crystal Burrus ◽

J. Spencer Johnston ◽

David R. Maddison

Keyword(s):

Flow Cytometry ◽

Genome Size ◽

Genomic Sequence ◽

Sequence Data ◽

Empirical Studies ◽

Single Copy ◽

Read Depth ◽

Test System ◽

Estimation Methods ◽

Next Generation

Measuring genome size across different species can yield important insights into evolution of the genome and allow for more informed decisions when designing next-generation genomic sequencing projects. New techniques for estimating genome size using shallow genomic sequence data have emerged which have the potential to augment our knowledge of genome sizes, yet these methods have only been used in a limited number of empirical studies. In this project, we compare estimation methods using next-generation sequencing (k-mer methods and average read depth of single-copy genes) to measurements from flow cytometry, a standard method for genome size measures, using ground beetles (Carabidae) and other members of the beetle suborder Adephaga as our test system. We also present a new protocol for using read-depth of single-copy genes to estimate genome size. Additionally, we report flow cytometry measurements for five previously unmeasured carabid species, as well as 21 new draft genomes and six new draft transcriptomes across eight species of adephagan beetles. No single sequence-based method performed well on all species, and all tended to underestimate the genome sizes, although only slightly in most samples. For one species, Bembidion sp. nr. transversale, most sequence-based methods yielded estimates half the size suggested by flow cytometry.

Genome Survey of Misgurnus Anguillicaudatus to Identify Genomic Information, Simple Sequence Repeat (SSR) Markers and Mitochondrial Genome

10.21203/rs.3.rs-767195/v1 ◽

2021 ◽

Author(s):

Guiyun Huang ◽

Jianmeng Cao ◽

Chen Chen ◽

Miao Wang ◽

Zhigang Liu ◽

...

Keyword(s):

Mitochondrial Genome ◽

Microsatellite Loci ◽

Complete Mitochondrial Genome ◽

Gc Content ◽

Repeat Sequence ◽

Misgurnus Anguillicaudatus ◽

Population Genetic Analysis ◽

Next Generation Sequencing Technology ◽

Genome Survey ◽

A Genome

The dojo loach Misgurnus anguillicaudatus is an important economic species in Asia because of its nutritional value and broad environmental adaptability. Despite its economic importance, genomic data for M. anguillicaudatus have been unavailable. In the present study, we conducted a genome survey of M. anguillicaudatus using next-generation sequencing technology. Its genome size was estimated to be 1105.97 Mb by K-mer analysis, and its heterozygosity ratio, repeat sequence content, and GC content were 1.45%, 58.98%, and 38.03%, respectively. A total of 376,357 microsatellite motifs were identified in the genome survey assembly; mononucleotides, with a frequency of 42.57%, were the most frequently repeated motifs, followed by dinucleotide (40.83%), trinucleotide (7.49%), tetranucleotide (8.09%), and pentanucleotide (0.91%) motifs. The AC/GT, AAT/ATT, and ACAG/CTGT repeats were the most abundant motifs among dinucleotide, trinucleotide, and tetranucleotide motifs, respectively. In addition, a complete mitochondrial genome was sequenced. Based on maximum likelihood and Bayesian inference analyses, M. anguillicaudatus in this study was of the “introgressed” mitochondrial type. Seventy microsatellite loci were randomly selected from the detected SSR loci to test for polymorphism, of which twenty were assessed in 30 individuals from a wild population. The number of alleles (Na), observed heterozygosity (Ho), and expected heterozygosity (He) per locus ranged from 7 to 19, 0.400 to 0.933, and 0.752 to 0.938, respectively. All twenty loci were highly informative (PIC > 0.700). Eight loci deviated from Hardy–Weinberg equilibrium after Bonferroni correction (P < 0.05). This is the first report of a genome survey in M. anguillicaudatus, and the genome information, mitochondrial genome, and microsatellite markers will be valuable for further studies on population genetic analysis, natural resource conservation, and molecular marker-assisted selective breeding.
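
The per-locus statistics quoted above can be computed from allele frequencies. The sketch below uses the standard definitions of expected heterozygosity and polymorphism information content (PIC) on a hypothetical locus; it omits any small-sample correction, and the abstract does not say which software or estimator the authors used.

```python
from itertools import combinations


def expected_heterozygosity(freqs: list[float]) -> float:
    """He = 1 - sum(p_i^2), without small-sample correction."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs: list[float]) -> float:
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return expected_heterozygosity(freqs) - cross


# Hypothetical locus with four equally frequent alleles.
freqs = [0.25, 0.25, 0.25, 0.25]
print(f"He = {expected_heterozygosity(freqs):.3f}, PIC = {pic(freqs):.3f}")
```

For this hypothetical locus, He is 0.75 and PIC is about 0.703, which shows that a locus needs several reasonably balanced alleles to exceed the PIC level reported in the abstract.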

One Major Challenge of Sequencing Large Plant Genomes Is to Know How Big They Really Are

International Journal of Molecular Sciences ◽

10.3390/ijms19113554 ◽

2018 ◽

Vol 19 (11) ◽

pp. 3554 ◽

Cited By ~ 5

Author(s):

Jaroslav Doležel ◽

Jana Čížková ◽

Hana Šimková ◽

Jan Bartoš

Keyword(s):

Genome Size ◽

Reference Genome ◽

Nuclear Dna ◽

Sequence Data ◽

Cereal Crops ◽

Dna Amount ◽

A Genome ◽

Human Genome Assembly ◽

Genome Assemblies ◽

Measured Mass

Any project seeking to deliver a plant or animal reference genome sequence must address the question as to the completeness of the assembly. Given the complexity introduced particularly by the presence of sequence redundancy, a problem which is especially acute in polyploid genomes, this question is not an easy one to answer. One approach is to use the sequence data, along with the appropriate computational tools, the other is to compare the estimate of genome size with an experimentally measured mass of nuclear DNA. The latter requires a reference standard in order to provide a robust relationship between the two independent measurements of genome size. Here, the proposal is to choose the human male leucocyte genome for this standard: its 1C DNA amount (the amount of DNA contained within unreplicated haploid chromosome set) of 3.50 pg is equivalent to a genome length of 3.423 Gbp, a size which is just 5% longer than predicted by the most current human genome assembly. Adopting this standard, this paper assesses the completeness of the reference genome assemblies of the leading cereal crops species wheat, barley and rye.
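
The comparison proposed above can be sketched as a ratio of assembly length to the flow-cytometry-based size. The example assumes a conversion factor of 978 Mbp per pg, which reproduces the 3.423 Gbp quoted for the 3.50 pg standard; the assembly length is only implied by the "5% longer" statement, and `completeness` is a hypothetical helper.

```python
MBP_PER_PG = 978.0  # assumed conversion factor (1 pg of DNA is about 978 Mbp)

# Human male 1C DNA amount of 3.50 pg expressed in Gbp.
standard_gbp = 3.50 * MBP_PER_PG / 1000.0


def completeness(assembly_gbp: float, measured_gbp: float) -> float:
    """Fraction of the measured genome size represented in an assembly."""
    return assembly_gbp / measured_gbp


# "5% longer than the assembly" implies an assembly length of standard / 1.05.
implied_assembly_gbp = standard_gbp / 1.05
ratio = completeness(implied_assembly_gbp, standard_gbp)

print(f"standard: {standard_gbp:.3f} Gbp")
print(f"implied assembly: {implied_assembly_gbp:.2f} Gbp, completeness {ratio:.1%}")
```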
