Genome Survey Sequencing of an Iconic ‘Trophy’ Sportfish, the Roosterfish Nematistius pectoralis: Genome Size, Repetitive Elements, Nuclear RNA Gene Operon, and Microsatellite Discovery

Genes ◽  
2021 ◽  
Vol 12 (11) ◽  
pp. 1710
Author(s):  
J. Antonio Baeza ◽  
José Luis Molina-Quirós ◽  
Sebastián Hernández-Muñoz

The ‘Pez Gallo’ or Roosterfish, Nematistius pectoralis, is an ecologically relevant species in shallow-water soft-bottom environments and a target of a highly lucrative recreational sport fishery in the Central Eastern Pacific Ocean. According to the International Union for Conservation of Nature, N. pectoralis is assessed globally as Data Deficient. Using low-coverage Illumina 300 bp paired-end short-read sequencing, this study reports, for the first time, the genome size, single/low-copy genome content, and nuclear repetitive elements, including the 45S ribosomal DNA operon and microsatellites, of N. pectoralis. The haploid genome size estimated using a k-mer approach was 816.04 Mbp, which is within the range previously reported for other representatives of the order Carangiformes. Single/low-copy genome content (63%) was relatively high. A large portion of repetitive sequences could not be assigned to known repeat element families. Considering only annotated repetitive elements, the most common were classified as Satellite DNA, which was considerably more abundant than Class I-Long Interspersed Nuclear Elements and Class I-LTR Retroviral elements. The nuclear ribosomal operon in N. pectoralis consists of, in the following order, a 5′ ETS (length = 948 bp), ssrDNA (1835 bp), ITS1 (724 bp), a 5.8S rDNA (158 bp), ITS2 (508 bp), lsrDNA (3924 bp), and a 3′ ETS (32 bp). A total of 44 SSRs were identified. These newly developed genomic resources are relevant for improving the understanding of the species' biology, developing conservation plans, and managing the fishery of the iconic N. pectoralis.
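
As a rough illustration of the k-mer approach mentioned in the abstract above, the sketch below estimates genome size as the total number of k-mers divided by the depth of the main histogram peak. The function names, the `min_depth` cutoff for discarding likely sequencing-error k-mers, and the toy histogram are assumptions made for illustration; the software and parameters actually used in the study are not stated here.

```python
from collections import Counter


def kmer_histogram(reads, k):
    """Count every k-mer in the reads, then tally how many distinct
    k-mers occur at each depth (depth -> number of distinct k-mers)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return Counter(counts.values())


def estimate_genome_size(histogram, min_depth=2):
    """Estimate haploid genome size as (total k-mers) / (peak depth).

    K-mers seen fewer than `min_depth` times are treated as likely
    sequencing errors and ignored (an assumed, simplistic cutoff).
    """
    kept = {d: n for d, n in histogram.items() if d >= min_depth}
    if not kept:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(kept, key=kept.get)             # depth with most distinct k-mers
    total_kmers = sum(d * n for d, n in kept.items())
    return total_kmers / peak_depth


# Toy histogram (hypothetical numbers): an error peak at depth 1
# and a main peak at depth 10.
toy_histogram = {1: 500, 9: 10, 10: 80, 11: 10}
print(estimate_genome_size(toy_histogram))  # -> 100.0
```

Dedicated tools fit a statistical model to the histogram rather than reading off a single peak, so this should be taken only as the underlying arithmetic.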

PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e10554
Author(s):  
J. Antonio Baeza

Background: Panulirus argus is an ecologically relevant species in shallow water hard-bottom environments and coral reefs and a target of the most lucrative fishery in the greater Caribbean region. Methods: This study reports, for the first time, the genome size and nuclear repetitive elements, including the 45S ribosomal DNA operon, 5S unit, and microsatellites, of P. argus. Results: Using a k-mer approach, the average haploid genome size estimated for P. argus was 2.17 Gbp. Repetitive elements comprised 69.02% of the nuclear genome. In turn, 30.98% of the genome represented low- or single-copy sequences. A considerable proportion of repetitive sequences could not be assigned to known repeat element families. Taking into account only annotated repetitive elements, the most frequent belonged to Class I-LINE, which were noticeably more abundant than Class I-LTR-Ty-3/Gypsy, Class I-LTR-Penelope, and Class I-LTR-Ty-3/Bel-Pao elements. Satellite DNA was also abundant. The ribosomal operon in P. argus comprises, in the following order, a 5′ ETS (length = 707 bp), ssrDNA (1,875 bp), ITS1 (736 bp), 5.8S rDNA (162 bp), ITS2 (1,314 bp), lsrDNA (5,387 bp), and 3′ ETS (287 bp). A total of 1,281 SSRs were identified.


2020 ◽  
Author(s):  
Aleksandra Beric ◽  
Makenzie E Mabry ◽  
Alex E Harkess ◽  
M. Eric Schranz ◽  
Gavin C Conant ◽  
...  

Genome size of plants has long piqued the interest of researchers due to the vast differences among organisms. However, the mechanisms that drive size differences have yet to be fully understood. Two important contributing factors to genome size are expansions of repetitive elements, such as transposable elements (TEs), and whole-genome duplications (WGD). Although studies have found correlations between genome size and both TE abundance and polyploidy, these studies typically test for these patterns within a genus or species. The plant order Brassicales provides an excellent system to test if genome size evolution patterns are consistent across larger time scales, as there are numerous WGDs. This order is also home to one of the smallest plant genomes, Arabidopsis thaliana - chosen as the model plant system for this reason - as well as to species with very large genomes. With new methods that allow for TE characterization from low-coverage genome shotgun data and 71 taxa across the Brassicales, we find no correlation between genome size and TE content, and more surprisingly we identify no significant changes to TE landscape following WGD.


Author(s):  
Aleksandra Beric ◽  
Makenzie E Mabry ◽  
Alex E Harkess ◽  
Julia Brose ◽  
M Eric Schranz ◽  
...  

Genome sizes of plants have long piqued the interest of researchers due to the vast differences among organisms. However, the mechanisms that drive size differences have yet to be fully understood. Two important contributing factors to genome size are expansions of repetitive elements, such as transposable elements (TEs), and whole-genome duplications (WGD). Although studies have found correlations between genome size and both TE abundance and polyploidy, these studies typically test for these patterns within a genus or species. The plant order Brassicales provides an excellent system to further test if genome size evolution patterns are consistent across larger time scales, as there are numerous WGDs. This order is also home to one of the smallest plant genomes, Arabidopsis thaliana (chosen as the model plant system for this reason), as well as to species with very large genomes. With new methods that allow for TE characterization from low-coverage genome shotgun data and 71 taxa across the Brassicales, we confirm a correlation between genome size and TE content; however, we are unable to reconstruct phylogenetic relationships and do not detect any shift in TE abundance associated with WGD.


Forests ◽  
2019 ◽  
Vol 10 (10) ◽  
pp. 826 ◽  
Author(s):  
Sui Wang ◽  
Su Chen ◽  
Caixia Liu ◽  
Yi Liu ◽  
Xiyang Zhao ◽  
...  

Research Highlights: A rigorous genome survey helped us to estimate the genomic characteristics, remove the DNA contamination, and determine the sequencing scheme of Betula platyphylla. Background and Objectives: B. platyphylla is a common tree species in northern China that has high economic and medicinal value. However, there is a lack of complete genomic information for this species, which severely constrains the progress of relevant research. The objective of this study was to survey the genome of B. platyphylla and determine the large-scale sequencing scheme of this species. Materials and Methods: Next-generation sequencing was used to survey the genome. The genome size, heterozygosity rate, and repetitive sequences were estimated by k-mer analysis. After preliminary genome assembly, sequence contamination was identified and filtered by sequence alignment. Finally, we obtained sterilized plantlets of B. platyphylla by plant tissue culture, which can be used for third-generation sequencing. Results: We estimated the genome size to be 432.9 Mb and the heterozygosity rate to be 1.22%, with repetitive sequences accounting for 62.2%. Bacterial contamination was observed in the leaves taken from the field, and most of the contaminants may be from the genus Mycobacterium. A total of 249,784 simple sequence repeat (SSR) loci were also identified in the B. platyphylla genome. Among the SSRs, only 11,326 can be used as candidates to distinguish the three Betula species. Conclusions: The B. platyphylla genome is complex and highly heterozygous and repetitive. Higher-depth third-generation sequencing may yield better assembly results. Sterilized plantlets can be used for sequencing to avoid contamination.
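
To make the idea of SSR (microsatellite) detection concrete, here is a minimal sketch that scans a sequence for perfect tandem repeats with motifs of 1 to 6 bp. The function name and the minimum-repeat thresholds are assumptions chosen for illustration, not the settings or software used in the study above.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    `min_repeats` maps motif length -> minimum number of tandem copies;
    the defaults below are assumed values, not taken from the abstract.
    """
    if min_repeats is None:
        min_repeats = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
    seq = seq.upper()
    hits = []
    for motif_len, n in sorted(min_repeats.items()):
        # One captured motif followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated
            # (e.g. "AA" reported as a dinucleotide repeat).
            if any(
                motif == motif[:d] * (motif_len // d)
                for d in range(1, motif_len)
                if motif_len % d == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return sorted(hits)


# Hypothetical test sequence: (AC)7 at position 2 and (A)12 at position 18.
example = "GG" + "AC" * 7 + "TT" + "A" * 12 + "CGT"
print(find_ssrs(example))  # -> [(2, 'AC', 7), (18, 'A', 12)]
```

This only finds perfect repeats; compound or interrupted microsatellites would need additional logic.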


2020 ◽  
Vol 12 (7) ◽  
pp. 1180-1193
Author(s):  
Abhijeet Shah ◽  
Joseph I Hoffman ◽  
Holger Schielzeth

Eukaryotic organisms vary widely in genome size and much of this variation can be explained by differences in the abundance of repetitive elements. However, the phylogenetic distributions and turnover rates of repetitive elements are largely unknown, particularly for species with large genomes. We therefore used de novo repeat identification based on low coverage whole-genome sequencing to characterize the repeatomes of six species of gomphocerine grasshoppers, an insect clade characterized by unusually large and variable genome sizes. Genome sizes of the six species ranged from 8.4 to 14.0 pg DNA per haploid genome and thus include the second largest insect genome documented so far (with the largest being another acridid grasshopper). Estimated repeat content ranged from 79% to 96% and was strongly correlated with genome size. Averaged over species, these grasshopper repeatomes comprised significant amounts of DNA transposons (24%), LINE elements (21%), helitrons (13%), LTR retrotransposons (12%), and satellite DNA (8.5%). The contribution of satellite DNA was particularly variable (ranging from <1% to 33%) as was the contribution of helitrons (ranging from 7% to 20%). The age distribution of divergence within clusters was unimodal with peaks ∼4–6%. The phylogenetic distribution of repetitive elements was suggestive of an expansion of satellite DNA in the lineages leading to the two species with the largest genomes. Although speculative at this stage, we suggest that the expansion of satellite DNA could be secondary and might possibly have been favored by selection as a means of stabilizing greatly expanded genomes.


Genome ◽  
2013 ◽  
Vol 56 (9) ◽  
pp. 487-494 ◽  
Author(s):  
Kate L. Hertweck

The research field of comparative genomics is moving from a focus on genes to a more holistic view including the repetitive complement. This study aimed to characterize relative proportions of the repetitive fraction of large, complex genomes in a nonmodel system. The monocotyledonous plant order Asparagales (onion, asparagus, agave) comprises some of the largest angiosperm genomes and represents variation in both genome size and structure (karyotype). Anonymous, low coverage, single-end Illumina data from 11 exemplar Asparagales taxa were assembled using a de novo method. Resulting contigs were annotated using a reference library of available monocot repetitive sequences. Mapping reads to contigs provided rough estimates of relative proportions of each type of transposon in the nuclear genome. The results were parsed into general repeat types and synthesized with genome size estimates and a phylogenetic context to describe the pattern of transposable element evolution among these lineages. The major finding is that although some lineages in Asparagales exhibit conservation in repeat proportions, there is generally wide variation in types and frequency of repeats. This approach is an appropriate first step in characterizing repeats in evolutionary lineages with a paucity of genomic resources.


2020 ◽  
Author(s):  
Jing Li ◽  
Meiqi Lv ◽  
Lei Du ◽  
A Yunga ◽  
Shijie Hao ◽  
...  

The monocot family Melanthiaceae, whose genome sizes vary over a 230-fold range, is an ideal model for studying genome size fluctuation in plants. The genus Paris, a member of this family, shows an evolutionary trend toward huge genomes, with an average C-value of 49.22 pg. Here, we report a 70.18 Gb genome assembly out of the 82.55 Gb genome of Paris polyphylla var. yunnanensis (PPY), which represents the biggest sequenced genome to date. We annotate 69.53% of this genome as repetitive sequences, 62.50% of which are long-terminal repeat (LTR) transposable elements. Further evolutionary analysis indicates that the giant genome likely results from the joint effect of common and species-specific expansion of different LTR superfamilies, which might contribute to environmental adaptation after speciation. Moreover, we identify candidate pathway genes for the biogenesis of polyphyllins, the PPY-specific medicinal saponins, by complementary approaches including genome mining, comprehensive analysis of 31 next-generation RNA-seq data sets and 55.23 Gb of single-molecule circular consensus sequencing (CCS) RNA-seq reads, and correlation of the transcriptome and phytochemical data of five different tissues at four growth stages. This study not only provides significant insights into plant genome size evolution, but also paves the way for future polyphyllin synthetic biology.


2019 ◽  
Vol 39 (6) ◽  
Author(s):  
Guo-qi Li ◽  
Li-xiao Song ◽  
Chang-qing Jin ◽  
Miao Li ◽  
Shi-pei Gong ◽  
...  

Apocynum venetum is an eco-economic plant that exhibits high stress resistance. In the present paper, we carried out a whole-genome survey of A. venetum to provide a foundation for its whole-genome sequencing. High-throughput sequencing technology (Illumina NovaSeq) was used to sequence A. venetum, and bioinformatics methods were then employed to evaluate genome size, heterozygosity ratio, repeated sequences, and GC content. The sequencing analysis results indicated that the preliminary estimated genome size of A. venetum was 254.40 Mbp, and its heterozygosity ratio and percentage of repeated sequences were 0.63 and 40.87%, respectively, indicating that it has a complex genome. A preliminary assembly with k-mer = 41 yielded a contig N50 of 3841 bp with a total length of 223,949,699 bp. Further assembly yielded a scaffold N50 of 6196 bp with a total length of 227,322,054 bp. We performed simple sequence repeat (SSR) molecular marker prediction based on the A. venetum genome data and identified a total of 101,918 SSRs. The different types of nucleotide repeats differed greatly in number, with mononucleotide repeats being the most numerous and hexanucleotide repeats the least. We recommend using the ‘2+3’ (Illumina+PacBio) sequencing combination to supplement Hi-C and resequencing techniques in future whole-genome research on A. venetum.
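
The contig and scaffold N50 values reported above summarize assembly contiguity. As a minimal sketch of how such a statistic is commonly computed (the function name and the example lengths are hypothetical, and the assembler's own reporting may differ in edge cases):

```python
def n50(lengths):
    """Return the length L such that sequences of length >= L together
    cover at least half of the total assembly length."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer-safe check for "at least half"
            return length


# Hypothetical contig lengths in bp (total = 32, half = 16).
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))  # -> 8
```

A short N50 relative to genome size, as in a survey-stage assembly, indicates a fragmented assembly that long-read data would be expected to improve.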


2021 ◽  
Author(s):  
Ava Louise Haley ◽  
Rachel Lockridge Mueller

Transposable elements (TEs) are repetitive sequences of DNA that replicate and proliferate throughout genomes. Taken together, all the TEs in a genome form a diverse community of sequences, which can be studied to draw conclusions about genome evolution. TE diversity can be measured using models for ecological community diversity that consider species richness and evenness. Several models predict TE diversity decreasing as genomes expand because of selection against ectopic recombination and/or competition among TEs to garner host replicative machinery and evade host silencing mechanisms. Salamanders have some of the largest vertebrate genomes and highest TE loads. Salamanders of the genus Plethodon, in particular, have genomes that range in size from 20 to 70 Gb. Here, we use Oxford Nanopore sequencing to generate low-coverage genomic sequences for four species of Plethodon that encompass two independent genome expansion events, one in the eastern clade (P. cinereus, 29.3 Gb vs. P. glutinosus, 38.9 Gb) and one in the western clade (P. vehiculum, 46.4 Gb vs. P. idahoensis, 67.0 Gb). We classified the TEs in these genomes and found ~52 TE superfamilies, accounting for 27-32% of the genomes. We calculated Simpson’s and Shannon’s diversity indices to quantify overall TE diversity. In both pairwise comparisons, the diversity index values for the smaller and larger genome were almost identical. This result indicates that, when genomes reach extremely large sizes, they maintain high levels of TE diversity at the superfamily level, in contrast to predictions made by previous studies on smaller genomes.
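
For readers unfamiliar with the diversity indices named above, the sketch below computes Shannon's index and one common form of Simpson's index from per-superfamily abundance counts. The function names and the example counts are hypothetical, and the abstract does not say which variant of Simpson's index was used; the Gini-Simpson form (1 minus the sum of squared proportions) is assumed here.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over non-zero categories."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_index(counts):
    """Gini-Simpson diversity 1 - sum(p_i^2) (assumed variant)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Hypothetical read counts for four TE superfamilies of equal abundance.
even_community = [25, 25, 25, 25]
print(shannon_index(even_community))  # equals ln(4) for four even categories
print(simpson_index(even_community))  # -> 0.75
```

Both indices rise with the number of superfamilies and with how evenly reads are spread among them, which is why near-identical values in two genomes suggest similar TE community structure.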

