Assembly and comparative analysis of transposable elements from low coverage genomic sequence data in Asparagales

Genome ◽  
2013 ◽  
Vol 56 (9) ◽  
pp. 487-494 ◽  
Author(s):  
Kate L. Hertweck

The research field of comparative genomics is moving from a focus on genes to a more holistic view including the repetitive complement. This study aimed to characterize relative proportions of the repetitive fraction of large, complex genomes in a nonmodel system. The monocotyledonous plant order Asparagales (onion, asparagus, agave) comprises some of the largest angiosperm genomes and represents variation in both genome size and structure (karyotype). Anonymous, low coverage, single-end Illumina data from 11 exemplar Asparagales taxa were assembled using a de novo method. Resulting contigs were annotated using a reference library of available monocot repetitive sequences. Mapping reads to contigs provided rough estimates of relative proportions of each type of transposon in the nuclear genome. The results were parsed into general repeat types and synthesized with genome size estimates and a phylogenetic context to describe the pattern of transposable element evolution among these lineages. The major finding is that although some lineages in Asparagales exhibit conservation in repeat proportions, there is generally wide variation in types and frequency of repeats. This approach is an appropriate first step in characterizing repeats in evolutionary lineages with a paucity of genomic resources.
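The read-mapping step described in this abstract can be illustrated with a minimal sketch. It assumes each read has already been assigned to at most one contig and each annotated contig to one repeat type. The function name, the input dictionaries, and the example labels are hypothetical and are not taken from the study's pipeline.

```python
from collections import Counter

def repeat_proportions(read_to_contig, contig_to_type, total_reads):
    """Return the fraction of all reads mapping to contigs of each repeat type.

    read_to_contig: dict of read ID -> contig ID (mapped reads only; assumed input)
    contig_to_type: dict of contig ID -> repeat type label (annotated contigs only)
    total_reads:    total number of reads sequenced, mapped or not
    """
    counts = Counter()
    for contig_id in read_to_contig.values():
        repeat_type = contig_to_type.get(contig_id)
        if repeat_type is not None:  # skip contigs with no repeat annotation
            counts[repeat_type] += 1
    return {repeat_type: n / total_reads for repeat_type, n in counts.items()}

# Hypothetical toy data: 10 reads sequenced, 5 of them mapped.
example = repeat_proportions(
    {"r1": "c1", "r2": "c1", "r3": "c2", "r4": "c3", "r5": "c2"},
    {"c1": "LTR/Gypsy", "c2": "LTR/Copia"},  # c3 is unannotated
    total_reads=10,
)
```

Dividing by the total number of reads, rather than by mapped reads only, treats each proportion as a rough estimate of genome fraction, which is only reasonable if reads sample the genome approximately uniformly.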

2019 ◽  
Author(s):  
James M. Pflug ◽  
Valerie Renee Holmes ◽  
Crystal Burrus ◽  
J. Spencer Johnston ◽  
David R. Maddison

Measuring genome size across different species can yield important insights into evolution of the genome and allow for more informed decisions when designing next-generation genomic sequencing projects. New techniques for estimating genome size using shallow genomic sequence data have emerged which have the potential to augment our knowledge of genome sizes, yet these methods have only been used in a limited number of empirical studies. In this project, we compare estimation methods using next-generation sequencing (k-mer methods and average read depth of single-copy genes) to measurements from flow cytometry, the gold standard for genome size measures, using ground beetles (Carabidae) and other members of the beetle suborder Adephaga as our test system. We also present a new protocol for using read-depth of single-copy genes to estimate genome size. Additionally, we report flow cytometry measurements for five previously unmeasured carabid species, as well as 21 new draft genomes and six new draft transcriptomes across eight species of adephagan beetles. No single sequence-based method performed well on all species, and all tended to underestimate the genome sizes, although only slightly in most samples. For one species, Bembidion haplogonum, most sequence-based methods yielded estimates half the size suggested by flow cytometry. This discrepancy for k-mer methods can be explained by a large number of repetitive sequences, but we have no explanation for why read-depth methods yielded results that were also strikingly low.
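The two sequence-based approaches named in this abstract rest on simple ratios, sketched below under stated assumptions. This is not the authors' protocol: the function names, the `min_depth` cutoff for discarding likely error k-mers, and the toy numbers are all hypothetical.

```python
from statistics import mean

def genome_size_from_kmers(kmer_histogram, min_depth=5):
    """Estimate genome size as (total k-mers) / (depth of the main peak).

    kmer_histogram: dict of depth -> number of distinct k-mers seen at that depth.
    min_depth:      depths below this are treated as sequencing errors (assumption).
    """
    kept = {d: n for d, n in kmer_histogram.items() if d >= min_depth}
    peak_depth = max(kept, key=kept.get)          # depth with the most distinct k-mers
    total_kmers = sum(d * n for d, n in kept.items())
    return total_kmers / peak_depth

def genome_size_from_read_depth(total_bases, single_copy_depths):
    """Estimate genome size as (total sequenced bases) / (mean depth of single-copy genes)."""
    return total_bases / mean(single_copy_depths)

# Hypothetical toy inputs.
kmer_estimate = genome_size_from_kmers(
    {1: 1000, 2: 50, 19: 100, 20: 500, 21: 100, 40: 20}
)
depth_estimate = genome_size_from_read_depth(3_000_000, [9, 10, 11])
```

Both estimates depend on the denominator reflecting true single-copy coverage, so anything that distorts the peak or the per-gene depth shifts the result proportionally.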


PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e3982 ◽  
Author(s):  
RuiJuan Feng ◽  
Xin Wang ◽  
Min Tao ◽  
Guanchao Du ◽  
Qishuo Wang

Vallisneria spinulosa is a freshwater aquatic plant of ecological and economic importance. However, there is limited cytogenetic and genomics information on Vallisneria. In this study, we measured the nuclear DNA content of Vallisneria spinulosa by flow cytometry, performed a de novo assembly, and annotated repetitive sequences by using a combination of next-generation sequencing (NGS) and bioinformatics tools. The genome size of Vallisneria spinulosa is approximately 3,595 Mbp, in which nearly 60% of the genome consists of repetitive sequences. The majority of the repetitive sequences are LTR-retrotransposons comprising 43% of the genome. Although the amount of sequencing data used in this study was not sufficient for a whole-genome assembly, it could generate an overview of representative elements in the genome. These results will lay a new foundation for further studies on various species that belong to the Vallisneria genus.


1999 ◽  
Vol 45 (7) ◽  
pp. 565-572 ◽  
Author(s):  
Todd Christian ◽  
Diana M Downs

As genomic sequence data become more prevalent, the challenges in microbial physiology shift from identifying biochemical pathways to understanding the interactions that occur between them to create a robust but responsive metabolism. One of the most powerful methods to identify such interactions is in vivo phenotypic analysis. We have utilized thiamine synthesis as a model to detect subtle metabolic interactions due to the sensitivity allowed by the small cellular requirement for this vitamin. Although purine biosynthesis produces an intermediate in thiamine synthesis, mutants blocked in the first step of de novo purine biosynthesis (PurF) are able to grow in the absence of thiamine owing to an alternative synthesis. A number of general metabolic defects have been found to prevent PurF-independent thiamine synthesis. Here we report stimulation of thiamine-independent growth caused by a mutation in one or both genes encoding the pyruvate kinase isozymes. The results presented herein represent the first phenotype described for mutants defective in pykA or pykF, and thus identify metabolic interactions that exist in vivo.

Key words: thiamine synthesis, metabolic integration.


2008 ◽  
Vol 19 (2) ◽  
pp. 294-305 ◽  
Author(s):  
J. A. Reinhardt ◽  
D. A. Baltrus ◽  
M. T. Nishimura ◽  
W. R. Jeck ◽  
C. D. Jones ◽  
...  

Genes ◽  
2021 ◽  
Vol 12 (11) ◽  
pp. 1710
Author(s):  
J. Antonio Baeza ◽  
José Luis Molina-Quirós ◽  
Sebastián Hernández-Muñoz

The ‘Pez Gallo’ or the Roosterfish, Nematistius pectoralis, is an ecologically relevant species in the shallow water soft-bottom environments and a target of a most lucrative recreational sport fishery in the Central Eastern Pacific Ocean. According to the International Union for Conservation of Nature, N. pectoralis is assessed globally as Data Deficient. Using low-coverage sequencing of short Illumina 300 bp paired-end reads, this study reports, for the first time, the genome size, single/low-copy genome content, and nuclear repetitive elements, including the 45S rRNA DNA operon and microsatellites, in N. pectoralis. The haploid genome size estimated using a k-mer approach was 816.04 Mbp, which is within the range previously reported for other representatives of the Carangiformes order. Single/low-copy genome content (63%) was relatively high. A large portion of repetitive sequences could not be assigned to the known repeat element families. Considering only annotated repetitive elements, the most common were classified as Satellite DNA, which were considerably more abundant than Class I-Long Interspersed Nuclear Elements and Class I-LTR Retroviral elements. The nuclear ribosomal operon in N. pectoralis consists of, in the following order: a 5′ ETS (length = 948 bp), ssrDNA (1835 bp), ITS1 (724 bp), a 5.8S rDNA (158 bp), ITS2 (508 bp), lsrDNA (3924 bp), and a 3′ ETS (32 bp). A total of 44 SSRs were identified. These newly developed genomic resources are most relevant for improving the understanding of biology, developing conservation plans, and managing the fishery of the iconic N. pectoralis.
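The component lengths reported for the ribosomal operon can be added up to get an overall length. This is a small worked sketch that assumes the seven components are contiguous and non-overlapping. The total is derived here from the listed values and is not a figure stated in the abstract; the variable names are illustrative.

```python
# Component lengths in bp, in the 5' to 3' order given in the abstract.
operon_components = [
    ("5' ETS", 948),
    ("ssrDNA", 1835),
    ("ITS1", 724),
    ("5.8S rDNA", 158),
    ("ITS2", 508),
    ("lsrDNA", 3924),
    ("3' ETS", 32),
]

# Total operon length, assuming no gaps or overlaps between components.
operon_length = sum(length for _, length in operon_components)

# Share of the operon occupied by the three rRNA-coding regions.
coding = {"ssrDNA", "5.8S rDNA", "lsrDNA"}
coding_length = sum(length for name, length in operon_components if name in coding)
coding_fraction = coding_length / operon_length

print(operon_length)  # 8129
```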


2017 ◽  
Author(s):  
Teri Evans ◽  
Andrew Johnson ◽  
Matt Loose

Large repeat rich genomes present challenges for assembly and identification of gene models with short read technologies. Here we present a method we call Virtual Genome Walking which uses an iterative assembly approach to first identify exons from de-novo assembled transcripts and assemble whole genome reads against each exon. This process is iterated allowing the extension of exons. These linked assemblies are refined to generate gene models including upstream and downstream genomic sequence as well as intronic sequence. We test this method using a 20X genomic read set for the axolotl, the genome of which is estimated to be 30 Gb in size. These reads were previously reported to be effectively impossible to assemble. Here we provide almost 1 Gb of assembled sequence describing over 19,000 gene models for the axolotl. Gene models stop assembling either due to localised low coverage in the genomic reads, or the presence of repeats. We validate our observations by comparison with previously published axolotl bacterial artificial chromosome (BAC) sequences. In addition we analysed axolotl intron length, intron-exon structure, repeat content and synteny. These gene-models, sequences and annotations are freely available for download from https://tinyurl.com/y8gydc6n. The software pipeline including a docker image is available from https://github.com/LooseLab/iterassemble. These methods will increase the value of low coverage sequencing of understudied model systems.


2021 ◽  
Author(s):  
Jacqueline Heckenhauer ◽  
Paul B. Frandsen ◽  
John S. Sproul ◽  
Zheng Li ◽  
Juraj Paule ◽  
...  

Genome size can vary widely over relatively short evolutionary time scales and is implicated in form, function and ecological success of a species. Here, we generated 17 new de novo whole genome assemblies and present a holistic view on genome size diversity of the highly diversified, non-model insect order, Trichoptera (caddisflies). We detect large variation in genome size and find strong evidence that transposable element (TE) expansions are the primary driver of genome size evolution: TE expansions contribute to larger genomes in clades with higher ecological diversity and have a major impact on protein-coding gene regions. These TE-gene associations show a linear relationship with increasing genome size. Our findings suggest new hypotheses for future testing, especially the effects of TE activity and TE-gene associations on genome stability, gene expression, phenotypes, and their potential adaptive advantages in groups with high species, ecological, and functional diversities.


2014 ◽  
Vol 13 (2) ◽  
pp. 142-152 ◽  
Author(s):  
Alexandra Marina Gottlieb ◽  
Lidia Poggio

The development of modern approaches to the genetic improvement of the tree crops Ilex paraguariensis (‘yerba mate’) and Ilex dumosa (‘yerba señorita’) is halted by the scarcity of basic genetic information. In this study, we characterized the implementation of low-cost methodologies such as representational difference analysis (RDA), single-strand conformation polymorphisms (SSCP), and reverse and direct dot-blot filter hybridization assays coupled with thorough bioinformatic characterization of sequence data for both species. Also, we estimated the genome size of each species using flow cytometry. This study contributes to the better understanding of the genetic differences between two cultivated species, by generating new quantitative and qualitative genome-level data. Using the RDA technique, we isolated a group of non-coding repetitive sequences, tentatively considered as Ilex-specific, which were 1.21- to 39.62-fold more abundant in the genome of I. paraguariensis. Another group of repetitive DNA sequences involved retrotransposons, which appeared 1.41- to 35.77-fold more abundantly in the genome of I. dumosa. The genomic DNA of each species showed different performances in filter hybridizations: while I. paraguariensis showed a high intraspecific affinity, I. dumosa exhibited a higher affinity for the genome of the former species (i.e. interspecific). These differences could be attributed to the occurrence of homologous but slightly divergent repetitive DNA sequences, highly amplified in the genome of I. paraguariensis but not in the genome of I. dumosa. Additionally, our hybridization outcomes suggest that the genomes of both species have less than 80% similarity. Moreover, for the first time, we report herein a genome size estimate of 1670 Mbp for I. paraguariensis and that of 1848 Mbp for I. dumosa.


2020 ◽  
Vol 12 (9) ◽  
pp. 1504-1514
Author(s):  
Corinna A Pinzari ◽  
Lin Kang ◽  
Pawel Michalak ◽  
Lars S Jermiin ◽  
Donald K Price ◽  
...  

We examine the genetic history and population status of Hawaiian hoary bats (Lasiurus semotus), the most isolated bats on Earth, and their relationship to northern hoary bats (Lasiurus cinereus), through whole-genome analysis of single-nucleotide polymorphisms mapped to a de novo-assembled reference genome. Profiles of genomic diversity and divergence indicate that Hawaiian hoary bats are distinct from northern hoary bats, and form a monophyletic group, indicating a single ancestral colonization event 1.34 Ma, followed by substantial divergence between islands beginning 0.51 Ma. Phylogenetic analysis indicates Maui is central to the radiation across the archipelago, with the southward expansion to Hawai‘i and westward to O‘ahu and Kaua‘i. Because this endangered species is of conservation concern, a clearer understanding of the population genetic structure of this bat in the Hawaiian Islands is of timely importance.

