Mosquito genomes are frequently invaded by transposable elements through horizontal transfer

PLoS Genetics ◽  
2020 ◽  
Vol 16 (11) ◽  
pp. e1008946
Author(s):  
Elverson Soares de Melo ◽  
Gabriel Luz Wallau

Transposable elements (TEs) are mobile genetic elements that parasitize the genomes of virtually all eukaryotic species. Due to their complexity, an in-depth TE characterization is only available for a handful of model organisms. In the present study, we performed a de novo and homology-based characterization of TEs in the genomes of 24 mosquito species and investigated their mode of inheritance. More than 40% of the genome of Aedes aegypti, Aedes albopictus, and Culex quinquefasciatus is composed of TEs, while TE content varied substantially among Anopheles species (0.13%–19.55%). Class I TEs are the most abundant among mosquitoes and at least 24 TE superfamilies were found. Interestingly, TEs have been extensively exchanged by horizontal transfer (172 TE families of 16 different superfamilies) among mosquitoes in the last 30 million years. Horizontally transferred TEs represent around 7% of the genome in Aedes species and a small fraction in Anopheles genomes. Most of these horizontally transferred TEs are from the three ubiquitous LTR superfamilies: Gypsy, Bel-Pao and Copia. Searching more than 32,000 genomes, we also uncovered transfers between mosquitoes and two different phyla (Cnidaria and Nematoda) and two subphyla (Chelicerata and Crustacea), identifying a vector, the worm Wuchereria bancrofti, that enabled the horizontal spread of a Tc1-mariner element among various Anopheles species. These data also allowed us to reconstruct the horizontal transfer network of this TE involving more than 40 species. In summary, our results suggest that TEs are frequently exchanged by horizontal transfers among mosquitoes, influencing mosquito genome size and variability.
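Genome fractions such as the ones reported above are typically derived from TE annotation coordinates. The following is a minimal sketch of that arithmetic, not the authors' pipeline: it assumes annotations are available as half-open `(start, end)` intervals per contig, and the names `merge_intervals`, `te_fraction`, and the toy data are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of double counting bases.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def te_fraction(annotations, genome_size):
    """Return the percentage of the genome covered by TE annotations.

    annotations: dict mapping contig name -> list of (start, end) intervals.
    genome_size: total assembly length in base pairs.
    """
    covered = 0
    for contig_intervals in annotations.values():
        covered += sum(e - s for s, e in merge_intervals(contig_intervals))
    return 100.0 * covered / genome_size


# Toy annotation (hypothetical): two overlapping hits on contig1 count once.
toy = {
    "contig1": [(0, 100), (50, 200), (300, 400)],
    "contig2": [(10, 60)],
}
print(te_fraction(toy, genome_size=1000))
```

Merging before summing matters because overlapping or nested TE hits would otherwise inflate the covered length.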

Author(s):  
Elverson Soares de Melo ◽  
Gabriel da Luz Wallau

Abstract: Transposable elements (TEs) are a set of mobile elements within a genome. Due to their complexity, an in-depth TE characterization is only available for a handful of model organisms. In the present study, we performed a de novo and homology-based characterization of TEs in the genomes of 24 mosquito species and investigated their mode of inheritance. More than 40% of the genome of Aedes aegypti, Aedes albopictus, and Culex quinquefasciatus is composed of TEs, while TE content varies substantially among Anopheles species (0.13%–19.55%). Class I TEs are the most abundant among mosquitoes and at least 24 TE superfamilies were found. Interestingly, TEs have been continuously exchanged by horizontal transfer (212 TE families of 18 different superfamilies) among mosquitoes over the last 30 million years, representing around 6% of the genome in Aedes species and a small fraction in Anopheles genomes. Most of these horizontally transferred TEs are from the three ubiquitous LTR superfamilies: Gypsy, Bel-Pao and Copia. Searching more than 32,000 genomes, we also uncovered transfers between mosquitoes and two different phyla (Cnidaria and Nematoda) and two subphyla (Chelicerata and Crustacea), identifying a vector, the worm Wuchereria bancrofti, that enabled the horizontal spread of a Tc1-mariner element of the irritans subfamily among various Anopheles species. These data also allowed us to reconstruct the horizontal transfer network of this TE involving more than 40 species. In summary, our results suggest that TEs are frequently exchanged through horizontal transfer among mosquitoes, influencing genome variation and contributing to genome size expansion.

Author Summary: Most eukaryotes have DNA fragments inside their genome that can multiply by inserting themselves in other regions of the genome, generating variability. These fragments are called transposable elements (TEs). Since they are a constituent part of eukaryotic genomes, these pieces of DNA are usually inherited vertically by the offspring. To avoid damage to the genome caused by the replication and insertion of TEs, organisms usually control them, leading to their inactivation. However, TEs sometimes escape this control and invade other species through horizontal transfer. This dynamic was not known in mosquitoes, a group of organisms that act as vectors of many human diseases. We collected mosquito genomes available in public databases and characterized their whole TE content. Using a statistically supported method, we investigated TE relationships among mosquitoes and discovered that horizontal transfers of transposons are common and have occurred over the last 30 million years among these species. Although not as common as transfers among closely related species, transposon transfer to distant species also occurs. We also identified a parasite, a filarial worm, that may have facilitated the transfer of a TE to many mosquitoes. Together, horizontally transferred TEs contribute to increasing mosquito genome size and variation.


2019 ◽  
Author(s):  
Jullien M. Flynn ◽  
Robert Hubley ◽  
Clément Goubert ◽  
Jeb Rosen ◽  
Andrew G. Clark ◽  
...  

Abstract: The accelerating pace of genome sequencing throughout the tree of life is driving the need for improved unsupervised annotation of genome components such as transposable elements (TEs). Because the types and sequences of TEs are highly variable across species, automated TE discovery and annotation are challenging and time-consuming tasks. A critical first step is the de novo identification and accurate compilation of sequence models representing all the unique TE families dispersed in the genome. Here we introduce RepeatModeler2, a new pipeline that greatly facilitates this process. This new program brings substantial improvements over the original version of RepeatModeler, one of the most widely used tools for TE discovery. In particular, this version incorporates a module for structural discovery of complete LTR retroelements, which are widespread in eukaryotic genomes but recalcitrant to automated identification because of their size and sequence complexity. We benchmarked RepeatModeler2 on three model species with diverse TE landscapes and high-quality, manually curated TE libraries: Drosophila melanogaster (fruit fly), Danio rerio (zebrafish), and Oryza sativa (rice). In these three species, RepeatModeler2 identified approximately three times more consensus sequences matching with >95% sequence identity and sequence coverage to the manually curated sequences than the original RepeatModeler. As expected, the greatest improvement is for LTR retroelements. The program had an extremely low false positive rate when applied to simulated genomes devoid of TEs. Thus, RepeatModeler2 represents a valuable addition to the genome annotation toolkit that will enhance the identification and study of TEs in eukaryotic genome sequences. RepeatModeler2 is available as source code or a containerized package under an open license (https://github.com/Dfam-consortium/RepeatModeler, https://github.com/Dfam-consortium/TETools).

Significance: Genome sequences are being produced for more and more eukaryotic species. The bulk of these genomes is composed of parasitic, self-mobilizing transposable elements (TEs) that play important roles in organismal evolution. Thus there is a pressing need for developing software that can accurately identify the diverse set of TEs dispersed in genome sequences. Here we introduce RepeatModeler2, an easy-to-use package for the curation of reference TE libraries which can be applied to any eukaryotic species. Through several major improvements over the previous version, RepeatModeler2 is able to produce libraries that recapitulate the known composition of three model species with some of the most complex TE landscapes. Thus RepeatModeler2 will greatly enhance the discovery and annotation of TEs in genome sequences.


2018 ◽  
Author(s):  
Alba Rey-Iglesia ◽  
Shyam Gopalakrishan ◽  
Christian Carøe ◽  
David E. Alquezar-Planas ◽  
Anne Ahlmann Nielsen ◽  
...  

Abstract: In recent years, the availability of reduced representation library (RRL) methods has catalysed an expansion of genome-scale studies to characterize both model and non-model organisms. Most of these methods rely on the use of restriction enzymes to obtain DNA sequences at a genome-wide level. These approaches have been widely used to sequence thousands of markers across individuals for many organisms at a reasonable cost, revolutionizing the field of population genomics. However, there are still some limitations associated with these methods, in particular, the high molecular weight DNA required as starting material, the reduced number of common loci among investigated samples, and the short length of the sequenced site-associated DNA. Here, we present MobiSeq, an RRL protocol exploiting simple laboratory techniques that generates genomic data based on PCR targeted-enrichment of transposable elements and the sequencing of the associated flanking region. We validate its performance across 103 DNA extracts derived from three mammalian species: grey wolf (Canis lupus), red deer complex (Cervus sp.), and brown rat (Rattus norvegicus). MobiSeq enables the sequencing of hundreds of thousands of loci across the genome, and performs SNP discovery with relatively low rates of clonality. Given the ease and flexibility of the MobiSeq protocol, the method has the potential to be implemented for marker discovery and population genomics across a wide range of organisms, enabling the exploration of diverse evolutionary and conservation questions.


2017 ◽  
Author(s):  
Ingo Bulla ◽  
Benoît Aliaga ◽  
Virginia Lacal ◽  
Jan Bulla ◽  
Christoph Grunau ◽  
...  

Abstract

Background: DNA methylation patterns store epigenetic information in the vast majority of eukaryotic species. However, the relatively high costs and technical challenges associated with the detection of DNA methylation have biased the number of methylation studies towards model organisms. Consequently, it remains challenging to infer kingdom-wide general rules about the functions and evolutionary conservation of DNA methylation. Methylated cytosine is often found in specific CpN dinucleotides, and the frequency distributions of, for instance, CpG observed/expected (CpG o/e) ratios have been used to infer DNA methylation types based on the higher mutability of methylated CpG.

Results: Predominantly model-based approaches, essentially founded on mixtures of Gaussian distributions, are currently used to investigate questions related to the number and position of modes of CpG o/e ratios. These approaches require the selection of an appropriate criterion for determining the best model and will fail if empirical distributions are complex or even merely moderately skewed. We use a kernel density estimation (KDE) based technique for robust and precise characterization of complex CpN o/e distributions without a priori assumptions about the underlying distributions.

Conclusions: We show that KDE delivers robust descriptions of CpN o/e distributions. For straightforward processing, we have developed a Galaxy tool, called Notos and available at the ToolShed, that calculates these ratios for input FASTA files and fits a density to their empirical distribution. Based on the estimated density, the number and shape of the modes of the distribution are determined, providing a rationale for the prediction of the number and the types of different methylation classes. Notos is written in R and Perl.
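The idea of computing CpG o/e ratios and then counting the modes of their estimated density can be illustrated with a small sketch. This is not the Notos implementation (which is written in R and Perl). It assumes one common definition of the ratio, CpG o/e = (#CpG × L) / (#C × #G) for a sequence of length L, uses a plain Gaussian kernel with a hand-picked bandwidth, and all function names and toy values are hypothetical.

```python
import math


def cpg_oe(seq):
    """CpG observed/expected ratio under the assumed definition
    (#CpG * L) / (#C * #G); returns None if C or G is absent."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return None
    return seq.count("CG") * len(seq) / (c * g)


def gaussian_kde(values, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `values` on `grid`."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        for x in grid
    ]


def count_modes(density):
    """Count strict local maxima of a density sampled on a regular grid."""
    return sum(
        1
        for i in range(1, len(density) - 1)
        if density[i - 1] < density[i] > density[i + 1]
    )


# Toy ratios (hypothetical): one CpG-depleted group and one group near 1.0.
ratios = [0.25, 0.30, 0.35, 0.95, 1.00, 1.05]
grid = [i * 0.01 for i in range(151)]  # 0.00 .. 1.50
density = gaussian_kde(ratios, grid, bandwidth=0.05)
print(count_modes(density))
```

The bandwidth strongly affects how many modes appear, so in practice it would be chosen by a data-driven rule rather than fixed by hand as it is here.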


2002 ◽  
Vol 69 ◽  
pp. 117-134 ◽  
Author(s):  
Stuart M. Haslam ◽  
David Gems ◽  
Howard R. Morris ◽  
Anne Dell

There is no doubt that the immense amount of information that is being generated by the initial sequencing and secondary interrogation of various genomes will change the face of glycobiological research. However, a major area of concern is that detailed structural knowledge of the ultimate products of genes that are identified as being involved in glycoconjugate biosynthesis is still limited. This is illustrated clearly by the nematode worm Caenorhabditis elegans, which was the first multicellular organism to have its entire genome sequenced. To date, only limited structural data on the glycosylated molecules of this organism have been reported. Our laboratory is addressing this problem by performing detailed MS structural characterization of the N-linked glycans of C. elegans; high-mannose structures dominate, with only minor amounts of complex-type structures. Novel, highly fucosylated truncated structures are also present which are difucosylated on the proximal N-acetylglucosamine of the chitobiose core as well as containing unusual Fucα1–2Gal1–2Man as peripheral structures. The implications of these results in terms of the identification of ligands for genomically predicted lectins and potential glycosyltransferases are discussed in this chapter. Current knowledge on the glycomes of other model organisms such as Dictyostelium discoideum, Saccharomyces cerevisiae and Drosophila melanogaster is also discussed briefly.


PLoS ONE ◽  
2012 ◽  
Vol 7 (9) ◽  
pp. e44911 ◽  
Author(s):  
Tingjuan Gao ◽  
Jitka Petrlova ◽  
Wei He ◽  
Thomas Huser ◽  
Wieslaw Kudlick ◽  
...  

Author(s):  
José Cerca ◽  
Marius F. Maurstad ◽  
Nicolas C. Rochette ◽  
Angel G. Rivera‐Colón ◽  
Niraj Rayamajhi ◽  
...  
Keyword(s):  
De Novo ◽  

Animals ◽  
2021 ◽  
Vol 11 (8) ◽  
pp. 2226
Author(s):  
Sazia Kunvar ◽  
Sylwia Czarnomska ◽  
Cino Pertoldi ◽  
Małgorzata Tokarska

The European bison is a non-model organism; thus, most of its genetic and genomic analyses have been performed using cattle-specific resources, such as BovineSNP50 BeadChip or Illumina Bovine 800 K HD Bead Chip. The problem with non-specific tools is the potential loss of evolutionarily diversified information (ascertainment bias) and species-specific markers. Here, we used a genotyping-by-sequencing (GBS) approach to genotype 256 samples from the European bison population in Bialowieza Forest (Poland) and analysed them with two integrated pipelines of the STACKS software: a de novo pipeline (without a reference genome) and a reference pipeline (with a reference genome). Moreover, we ran the reference pipeline with two different genomes, i.e., Bos taurus and European bison. GBS is a useful tool for SNP genotyping in non-model organisms due to its cost effectiveness. Our results support GBS with a reference pipeline without PCR duplicates as a powerful approach for studying the population structure and genotyping data of non-model organisms. We found more polymorphic markers with the reference pipeline than with the de novo pipeline. The lower number of SNPs from the de novo pipeline could be due to the extremely low level of heterozygosity in European bison. We confirmed that all SNPs obtained with the de novo/Bos taurus and Bos taurus reference pipelines were unique and not included in the 800 K BovineHD BeadChip.

