RepeatFiller newly identifies megabases of aligning repetitive sequences and improves annotations of conserved non-exonic elements

GigaScience ◽  
2019 ◽  
Vol 8 (11) ◽  
Author(s):  
Ekaterina Osipova ◽  
Nikolai Hecker ◽  
Michael Hiller

Abstract. Background: Transposons and other repetitive sequences make up a large part of complex genomes. Repetitive sequences can be co-opted into a variety of functions and thus provide a source for evolutionary novelty. However, comprehensively detecting ancestral repeats that align between species is difficult because considering all repeat-overlapping seeds in alignment methods that rely on the seed-and-extend heuristic results in prohibitively high runtimes. Results: Here, we show that ignoring repeat-overlapping alignment seeds when aligning entire genomes misses numerous alignments between repetitive elements. We present a tool, RepeatFiller, that improves genome alignments by incorporating previously undetected local alignments between repetitive sequences. By applying RepeatFiller to genome alignments between human and 20 other representative mammals, we uncover between 22 and 84 Mb of previously undetected alignments that mostly overlap transposable elements. We further show that the increased alignment coverage improves the annotation of conserved non-exonic elements, both by discovering numerous novel transposon-derived elements that evolve under constraint and by removing thousands of elements that are not under constraint in placental mammals. Conclusions: RepeatFiller contributes to comprehensively aligning repetitive genomic regions, which facilitates studying transposon co-option and genome evolution. Source code: https://github.com/hillerlab/GenomeAlignmentTools

2019 ◽  
Author(s):  
Ekaterina Osipova ◽  
Nikolai Hecker ◽  
Michael Hiller

Abstract. Transposons and other repetitive sequences make up a large part of complex genomes. Repetitive sequences can be co-opted into a variety of functions and thus provide a source for evolutionary novelty. However, comprehensively detecting ancestral repeats that align between species is difficult since considering all repeat-overlapping seeds in alignment methods that rely on the seed-and-extend heuristic results in prohibitively high runtimes. Here, we show that ignoring repeat-overlapping alignment seeds when aligning entire genomes misses numerous alignments between repetitive elements. We present a tool – RepeatFiller – that improves genome alignments by incorporating previously undetected local alignments between repetitive sequences. By applying RepeatFiller to genome alignments between human and 20 other representative mammals, we uncover between 22 and 84 megabases of previously undetected alignments that mostly overlap transposable elements. We further show that the increased alignment coverage improves the annotation of conserved non-exonic elements, both by discovering numerous novel transposon-derived elements that evolve under constraint and by removing thousands of elements that are not under constraint in placental mammals. In conclusion, RepeatFiller contributes to comprehensively aligning repetitive genomic regions, which facilitates studying transposon co-option and genome evolution. Source code: https://github.com/hillerlab/GenomeAlignmentTools


2021 ◽  
Vol 1 (2) ◽  
pp. 1-9
Author(s):  
Ayan Mukherjee

Evolution of vertebrate species took shape over millions of years, during which sex played an important role in the maintenance of lineages, genetic diversification and reproductive isolation. In the course of sexual evolution, sex determination strategies have been proposed to shift from temperature-dependent sex determination to genetic sex determination, exemplified by the XY system in mammals and the ZW system in birds. Contrary to this established conception, different lineages have been shown to have overlapping sex-determining strategies. While searching for possible reasons for these phenomena, researchers observed that the gene content of sex chromosomes is highly variable in location and prevalence, which suggests an autosomal origin of sex chromosomes. Although the exact mechanisms of gene transfer, and thereby the origin of sex chromosomes, are yet to be unveiled, chromosomal rearrangement and introgression have been hypothesized to be the possible effectors. Transposable elements (TEs) have long been considered ‘selfish’ or ‘junk’ DNA, because most non-coding genomic regions are composed of TEs, whose presence in a species' genome seemed to make no sense. Recently, however, TEs have come to be regarded as nature’s tool for biological innovation: they create new regulatory elements and new coding sequences, and they cause genetic disruption and chromosomal remodelling. It has therefore been postulated that TEs could facilitate rearrangement and introgression, which ultimately lead to the evolution of sex chromosomes and sex-determining genes through positive selection. The prevalence of highly repetitive sequences in sex chromosomes, particularly in the Y chromosome, makes them a hotbed for TE-mediated rearrangement and introgression. In this review, I discuss whether it makes sense to focus on the role of TEs in the sexual evolution of animals.


Genetics ◽  
2002 ◽  
Vol 161 (4) ◽  
pp. 1661-1672 ◽  
Author(s):  
Andrea Pedrosa ◽  
Niels Sandal ◽  
Jens Stougaard ◽  
Dieter Schweizer ◽  
Andreas Bachmair

Abstract. Lotus japonicus is a model plant for the legume family. To facilitate map-based cloning approaches and genome analysis, we performed an extensive characterization of the chromosome complement of the species. A detailed karyotype of L. japonicus Gifu was built and plasmid and BAC clones, corresponding to genetically mapped markers (see the accompanying article by Sandal et al. 2002, this issue), were used for FISH to correlate genetic and chromosomal maps. Hybridization of DNA clones from 32 different genomic regions enabled the assignment of linkage groups to chromosomes, the comparison between genetic and physical distances throughout the genome, and the partial characterization of different repetitive sequences, including telomeric and centromeric repeats. Additional analysis of L. filicaulis and its F1 hybrid with L. japonicus demonstrated the occurrence of inversions between these closely related species, suggesting that these chromosome rearrangements are early events in speciation of this group.


2021 ◽  
Vol 22 (1) ◽  
pp. 468
Author(s):  
Klára Konečná ◽  
Pavla Polanská Sováková ◽  
Karin Anteková ◽  
Jiří Fajkus ◽  
Miloslava Fojtová

Involvement of epigenetic mechanisms in the regulation of telomeres and transposable elements (TEs), genomic regions with the protective and potentially detrimental function, respectively, has been frequently studied. Here, we analyzed telomere lengths in Arabidopsis thaliana plants of Columbia, Landsberg erecta and Wassilevskija ecotypes exposed repeatedly to the hypomethylation drug zebularine during germination. Shorter telomeres were detected in plants growing from seedlings germinated in the presence of zebularine with a progression in telomeric phenotype across generations, relatively high inter-individual variability, and diverse responses among ecotypes. Interestingly, the extent of telomere shortening in zebularine Columbia and Wassilevskija plants corresponded to the transcriptional activation of TEs, suggesting a correlated response of these genomic elements to the zebularine treatment. Changes in lengths of telomeres and levels of TE transcripts in leaves were not always correlated with a hypomethylation of cytosines located in these regions, indicating a cytosine methylation-independent level of their regulation. These observations, including differences among ecotypes together with distinct dynamics of the reversal of the disruption of telomere homeostasis and TEs transcriptional activation, reflect a complex involvement of epigenetic processes in the regulation of crucial genomic regions. Our results further demonstrate the ability of plant cells to cope with these changes without a critical loss of the genome stability.


2021 ◽  
pp. gr.275658.121
Author(s):  
Yuyun Zhang ◽  
Zijuan Li ◽  
Yu'e Zhang ◽  
Kande Lin ◽  
Yuan Peng ◽  
...  

More than 80% of the wheat genome consists of transposable elements (TEs), which act as one major driver of wheat genome evolution. However, their contributions to the regulatory evolution of wheat adaptations remain largely unclear. Here, we created genome-binding maps for 53 transcription factors (TFs) underlying environmental responses by leveraging DAP-seq in Triticum urartu, together with epigenomic profiles. Most TF-binding sites (TFBS) located distally from genes are embedded in TEs, whose functional relevance is supported by purifying selection and active epigenomic features. About 24% of the non-TE TFBS share significantly high sequence similarity with TE-embedded TFBS. These non-TE TFBS have almost no homologous sequences in non-Triticeae species and are potentially derived from Triticeae-specific TEs. The expansion of TE-derived TFBS is linked to wheat-specific gene responses, suggesting that TEs are an important driving force for regulatory innovations. Altogether, TEs have been significantly and continuously shaping regulatory networks related to wheat genome evolution and adaptation.


2021 ◽  
Author(s):  
Matias Rodriguez ◽  
Wojciech Makałowski

Abstract. Transposable elements (TEs) are major genomic components in most eukaryotic genomes and play an important role in genome evolution. However, despite their relevance, identifying TEs is not an easy task, and a number of tools have been developed to tackle this problem. To better understand how they perform, we tested several widely used tools for de novo TE detection and compared their performance on both simulated data and well-curated genomic sequences. The results will be helpful for identifying common issues associated with TE annotation and for evaluating how comparable the results obtained with different tools are.


2019 ◽  
Vol 35 (19) ◽  
pp. 3839-3841 ◽  
Author(s):  
Artem Babaian ◽  
I Richard Thompson ◽  
Jake Lever ◽  
Liane Gagnier ◽  
Mohammad M Karimi ◽  
...  

Abstract. Summary: Transposable elements (TEs) influence the evolution of novel transcriptional networks, yet the specific and meaningful interpretation of how TE-derived transcriptional initiation contributes to the transcriptome has been marred by computational and methodological deficiencies. We developed LIONS for the analysis of RNA-seq data to specifically detect and quantify TE-initiated transcripts. Availability and implementation: Source code, container, test data and instruction manual are freely available at www.github.com/ababaian/LIONS. Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Shujun Ou ◽  
Weija Su ◽  
Yi Liao ◽  
Kapeel Chougule ◽  
Jireh R. A. Agda ◽  
...  

Abstract. Background: Sequencing technology and assembly algorithms have matured to the point that high-quality de novo assembly is possible for large, repetitive genomes. Current assemblies traverse transposable elements (TEs) and provide an opportunity for comprehensive annotation of TEs. Numerous methods exist for annotation of each class of TEs, but their relative performances have not been systematically compared. Moreover, a comprehensive pipeline is needed to produce a non-redundant library of TEs for species lacking this resource to generate whole-genome TE annotations. Results: We benchmark existing programs based on a carefully curated library of rice TEs. We evaluate the performance of methods annotating long terminal repeat (LTR) retrotransposons, terminal inverted repeat (TIR) transposons, short TIR transposons known as miniature inverted transposable elements (MITEs), and Helitrons. Performance metrics include sensitivity, specificity, accuracy, precision, FDR, and F1. Using the most robust programs, we create a comprehensive pipeline called Extensive de-novo TE Annotator (EDTA) that produces a filtered non-redundant TE library for annotation of structurally intact and fragmented elements. EDTA also deconvolutes nested TE insertions frequently found in highly repetitive genomic regions. Using other model species with curated TE libraries (maize and Drosophila), EDTA is shown to be robust across both plant and animal species. Conclusions: The benchmarking results and pipeline developed here will greatly facilitate TE annotation in eukaryotic genomes. These annotations will promote a much more in-depth understanding of the diversity and evolution of TEs at both intra- and inter-species levels. EDTA is open-source and freely available: https://github.com/oushujun/EDTA.
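
The six performance metrics named in the abstract above have standard definitions in terms of true/false positives and negatives. The following is a minimal sketch of those textbook formulas, not code from EDTA: the `Confusion` class and `benchmark_metrics` function are hypothetical names, the unit of counting (for example base pairs versus whole elements) is left to the caller, and all denominators are assumed to be non-zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts comparing a predicted annotation with a curated reference (hypothetical container)."""
    tp: int  # annotated as TE and TE in the reference
    fp: int  # annotated as TE but not TE in the reference
    tn: int  # not annotated and not TE in the reference
    fn: int  # not annotated but TE in the reference


def benchmark_metrics(c: Confusion) -> dict[str, float]:
    """Return the standard metrics; assumes every denominator is non-zero."""
    sensitivity = c.tp / (c.tp + c.fn)   # also called recall
    specificity = c.tn / (c.tn + c.fp)
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    precision = c.tp / (c.tp + c.fp)
    fdr = c.fp / (c.tp + c.fp)           # false discovery rate, equal to 1 - precision
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "FDR": fdr,
        "F1": f1,
    }


if __name__ == "__main__":
    # Illustrative counts only; these are not results from any benchmark.
    for name, value in benchmark_metrics(Confusion(tp=80, fp=20, tn=880, fn=20)).items():
        print(f"{name}: {value:.3f}")
```

Reporting precision and FDR side by side is redundant in the strict sense, since one is the complement of the other, but both are listed here because the abstract names both.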

