PlantRep: a database of plant repetitive elements

Author(s):  
Xizhi Luo ◽  
Shiyu Chen ◽  
Yu Zhang

Abstract Key message: We re-annotated the repeats of 459 plant genomes and released a new database, PlantRep (http://www.plantrep.cn/). PlantRep sheds light on repeat evolution and provides fundamental data for in-depth exploration of genomes.

2019 ◽  
Author(s):  
Kokulapalan Wimalanathan ◽  
Carolyn J. Lawrence-Dill

Abstract Annotating gene structures and functions to genome assemblies is necessary to make assembly resources useful for biological inference. Gene Ontology (GO) term assignment is the most pervasively used functional annotation system, and new methods for GO assignment have improved the quality of GO-based function predictions. The Gene Ontology Meta Annotator for Plants (GOMAP) is an optimized, high-throughput, and reproducible pipeline for genome-scale GO annotation of plant genomes. GOMAP’s methods have been shown to increase the number of genes annotated and the number of annotations assigned per gene, as well as the quality (based on F-score) of GO assignments in maize. Here we report on the pipeline’s availability and performance for annotating large, repetitive plant genomes and describe how to deploy GOMAP to annotate additional plant genomes. We containerized GOMAP to increase portability and reproducibility, and optimized its performance for HPC environments. GOMAP has been used to annotate multiple maize lines, and is currently being deployed to annotate other species including wheat, rice, barley, cotton, and soy. Instructions along with access to the GOMAP Singularity container are freely available online at https://gomap-singularity.readthedocs.io/en/latest/. A list of annotated genomes and links to data is maintained at https://dill-picl.org/projects/gomap/gomap-datasets/.


2019 ◽  
Author(s):  
Benjamin Istace ◽  
Caroline Belser ◽  
Jean-Marc Aury

Abstract Motivation: Long-read sequencing and Bionano Genomics optical maps are two techniques that, when used together, make it possible to reconstruct the structure of entire chromosomes or chromosome arms. However, the existing tools are often too conservative, and the organization of contigs into scaffolds is not always optimal. Results: We developed BiSCoT (Bionano SCaffolding COrrection Tool), a tool that post-processes files generated during Bionano scaffolding in order to produce an assembly of greater contiguity and quality. BiSCoT was tested on a human genome and four publicly available plant genomes sequenced with Nanopore long reads, and it significantly improved the contiguity and quality of the assemblies. BiSCoT generates a FASTA file of the assembly as well as an AGP file that describes the new organization of the input assembly. Availability: BiSCoT and the improved assemblies are freely available on GitHub at http://www.genoscope.cns.fr/biscot and on PyPI at https://pypi.org/project/biscot/.


2019 ◽  
Author(s):  
Christopher Pockrandt ◽  
Mai Alzamel ◽  
Costas S. Iliopoulos ◽  
Knut Reinert

Abstract We present a fast and exact algorithm to compute the (k, e)-mappability. Its inverse, the (k, e)-frequency, counts the number of occurrences of each k-mer with up to e errors in a sequence. The algorithm we present is an order of magnitude faster than the algorithm in the widely used GEM suite while not relying on heuristics, and can even compute the mappability for short k-mers on highly repetitive plant genomes. We also show that mappability can be computed on multiple sequences to identify marker genes, illustrated by the example of E. coli strains. GenMap allows exporting the mappability information into different formats such as raw output, wig and bed files. The application and its C++ source code are available at https://github.com/cpockrandt/genmap.
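
To make the definition above concrete, here is a minimal, naive sketch of the (k, e)-frequency and its inverse, the mappability. It is not the GenMap algorithm: it compares every pair of k-mers directly and is only practical for toy inputs. It also assumes that "errors" means mismatches (Hamming distance) and that each k-mer counts itself as one occurrence; the function names `hamming_within`, `frequency` and `mappability` are illustrative, not part of any tool.

```python
def hamming_within(a: str, b: str, e: int) -> bool:
    """Return True if equal-length strings a and b differ in at most e positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > e:
                return False
    return True


def frequency(seq: str, k: int, e: int) -> list[int]:
    """Naive (k, e)-frequency (assumption: errors are mismatches only).

    For each k-mer start position, count the positions in seq whose k-mer
    matches it with at most e mismatches. The k-mer itself is included.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    return [sum(hamming_within(q, t, e) for t in kmers) for q in kmers]


def mappability(seq: str, k: int, e: int) -> list[float]:
    """(k, e)-mappability: the inverse of the (k, e)-frequency at each position."""
    return [1.0 / f for f in frequency(seq, k, e)]


if __name__ == "__main__":
    print(frequency("ACGTACGT", 4, 0))
    print(mappability("ACGTACGT", 4, 0))
```

For the sequence `ACGTACGT` with k = 4 and e = 0, the k-mer `ACGT` occurs twice, so its positions get frequency 2 and mappability 0.5, while the three other k-mers are unique and get mappability 1.0.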


2021 ◽  
Author(s):  
Xizhi Luo ◽  
Shiyu Chen ◽  
Yu Zhang

Abstract We re-annotated the repeats of 459 plant genomes and released a new database, PlantRep (http://www.plantrep.cn/). PlantRep sheds light on repeat evolution and provides fundamental data for in-depth exploration of genomes.


2001 ◽  
Vol 11 (4) ◽  
pp. 585-594
Author(s):  
Gernot Glöckner ◽  
Karol Szafranski ◽  
Thomas Winckler ◽  
Theodor Dingermann ◽  
Michael A. Quail ◽  
...  

In the course of determining the sequence of the Dictyostelium discoideum genome we have characterized in detail the quantity and nature of interspersed repetitive elements present in this species. Several of the most abundant small complex repeats and transposons (DIRS-1; TRE3-A,B; TRE5-A; skipper; Tdd-4; H3R) have been described previously. In our analysis we have identified additional elements. Thus, we can now present a complete list of complex repetitive elements in D. discoideum. All elements add up to 10% of the genome. Some of the newly described elements belong to established classes (TRE3-C,D; TRE5-B,C; DGLT-A,P; Tdd-5). However, we have also defined two new classes of DNA transposable elements (DDT and thug) that have not been described thus far. Based on the nucleotide amount, we calculated the minimum copy number in each family. These range from fewer than 10 to more than 200 copies. Unique sequences adjacent to the element ends and truncation points in elements gave a measure of the fragmentation of the elements. Furthermore, we describe the diversity of single elements with regard to polymorphisms and conserved structures. All elements show insertion preference for loci in which other elements of the same family reside. The analysis of the complex repeats is a valuable data resource for the ongoing assembly of whole D. discoideum chromosomes. [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. AF135841, AF298201, AF298202, AF298203, AF298204, AF298205, AF298206, AF298207, AF298208, AF298209, AF298210 and AF298624.]


F1000Research ◽  
2021 ◽  
Vol 10 ◽  
pp. 1194
Author(s):  
Daniel Longhi Fernandes Pedro ◽  
Tharcisio Soares Amorim ◽  
Alessandro Varani ◽  
Romain Guyot ◽  
Douglas Silva Domingues ◽  
...  

Advances in genomic sequencing have recently offered vast opportunities for biological exploration, unraveling evolution and improving our understanding of Earth's biodiversity. Due to distinct plant species characteristics in terms of genome size, ploidy and heterozygosity, transposable elements (TEs) are common characteristics of many genomes. TEs are ubiquitous and dispersed repetitive DNA sequences that frequently impact the evolution and composition of the genome, mainly due to their redundancy and rearrangements. For this study, we provide an atlas of TE data through an easy-to-use portal (the APTE website). To our knowledge, this is the most extensive and standardized analysis of TEs in plant genomes. We evaluated 67 plant genomes assembled at chromosome scale, recovering a total of 49,802,023 TE records, representing 47,992,091,043 base pairs (bp), or ~47.62% of the total genomic space. We observed that new types of TEs were identified and annotated compared to other data repositories. By establishing a standardized catalog of TE annotation for 67 genomes, new hypotheses and further exploration of TE data and their influence on genomes may allow a better understanding of their function and processes. All original code and an example of how we developed the TE annotation strategy are available on GitHub (Extended data).


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e10150
Author(s):  
Benjamin Istace ◽  
Caroline Belser ◽  
Jean-Marc Aury

Motivation: Long-read sequencing and Bionano Genomics optical maps are two techniques that, when used together, make it possible to reconstruct the structure of entire chromosomes or chromosome arms. However, the existing tools are often too conservative, and the organization of contigs into scaffolds is not always optimal. Results: We developed BiSCoT (Bionano SCaffolding COrrection Tool), a tool that post-processes files generated during Bionano scaffolding in order to produce an assembly of greater contiguity and quality. BiSCoT was tested on a human genome and four publicly available plant genomes sequenced with Nanopore long reads, and it significantly improved the contiguity and quality of the assemblies. BiSCoT generates a FASTA file of the assembly as well as an AGP file that describes the new organization of the input assembly. Availability: BiSCoT and the improved assemblies are freely available on GitHub at http://www.genoscope.cns.fr/biscot and on PyPI at https://pypi.org/project/biscot/.


2019 ◽  
Author(s):  
Ekaterina Osipova ◽  
Nikolai Hecker ◽  
Michael Hiller

Abstract Transposons and other repetitive sequences make up a large part of complex genomes. Repetitive sequences can be co-opted into a variety of functions and thus provide a source for evolutionary novelty. However, comprehensively detecting ancestral repeats that align between species is difficult, since considering all repeat-overlapping seeds in alignment methods that rely on the seed-and-extend heuristic results in prohibitively high runtimes. Here, we show that ignoring repeat-overlapping alignment seeds when aligning entire genomes misses numerous alignments between repetitive elements. We present a tool, RepeatFiller, that improves genome alignments by incorporating previously undetected local alignments between repetitive sequences. By applying RepeatFiller to genome alignments between human and 20 other representative mammals, we uncover between 22 and 84 megabases of previously undetected alignments that mostly overlap transposable elements. We further show that the increased alignment coverage improves the annotation of conserved non-exonic elements, both by discovering numerous novel transposon-derived elements that evolve under constraint and by removing thousands of elements that are not under constraint in placental mammals. In conclusion, RepeatFiller contributes to comprehensively aligning repetitive genomic regions, which facilitates studying transposon co-option and genome evolution. Source code: https://github.com/hillerlab/GenomeAlignmentTools

