Identification and characterization of retro-DNAs, a new type of retrotransposons originated from DNA transposons, in primate genomes

2020 ◽  
Author(s):  
Wanxiangfu Tang ◽  
Ping Liang

Abstract Mobile elements (MEs) can be divided into two major classes based on their transposition mechanisms: retrotransposons and DNA transposons. DNA transposons move in the genome directly in the form of DNA in a “cut-and-paste” style, while retrotransposons utilize an RNA intermediate to transpose in a “copy-and-paste” fashion. In addition to the target site duplications (TSDs), a hallmark of transposition shared by both classes, DNA transposons also carry terminal inverted repeats (TIRs). DNA transposons constitute ~3% of primate genomes and are thought to have been inactive in primate genomes since ~37 My ago, despite their success during early primate evolution. Retrotransposons can be further divided into Long Terminal Repeat retrotransposons (LTRs), which are characterized by the presence of LTRs at the two ends, and non-LTRs, which lack LTRs. In primate genomes, LTRs constitute ~9% of the genome and have a low level of ongoing activity, while non-LTR retrotransposons represent the major type of MEs, contributing ~37% of the genome, with some members being very young and currently active in retrotransposition. The four known types of non-LTR retrotransposons are LINEs, SINEs, SVAs, and processed pseudogenes, all characterized by the presence of a polyA tail and TSDs, which mostly range from 8 to 15 bp in length. All non-LTR retrotransposons are known to utilize the L1-based target-primed reverse transcription (TPRT) machinery for retrotransposition. In this study, we report a new type of non-LTR retrotransposon, which we named retro-DNAs, representing DNA transposons by sequence but non-LTR retrotransposons by transposition mechanism in recent primate genomes. Using a bioinformatic comparative genomics approach, we identified a total of 1,750 retro-DNAs, which represent 748 unique insertion events in the human genome and nine non-human primate genomes from the ape and monkey groups.
These retro-DNAs, mostly fragments of full-length DNA transposons, carry no TIRs but have longer TSDs, with ~23.5% also carrying a polyA tail, and their insertion site motifs and TSD length pattern are characteristic of non-LTR retrotransposons. These features suggest that these retro-DNAs are DNA transposon sequences likely mobilized by the TPRT mechanism. Further, at least 40% of these retro-DNAs are located in genic regions, presenting significant potential for impacting gene function. More interestingly, some retro-DNAs, as well as their parent sites, show certain levels of current transcriptional expression, suggesting that they have the potential to create more retro-DNAs in current primate genomes. The identification of retro-DNAs, despite their small number, reveals a new mechanism for propagating DNA transposon sequences in primate genomes in the absence of canonical DNA transposon activity. It also suggests that the L1 TPRT machinery may be able to retrotranspose a wider variety of DNA sequences than currently known.
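
The two sequence hallmarks named in this abstract, flanking TSDs and a polyA tail, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the exact-match requirement, and the thresholds (`max_len=30`, `min_run=10`) are assumptions chosen for illustration. Only the lower bound of 8 bp is taken from the TSD range quoted above.

```python
def find_tsd(left_flank: str, right_flank: str,
             min_len: int = 8, max_len: int = 30) -> str:
    """Return the longest exact duplication flanking an insertion.

    left_flank  : genomic sequence ending immediately before the insert
    right_flank : genomic sequence starting immediately after the insert
    A TSD appears as the same k-mer at the end of left_flank and at the
    start of right_flank. Thresholds are hypothetical, and exact matching
    is a simplification (real TSDs may carry mismatches).
    """
    left = left_flank.upper()
    right = right_flank.upper()
    longest = min(max_len, len(left), len(right))
    # Try the longest candidate first so the first hit is the longest TSD.
    for k in range(longest, min_len - 1, -1):
        if left[-k:] == right[:k]:
            return left[-k:]
    return ""


def has_polya_tail(insert_seq: str, min_run: int = 10) -> bool:
    """True if the insert ends in at least min_run consecutive A's.

    min_run is an illustrative cutoff, not a value from the abstract.
    """
    seq = insert_seq.upper()
    trailing_as = len(seq) - len(seq.rstrip("A"))
    return trailing_as >= min_run
```

For example, `find_tsd("GGCCTTAGACCTGATCA", "AGACCTGATCATTGCA")` reports the 11-bp duplication `AGACCTGATCA`, and an insert ending in twelve A's passes `has_polya_tail`.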

2007 ◽  
Vol 35 (3) ◽  
pp. 637-642 ◽  
Author(s):  
G.G. Schumann

Mammalian genomes are littered with enormous numbers of transposable elements interspersed within and between single-copy endogenous genes. The only presently spreading class of human transposable elements comprises non-LTR (long terminal repeat) retrotransposons, which cover approx. 34% of the human genome. Non-LTR retrotransposons include the widespread autonomous LINEs (long interspersed nuclear elements) and non-autonomous elements such as processed pseudogenes, SVAs [named after SINE (short interspersed nuclear element), VNTR (variable number of tandem repeats) and Alu] and SINEs. Mobilization of these elements affects the host genome, can be deleterious to the host cell, and cause genetic disorders and cancer. In order to limit negative effects of retrotransposition, host genomes have adopted several strategies to curb the proliferation of transposable elements. Recent studies have demonstrated that members of the human APOBEC3 (apolipoprotein B mRNA editing enzyme catalytic polypeptide 3) protein family inhibit the mobilization of the non-LTR retrotransposons LINE-1 and Alu significantly and participate in the intracellular defence against retrotransposition by mechanisms unknown to date. The striking coincidence between the expansion of the APOBEC3 gene cluster and the abrupt decline in retrotransposon activity in primates raises the possibility that these genes may have been expanded to prevent genomic instability caused by endogenous retroelements.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Oluchi Aroh ◽  
Kenneth M. Halanych

Abstract Background: Long Terminal Repeat retrotransposons (LTR retrotransposons) are mobile genetic elements composed of a few genes between terminal repeats and, in some cases, can comprise over half of a genome’s content. Available data on LTR retrotransposons have facilitated comparative studies and provided insight into genome evolution. However, data are biased toward model systems, and marine organisms, including annelids, have been underrepresented in transposable element studies. Here, we focus on the genome of Lamellibrachia luymesi, a vestimentiferan tubeworm from deep-sea hydrocarbon seeps, to gain knowledge of LTR retrotransposons in a deep-sea annelid. Results: We characterized LTR retrotransposons present in the genome of L. luymesi using bioinformatic approaches and found that intact LTR retrotransposons make up about 0.1% of the L. luymesi genome. Previous characterization of the genome has shown that this tubeworm hosts several known LTR retrotransposons. Here we describe and classify LTR retrotransposons in L. luymesi as belonging to the Gypsy, Copia and Bel-pao superfamilies. Although many elements fell within already recognized families (e.g., Mag, CSRN1), others formed clades distinct from previously recognized families within these superfamilies. However, approximately 19% (41) of recovered elements could not be classified. Gypsy elements were the most abundant, while only 2 Copia and 2 Bel-pao elements were present. In addition, analysis of insertion times indicated that several LTR retrotransposons were recently transposed into the genome of L. luymesi; these elements had identical LTRs, raising the possibility of recent or ongoing retrotransposon activity. Conclusions: Our analysis contributes to knowledge of the diversity of LTR retrotransposons in marine settings and also serves as an important step toward understanding the potential role of retroelements in marine organisms.
We find that many LTR retrotransposons, which have been inserted in the last few million years, are similar to those found in terrestrial model species. However, several new groups of LTR retrotransposons were discovered, suggesting that the representation of LTR retrotransposons may be different in marine settings. Further study would improve understanding of the diversity of retrotransposons across animal groups and environments.
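
The link between identical LTRs and recent insertion can be made concrete with a short Python sketch. The abstract does not say how insertion times were computed; the approach below, a commonly used estimate T = K / (2r) with a Jukes-Cantor corrected divergence K between the two LTRs of one element, is an assumption, and the substitution rate passed in is a placeholder that the caller must supply.

```python
import math


def ltr_divergence(ltr5: str, ltr3: str) -> float:
    """Jukes-Cantor corrected divergence between two aligned LTRs.

    Assumes the 5' and 3' LTRs are already aligned to equal length;
    any differing column (including a gap character) counts as a mismatch.
    """
    if not ltr5 or len(ltr5) != len(ltr3):
        raise ValueError("LTRs must be aligned, non-empty, and of equal length")
    mismatches = sum(a != b for a, b in zip(ltr5.upper(), ltr3.upper()))
    if mismatches == 0:
        return 0.0  # identical LTRs: no detectable divergence
    p = mismatches / len(ltr5)
    if p >= 0.75:
        raise ValueError("divergence too high for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def insertion_age_years(divergence: float, rate_per_site_per_year: float) -> float:
    """Estimate insertion age as T = K / (2r).

    The factor 2 reflects that both LTRs accumulate substitutions
    independently after insertion. The rate is a hypothetical input.
    """
    return divergence / (2.0 * rate_per_site_per_year)
```

Under this model, two identical LTRs give a divergence of zero and therefore an estimated age of zero, which is why identical LTRs are read as a sign of recent or ongoing activity.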


2001 ◽  
Vol 114 (14) ◽  
pp. 2569-2575 ◽  
Author(s):  
Michael Hesse ◽  
Thomas M. Magin ◽  
Klaus Weber

We screened the draft sequence of the human genome for genes that encode intermediate filament (IF) proteins in general, and keratins in particular. The draft covers nearly all previously established IF genes including the recent cDNA and gene additions, such as pancreatic keratin 23, synemin and the novel muscle protein syncoilin. In the draft, seven novel type II keratins were identified, presumably expressed in the hair follicle/epidermal appendages. In summary, 65 IF genes were detected, placing IF among the 100 largest gene families in humans. All functional keratin genes map to the two known keratin clusters on chromosomes 12 (type II plus keratin 18) and 17 (type I), whereas other IF genes are not clustered. Of the 208 keratin-related DNA sequences, only 49 reflect true keratin genes, whereas the majority describe inactive gene fragments and processed pseudogenes. Surprisingly, nearly 90% of these inactive genes relate specifically to the genes of keratins 8 and 18. Other keratin genes, as well as those that encode non-keratin IF proteins, lack either gene fragments/pseudogenes or have only a few derivatives. As parasitic derivatives of mature mRNAs, the processed pseudogenes of keratins 8 and 18 have invaded most chromosomes, often at several positions. We describe the limits of our analysis and discuss the striking unevenness of pseudogene derivation in the IF multigene family. Finally, we propose to extend the nomenclature of Moll and colleagues to any novel keratin.


Mobile DNA ◽  
2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Shujun Ou ◽  
Ning Jiang

Abstract Annotation of plant genomes is still a challenging task due to the abundance of repetitive sequences, especially long terminal repeat (LTR) retrotransposons. LTR_FINDER is a widely used program for the identification of LTR retrotransposons, but its application to large genomes is hindered by its single-threaded operation. Here we report an accessory program that allows parallel operation of LTR_FINDER, resulting in up to 8500X faster identification of LTR elements. It takes only 72 min to process the 14.5 Gb bread wheat (Triticum aestivum) genome, compared with the 1.16 years required by the original sequential version. LTR_FINDER_parallel is freely available at https://github.com/oushujun/LTR_FINDER_parallel.
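
The reported speedup can be checked against the two run times quoted in the abstract. The Python sketch below is only an arithmetic check; the 365-day year is an assumption, and the helper name `speedup` is hypothetical.

```python
# Minutes in a 365-day year (assumed calendar convention).
MINUTES_PER_YEAR = 365 * 24 * 60


def speedup(sequential_years: float, parallel_minutes: float) -> float:
    """Ratio of sequential run time to parallel run time."""
    return sequential_years * MINUTES_PER_YEAR / parallel_minutes


# Figures from the abstract: 1.16 years sequential, 72 min in parallel.
wheat_speedup = speedup(1.16, 72)
```

With these inputs the ratio is about 8,468, consistent with the "up to 8500X" figure after rounding.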


1986 ◽  
Vol 6 (5) ◽  
pp. 1520-1528 ◽  
Author(s):  
D Y Chang ◽  
B Wisely ◽  
S M Huang ◽  
R A Voelker

A hybrid dysgenesis-induced allele [su(s)w20] associated with a P-element insertion was used to clone sequences from the su(s) region of Drosophila melanogaster by means of the transposon-tagging technique. Cloned sequences were used to probe restriction enzyme-digested DNAs from 22 other su(s) mutations. None of three X-ray-induced or six ethyl methanesulfonate-induced su(s) mutations possessed detectable variation. Seven spontaneous, four hybrid dysgenesis-induced, and two DNA transformation-induced mutations were associated with insertions within 2.0 kilobases (kb) of the su(s)w20 P-element insertion site. When the region of DNA that included the mutational insertions was used to probe poly(A)+ RNAs, a 5-kb message was detected in wild-type RNA that was present in greatly reduced amounts in two su(s) mutations. By using strand-specific probes, the direction of transcription of the 5-kb message was determined. The mutational insertions lie in DNA sequences near the 5' end of the 5-kb message. Three of the seven spontaneous su(s) mutations are associated with gypsy insertions, but they are not suppressible by su(Hw).


1986 ◽  
Vol 6 (12) ◽  
pp. 4161-4167 ◽  
Author(s):  
M K Dush ◽  
J A Tischfield ◽  
S A Khan ◽  
E Feliciano ◽  
J M Sikela ◽  
...  

A mouse adenine phosphoribosyltransferase (aprt) pseudogene that had previously been recovered from a BALB/c sperm DNA library possessed several unusual features. Its nucleotide sequence, like that of other processed pseudogenes, was colinear with its corresponding mRNA, but it was truncated at its 3' end and lacked a poly(A) tail. The pseudogene was 82% homologous with corresponding regions of the functional gene and had incurred mutations that included transitions, transversions, deletions, and a point insertion. Even though the pseudogene was truncated within the protein-coding region of the corresponding functional gene, it was flanked at both ends by 13-base-pair direct repeats. Curiously, the direct repeats exhibited homology to APRT mRNA at the site of pseudogene divergence. The pseudogene appeared to be common to BALB/c and A/J mice, but it was contained on a 3-kilobase EcoRI fragment in the former strain and a 4.5-kilobase EcoRI fragment in the latter. The BALB/c and apparently the A/J pseudogene both mapped to chromosome 8, which also contains the functional aprt gene. The DNA sequences immediately surrounding the pseudogene in the two strains appeared to be similar, suggesting that the BALB/c and A/J pseudogenes are allelic. However, DNA sequences more distal to the pseudogene in the two strains appeared to vary. Thus, the EcoRI polymorphism was not due to simple loss of an EcoRI site, but was more complex. The pattern of flanking restriction sites was different for each of several enzymes, consistent with extensive DNA rearrangement. Double digests of BALB/c and A/J genomic DNAs revealed complex polymorphisms on both sides of the pseudogene. The results were consistent with insertion, deletion, or other rearrangement of DNA sequences that flank the pseudogene and suggest that this region of mouse chromosome 8 may be a region active for mutation or recombination.


2016 ◽  
Vol 7 (1) ◽  
Author(s):  
Thomas Wicker ◽  
Yeisoo Yu ◽  
Georg Haberer ◽  
Klaus F. X. Mayer ◽  
Pradeep Reddy Marri ◽  
...  

2012 ◽  
Vol 29 (12) ◽  
pp. 3685-3702 ◽  
Author(s):  
Irina Sormacheva ◽  
Georgiy Smyshlyaev ◽  
Vladimir Mayorov ◽  
Alexander Blinov ◽  
Anton Novikov ◽  
...  

2017 ◽  
Author(s):  
Michael P. McGurk ◽  
Daniel A. Barbash

Abstract Eukaryotic genomes are replete with repeated sequences, in the form of transposable elements (TEs) dispersed across the genome or as satellite arrays, large stretches of tandemly repeated sequence. Many satellites clearly originated as TEs, but it is unclear how mobile genetic parasites can transform into megabase-sized tandem arrays. Comprehensive population genomic sampling is needed to determine the frequency and generative mechanisms of tandem TEs, at all stages from their initial formation to their subsequent expansion and maintenance as satellites. The best available population resources, short-read DNA sequences, are often considered to be of limited utility for analyzing repetitive DNA due to the challenge of mapping individual repeats to unique genomic locations. Here we develop a new pipeline called ConTExt, which demonstrates that paired-end Illumina data can be successfully leveraged to identify a wide range of structural variation within repetitive sequence, including tandem elements. Analyzing 85 genomes from five populations of Drosophila melanogaster, we discover that TEs commonly form tandem dimers. Our results further suggest that insertion site preference is the major mechanism by which dimers arise and that, consequently, dimers form rapidly during periods of active transposition. This abundance of TE dimers has the potential to provide source material for future expansion into satellite arrays, and we discover one such copy number expansion of the DNA transposon Hobo to ~16 tandem copies in a single line. The very process that defines TEs, transposition, thus regularly generates sequences from which new satellites can arise.

