Transposable Elements Are a Significant Contributor to Tandem Repeats in the Human Genome

2012 ◽  
Vol 2012 ◽  
pp. 1-7 ◽  
Author(s):  
Musaddeque Ahmed ◽  
Ping Liang

Sequence repeats are an important phenomenon in the human genome, playing important roles in genomic alteration, often with phenotypic consequences. The two major types of repeat elements in the human genome are tandem repeats (TRs), which include microsatellites, minisatellites, and satellites, and transposable elements (TEs). So far, very little is known about the relationship between these two types of repeats. In this study, we identified TRs that are derived from TEs based on either sequence similarity or overlapping genomic positions. We then analyzed the distribution of these TRs among TE families/subfamilies. Our study shows that at least 7,276 TRs, or 23% of all minisatellites/satellites, are derived from TEs, contributing ∼0.32% of the human genome. TRs seem more likely to be generated from younger/more active TEs, and once initiated they expand over time via local duplication of the repeat units. The currently postulated mechanisms for the origin of TRs can explain only 6% of all TE-derived TRs, indicating the presence of one or more yet-to-be-identified mechanisms for the initiation of such repeats. Our results suggest that TEs contribute to genome expansion and alteration not only by transposition but also by generating tandem repeats.
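The position-based criterion mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Interval` record, the `min_fraction` cutoff of 0.5, and the toy coordinates are all assumptions made for illustration, and the abstract does not state what overlap threshold (if any) was used.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    """A genomic annotation; coordinates are 0-based, half-open (hypothetical layout)."""
    chrom: str
    start: int
    end: int
    name: str


def overlap_bp(a: Interval, b: Interval) -> int:
    """Number of bases shared by two intervals (0 if on different chromosomes)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def te_overlapping_trs(
    trs: List[Interval], tes: List[Interval], min_fraction: float = 0.5
) -> List[Tuple[str, str]]:
    """Return (TR name, TE name) pairs where the TE covers at least
    `min_fraction` of the TR length. The cutoff is an assumed parameter."""
    hits = []
    for tr in trs:
        tr_len = tr.end - tr.start
        if tr_len <= 0:
            continue  # skip malformed records
        for te in tes:
            if overlap_bp(tr, te) / tr_len >= min_fraction:
                hits.append((tr.name, te.name))
    return hits
```

The nested loop is quadratic and only suitable for small inputs; for genome-scale annotations an interval tree or a sorted sweep would be the usual choice.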

2016 ◽  
Author(s):  
Lihai Ye ◽  
Xiaojun Tang ◽  
Yiyi Chen ◽  
Li Ren ◽  
Fangzhou Hu ◽  
...  

Abstract The formation of the allotetraploid hybrid lineage (4nAT) encompasses both distant hybridization and polyploidization processes. The allotetraploid offspring have two sets of sub-genomes inherited from both parental species, and it is therefore important to explore their genetic structure. Herein, we construct a bacterial artificial chromosome library of allotetraploids, and then sequence and analyze the full-length sequences of 19 bacterial artificial chromosomes. Sixty-eight DNA chimeras are identified, which are divided into four models according to the distribution of the genomic DNA derived from the parents. Among the 68 genetic chimeras, 44 (64.71%) are linked to tandem repeats (TRs) and 23 (33.82%) are linked to transposable elements (TEs). The chimeras linked to TRs are related to slipped-strand mispairing and double-strand break repair, while the chimeras linked to TEs benefit from the intervention of recombinases. In addition, TRs and TEs are linked not only with recombination but also with insertions/deletions of DNA segments. We conclude that DNA chimeras accompanied by TRs and TEs coordinate a balance between the sub-genomes derived from the parents, which reduces genomic shock effects and favors the evolutionary and adaptive capacity of the allotetraploids. This is the first report on the relationship between the formation of DNA chimeras and TRs and TEs in polyploid animals.


2001 ◽  
Vol 11 (13) ◽  
pp. 1017-1027 ◽  
Author(s):  
Alexei A. Aravin ◽  
Natalia M. Naumova ◽  
Alexei V. Tulin ◽  
Vasilii V. Vagin ◽  
Yakov M. Rozovsky ◽  
...  

2021 ◽  
pp. gr.275658.121
Author(s):  
Yuyun Zhang ◽  
Zijuan Li ◽  
Yu'e Zhang ◽  
Kande Lin ◽  
Yuan Peng ◽  
...  

More than 80% of the wheat genome consists of transposable elements (TEs), which act as one major driver of wheat genome evolution. However, their contributions to the regulatory evolution of wheat adaptations remain largely unclear. Here, we created genome-binding maps for 53 transcription factors (TFs) underlying environmental responses by leveraging DAP-seq in Triticum urartu, together with epigenomic profiles. Most TF-binding sites (TFBS) located distally from genes are embedded in TEs, whose functional relevance is supported by purifying selection and active epigenomic features. About 24% of the non-TE TFBS share significantly high sequence similarity with TE-embedded TFBS. These non-TE TFBS have almost no homologous sequences in non-Triticeae species and are potentially derived from Triticeae-specific TEs. The expansion of TE-derived TFBS is linked to wheat-specific gene responses, suggesting that TEs are an important driving force for regulatory innovations. Altogether, TEs have been significantly and continuously shaping regulatory networks related to wheat genome evolution and adaptation.


1985 ◽  
Vol 5 (3) ◽  
pp. 457-465
Author(s):  
M Heller ◽  
E Flemington ◽  
E Kieff ◽  
P Deininger

We isolated clones and determined the sequence of portions of mouse and human cellular DNA which cross-hybridize strongly with the IR3 repetitive region of Epstein-Barr virus. The sequences were found to be tandem arrays of a simple sequence based on the triplet GGA, very similar to the IR3 repeat. The cellular repeats have distinct differences from the viral repeat region, however, and their sequences do not appear capable of being translated into a purely glycine-plus-alanine protein domain like the portion of the Epstein-Barr nuclear antigen coded by IR3. Although the relationship between IR3 and the cellular repeats is left unclear, the cellular repeats have many interesting features. The tandem arrays are about 1 to several kilobases long, much shorter than satellite tandem repeats and larger than other interspersed, tandem repeats. Each of the repeats is a distinct variation, perhaps diverged from a common sequence, (GGA)n. This family is present in the genomes of all species tested and appears to be a ubiquitous feature of all higher eucaryotic genomes.


2020 ◽  
Vol 21 (20) ◽  
pp. 7749
Author(s):  
Marcela Suárez-Esquivel ◽  
Esteban Chaves-Olarte ◽  
Edgardo Moreno ◽  
Caterina Guzmán-Verri

Brucella organisms are responsible for one of the most widespread bacterial zoonoses, named brucellosis. The disease affects several species of animals, including humans. One of the most intriguing aspects of the brucellae is that the various species show a ~97% similarity at the genome level. Still, the distinct Brucella species display different host preferences, zoonotic risk, and virulence. After 133 years of research, many aspects of Brucella biology remain poorly understood, such as host adaptation and virulence mechanisms. A strategy to understand these characteristics focuses on the relationship between the genomic diversity and host preference of the various Brucella species. Pseudogenization, genome reduction, single nucleotide polymorphism variation, number of tandem repeats, and mobile genetic elements have been revealed as markers for host adaptation and virulence. Understanding the mechanisms of genome variability in the Brucella genus is relevant to understanding the emergence of pathogens.


2012 ◽  
Vol 10 (4) ◽  
pp. 3-13
Author(s):  

The paper describes the early part of Barbara McClintock's work on DNA transposons in maize, in which she discovered the Ac-Ds family of mobile "controlling elements". An account is first given of the cytology of the system that was used to generate intact chromosomes having "sticky" (broken) ends. Cytogenetical aspects of the chromatid and chromosome breakage-fusion-bridge cycles, deriving from breakage, are then described, which leads on to the way in which variegation in phenotypes of the maize kernels could be "read" in terms of chromosome breakage. The "genetic earthquake" event of 1944, triggered by introducing broken chromosomes into a zygote from both parents, led to the discovery of Ds and Ac. Finding mobility of Ds from one chromosomal location to another was pure serendipity: the transposition showed itself while experiments were being undertaken to accurately map Ds. A similar chance observation revealed transposition of Ac as well, and then the relationship between the two elements was elucidated in terms of their autonomous and non-autonomous nature.


1999 ◽  
Vol 19 (1) ◽  
pp. 873-881 ◽  
Author(s):  
O. N. Danilevskaya ◽  
K. L. Traverse ◽  
N. C. Hogan ◽  
P. G. DeBaryshe ◽  
M. L. Pardue

ABSTRACT The transposable elements HeT-A and TART constitute the telomeres of Drosophila chromosomes. Both are non-long terminal repeat (LTR) retrotransposons, sharing the remarkable property of transposing only to chromosome ends. In addition, strong sequence similarity of their gag proteins indicates that these coding regions share a common ancestor. These findings led to the assumption that HeT-A and TART are closely related. However, we now find that these elements produce quite different sets of transcripts. HeT-A produces only sense-strand transcripts of the full-length element, whereas TART produces both sense and antisense full-length RNAs, with antisense transcripts in more than 10-fold excess over sense RNA. In addition, features of TART sequence organization resemble those of a subclass of non-LTR elements characterized by unequal terminal repeats. Thus, the ancestral gag sequence appears to have become incorporated in two different types of elements, possibly with different functions in the telomere. HeT-A transcripts are found in both nuclear and cytoplasmic cell fractions, consistent with roles as both mRNA and transposition template. In contrast, both sense and antisense TART transcripts are almost entirely concentrated in nuclear fractions. Also, TART open reading frame 2 probes detect a cytoplasmic mRNA for reverse transcriptase (RT), with no similarity to TART sequence 5′ or 3′ of the RT coding region. This RNA could be a processed TART transcript or the product of a “free-standing” RT gene. Either origin would be novel. The distinctive transcription patterns of both HeT-A and TART are conserved in Drosophila yakuba, despite significant sequence divergence. The conservation argues that these sets of transcripts are important to the function(s) of HeT-A and TART.


2020 ◽  
Vol 49 (D1) ◽  
pp. D452-D457
Author(s):  
Lisanna Paladin ◽  
Martina Bevilacqua ◽  
Sara Errigo ◽  
Damiano Piovesan ◽  
Ivan Mičetić ◽  
...  

Abstract The RepeatsDB database (URL: https://repeatsdb.org/) provides annotations and classification for protein tandem repeat structures from the Protein Data Bank (PDB). Protein tandem repeats are ubiquitous in all branches of the tree of life. The accumulation of solved repeat structures provides new possibilities for classification and detection, but also increases the need for annotation. Here we present RepeatsDB 3.0, which addresses these challenges and presents an extended classification scheme. The major conceptual change compared to the previous version is the hierarchical classification combining top levels based solely on structural similarity (Class > Topology > Fold) with two new levels (Clan > Family) requiring sequence similarity and describing repeat motifs in collaboration with Pfam. Data growth has been addressed with improved mechanisms for browsing the classification hierarchy. A new UniProt-centric view unifies the increasingly frequent annotation of structures from identical or similar sequences. This update of RepeatsDB aligns with our commitment to develop a resource that extracts, organizes and distributes specialized information on tandem repeat protein structures.


Genes ◽  
2019 ◽  
Vol 10 (5) ◽  
pp. 352 ◽  
Author(s):  
Karen H. Miga

The central goal of medical genomics is to understand the inherited basis of sequence variation that underlies human physiology, evolution, and disease. Functional association studies currently ignore millions of bases that span each centromeric region and acrocentric short arm. These regions are enriched in long arrays of tandem repeats, or satellite DNAs, that are known to vary extensively in copy number and repeat structure in the human population. Satellite sequence variation in the human genome is often so large that it is detected cytogenetically, yet due to the lack of a reference assembly and informatics tools to measure this variability, contemporary high-resolution disease association studies are unable to detect causal variants in these regions. Nevertheless, recently uncovered associations between satellite DNA variation and human disease support that these regions present a substantial and biologically important fraction of human sequence variation. Therefore, there is a pressing and unmet need to detect and incorporate this uncharacterized sequence variation into broad studies of human evolution and medical genomics. Here I discuss the current knowledge of satellite DNA variation in the human genome, focusing on centromeric satellites and their potential implications for disease.


BMC Genomics ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Ran Li ◽  
Xiaomeng Tian ◽  
Peng Yang ◽  
Yingzhi Fan ◽  
Ming Li ◽  
...  

Abstract Background: The non-reference sequences (NRS) represent structural variations in the human genome with potential functional significance. However, besides the known insertions, it is currently unknown whether other types of structural variations with NRS exist. Results: Here, we compared 31 human de novo assemblies with the current reference genome to identify the NRS and their location. We resolved the precise location of 6113 NRS adding up to 12.8 Mb. Besides 1571 insertions, we detected 3041 alternate alleles, which were defined as having less than 90% (or no) identity with the reference alleles. These alternate alleles overlapped with 1143 protein-coding genes, including a putative novel MHC haplotype. Further, we demonstrated that the alternate alleles and their flanking regions had a high content of tandem repeats, indicating that their origin was associated with tandem repeats. Conclusions: Our study detected a large number of NRS, including many alternate alleles that were previously uncharacterized. We suggest that the origin of alternate alleles was associated with tandem repeats. Our results enrich the spectrum of genetic variations in the human genome.
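The 90% identity cutoff quoted in this abstract can be made concrete with a small sketch. The abstract does not say how identity was computed, so the definition below (matching columns divided by total alignment columns, with gaps written as `-`) and both function names are assumptions for illustration only.

```python
def percent_identity(aligned_ref: str, aligned_alt: str) -> float:
    """Percent of alignment columns where both sequences carry the same base.

    Inputs are assumed to be already aligned (equal length, '-' for gaps).
    Gap columns count toward the denominator but never as matches.
    """
    if len(aligned_ref) != len(aligned_alt):
        raise ValueError("aligned sequences must have equal length")
    columns = len(aligned_ref)
    if columns == 0:
        return 0.0
    matches = sum(
        1 for r, a in zip(aligned_ref, aligned_alt) if r == a and r != "-"
    )
    return 100.0 * matches / columns


def is_alternate_allele(identity: float, threshold: float = 90.0) -> bool:
    """Flag a sequence as an alternate allele when identity is below the cutoff."""
    return identity < threshold
```

Other identity definitions (for example, excluding gap columns from the denominator) would give different values near the threshold, so the choice matters for borderline calls.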

