Simple Sequence Repeats in the Human Genome: Evolution

Abstract

Mobile elements (MEs) collectively constitute at least 51% of the human genome. Owing to their past incremental accumulation and the ongoing DNA transposition of members of certain subfamilies, MEs have served as a significant source of both inter- and intra-species genetic diversity during primate and human evolution. Because MEs can directly affect gene function via a plethora of mechanisms, the ME-derived genetic diversity is believed to have contributed to the phenotypic differences between humans and non-human primates, as well as among human populations and individuals. To define the specific contribution of MEs in making Homo sapiens a biologically unique species, we aimed to compile a complete list of MEs that are present only in the human genome, i.e., human-specific MEs (HS-MEs).

By making use of the most recent reference genome sequences for human and many other primates and an unbiased, more robust, and integrative multi-way comparative genomic approach, we identified a total of 15,463 HS-MEs. This list of HS-MEs represents a 120% increase from prior studies, with over 8,000 being newly identified as HS-MEs. Collectively, these ~15,000 HS-MEs have contributed a sequence increase of 15 million base pairs (Mbp) through insertion, generation of target site duplications, and transductions, as well as a sequence loss of 0.5 Mbp via insertion-mediated deletions, leading to a net genome size increase of 14.5 Mbp. Other new observations made with these HS-MEs include: 1) identification of several additional ME subfamilies with significant transposition activities not visible with prior smaller datasets (e.g., L1HS, L1PA2, and HERV-K); 2) a clear similarity of the retrotransposition mechanism among L1s, Alus, and SVAs that is distinct from that of HERVs, based on the pre-integration site sequence motifs; 3) the Y chromosome as a strikingly hot target for HS-MEs, particularly for LTRs, which showed an insertion rate 15 times higher than the genome average; 4) a very strong bias of SVAs, among the ME types, toward inserting into existing SVAs. Among the HS-MEs, more than 8,000 elements were integrated into the vicinity of ~4,900 unique genes, in regions including CDS, untranslated exon regions, promoters, and introns of protein-coding genes, as well as promoters and exons of non-coding RNAs. In seven cases, MEs participate in protein coding. Furthermore, 1,213 HS-MEs contributed to a total of 3,124 experimentally identified binding sites for 146 of the 161 transcription factors, in association with 622 genes. All these data suggest that these HS-MEs, despite being very young, already show sufficient signs of participation in gene function via regulation of transcription, splicing, and protein coding, with more potential for future participation.

In conclusion, our results demonstrate that the number of MEs that occur uniquely in the human genome is much higher than previously known, and we predict that the same is true of their impact on human genome evolution and function. The comprehensive list of HS-MEs provides an important reference resource for studying the impact of DNA transposition on human genome evolution and gene function.

The Impact of Gene Duplication on Human Genome Evolution

Encyclopedia of Life Sciences ◽

10.1002/9780470015902.a0020841 ◽

2008 ◽

Author(s):

James A Cotton

Keyword(s):

Gene Duplication ◽

Human Genome ◽

Genome Evolution ◽

Human Genome Evolution ◽

The Impact

SSRD: Simple Sequence Repeats Database of the Human Genome

Comparative and Functional Genomics ◽

10.1002/cfg.289 ◽

2003 ◽

Vol 4 (3) ◽

pp. 342-345 ◽

Cited By ~ 16

Author(s):

Subbaya Subramanian ◽

Vamsi M. Madgula ◽

Ranjan George ◽

Satish Kumar ◽

Madhusudhan W. Pandit ◽

...

Keyword(s):

Human Genome ◽

Simple Sequence Repeats ◽

Biological Significance ◽

Repeat Sequence ◽

Easy Access ◽

Evolutionary Significance ◽

Intergenic Regions ◽

Genomic Regions ◽

Abundance And Distribution ◽

Simple Sequence

Simple sequence repeats (SSRs) are found in most organisms. They play a major role in studies of genetic diversity and are useful as diagnostic markers for many diseases. The Simple Sequence Repeats Database (SSRD) for the human genome was created to give easy access to such repeats for analysis and to help in understanding their biological significance. The data include the abundance and distribution of SSRs in the coding and non-coding regions of the genome, as well as their association with the UTRs of genes. The exact locations of repeats with respect to genomic regions (such as UTRs, exons, introns, or intergenic regions) and their association with STS markers are also highlighted. The resource will facilitate repeat sequence analysis in the human genome and the understanding of the functional and evolutionary significance of simple sequence repeats. SSRD is available through two websites, http://www.ccmb.res.in/ssr and http://www.ingenovis.com/ssr.
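For readers unfamiliar with what such a database catalogues, the following is a minimal Python sketch of how tandem repeats of short motifs can be located in a DNA string. It is an illustration only, not the method used to build SSRD: the function name `find_ssrs`, the motif length range (1 to 6 bp), and the minimum of three repeat units are all assumptions chosen for the example.

```python
import re


def find_ssrs(seq, min_motif=1, max_motif=6, min_repeats=3):
    """Return sorted (start, motif, repeat_count) tuples for tandem repeats.

    Thresholds are illustrative assumptions, not SSRD's actual criteria.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # A k-base motif followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit
            # (e.g. "CACA" is already reported as the 2-mer "CA").
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    for start, motif, count in find_ssrs("TTGCACACACACAGGATGATGATGC"):
        print(start, motif, count)
```

On the toy sequence above, the sketch reports a (CA)5 repeat starting at 0-based position 3 and a (GAT)3 repeat starting at position 14. Relating such hits to exons, introns, UTRs, or intergenic regions, as the database does, would additionally require gene annotation coordinates, which are outside the scope of this sketch.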

Chromatin Structure and Human Genome Evolution

Encyclopedia of Life Sciences ◽

10.1002/9780470015902.a0020999.pub2 ◽

2013 ◽

Author(s):

Emily V Chambers ◽

Colin AM Semple

Keyword(s):

Human Genome ◽

Genome Evolution ◽

Chromatin Structure ◽

Human Genome Evolution
