Contribution of mobile elements to the uniqueness of human genome with more than 15,000 human-specific insertions

2016
Author(s):
Wanxiangfu Tang
Seyoung Mun
Adiya Joshi
Kyundong Han
Ping Liang

Abstract

Mobile elements (MEs) collectively constitute at least 51% of the human genome. Due to their past incremental accumulation and the ongoing DNA transposition of members of certain subfamilies, MEs serve as a significant source of both inter- and intra-species genetic diversity during primate and human evolution. Since MEs can directly affect gene function via a plethora of mechanisms, it is believed that ME-derived genetic diversity has contributed to the phenotypic differences between humans and non-human primates, as well as among human populations and individuals. To define the specific contribution of MEs in making Homo sapiens a biologically unique species, we aimed to compile a complete list of MEs that are present only in the human genome, i.e., human-specific MEs (HS-MEs).

By making use of the most recent reference genome sequences for human and many other primates and an unbiased, more robust, and integrative multi-way comparative genomic approach, we identified a total of 15,463 HS-MEs. This list represents a 120% increase over prior studies, with over 8,000 elements newly identified as HS-MEs. Collectively, these ~15,000 HS-MEs have contributed a total sequence increase of 15 million base pairs (Mbp) through insertion, generation of target site duplications, and transductions, as well as a 0.5 Mbp sequence loss via insertion-mediated deletions, leading to a net genome size increase of 14.5 Mbp. Other new observations made with these HS-MEs include: 1) the identification of several additional ME subfamilies with significant transposition activity not visible in prior, smaller datasets (e.g., L1HS, L1PA2, and HERV-K); 2) a clear similarity in retrotransposition mechanism among L1s, Alus, and SVAs that is distinct from that of HERVs, based on the pre-integration site sequence motifs; 3) the Y chromosome as a strikingly hot target for HS-MEs, particularly LTRs, which showed an insertion rate 15 times higher than the genome average; and 4) among the ME types, SVAs appear to show a very strong bias toward inserting into existing SVAs. Among the HS-MEs, more than 8,000 elements were integrated into the vicinity of ~4,900 unique genes, in regions including CDS, untranslated exon regions, promoters, and introns of protein-coding genes, as well as promoters and exons of non-coding RNAs. In seven cases, MEs participate in protein coding. Furthermore, 1,213 HS-MEs contributed a total of 3,124 experimentally identified binding sites for 146 of the 161 transcription factors, in association with 622 genes. All these data suggest that these HS-MEs, despite being very young, already show sufficient signs of participation in gene function via regulation of transcription, splicing, and protein coding, with more potential for future participation.

In conclusion, our results demonstrate that the number of MEs unique to the human genome is much higher than previously known, and we predict that the same is true of their impact on human genome evolution and function. The comprehensive list of HS-MEs provides an important reference resource for studying the impact of DNA transposition on human genome evolution and gene function.
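The headline figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a reader's sanity check, not part of the authors' analysis. It uses the numbers quoted above and rests on two labeled assumptions: that "120% increase" means the new total is 2.2 times the prior total, and that every previously reported HS-ME is retained in the new list. All variable names are illustrative.

```python
# Figures quoted in the abstract.
TOTAL_HS_MES = 15_463          # HS-MEs identified in this study
INCREASE_OVER_PRIOR = 1.20     # "120% increase" over prior studies
GAIN_MBP = 15.0                # insertions + target site duplications + transductions
LOSS_MBP = 0.5                 # insertion-mediated deletions
TF_WITH_SITES = 146            # transcription factors with HS-ME-derived binding sites
TF_SURVEYED = 161              # transcription factors surveyed
BINDING_SITES = 3_124          # experimentally identified binding sites
HS_MES_WITH_SITES = 1_213      # HS-MEs contributing those sites

# Net genome size change: sequence gained minus sequence lost.
net_gain_mbp = GAIN_MBP - LOSS_MBP

# Assumption: new_total = prior_total * (1 + 1.20).
implied_prior = round(TOTAL_HS_MES / (1 + INCREASE_OVER_PRIOR))

# Assumption: all previously reported HS-MEs are retained in the new list,
# so the newly identified elements are simply the difference.
implied_new = TOTAL_HS_MES - implied_prior

# Share of surveyed transcription factors with at least one HS-ME-derived site.
tf_fraction = TF_WITH_SITES / TF_SURVEYED

# Average number of binding sites per contributing HS-ME.
sites_per_me = BINDING_SITES / HS_MES_WITH_SITES

print(f"net gain: {net_gain_mbp} Mbp")
print(f"implied prior HS-ME count: {implied_prior}")
print(f"implied newly identified HS-MEs: {implied_new}")
print(f"TF coverage: {tf_fraction:.1%}")
print(f"binding sites per contributing HS-ME: {sites_per_me:.2f}")
```

Under these assumptions the implied number of newly identified elements comes out above 8,000, in line with the abstract's "over 8,000", and the net size change matches the stated 14.5 Mbp.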

2009
Vol 10 (10)
pp. 691-703
Author(s):
Richard Cordaux
Mark A. Batzer

2015
Vol 14s1
pp. CIN.S24657
Author(s):
Wan-Ping Lee
Jiantao Wu
Gabor T. Marth

Mobile elements constitute more than 45% of the human genome as a result of repeated insertion events during human genome evolution. Although most mobile elements are fixed within the human population, some elements (including Alu, long interspersed element 1 (LINE-1, L1), and SVA) are still actively duplicating and may result in life-threatening human diseases such as cancer, motivating the need for accurate mobile-element insertion (MEI) detection tools. We developed a software package, TANGRAM, for MEI detection in next-generation sequencing data; it currently serves as the primary MEI detection tool in the 1000 Genomes Project. TANGRAM takes advantage of valuable mapping information provided by our own MOSAIK mapper, and until recently required MOSAIK mappings as its input. In this study, we report a new feature that enables TANGRAM to be used on alignments generated by any mainstream short-read mapper, making it accessible to many genomics users. To demonstrate its utility for cancer genome analysis, we applied TANGRAM to the TCGA (The Cancer Genome Atlas) mutation calling benchmark 4 dataset. TANGRAM is fast, accurate, easy to use, and open source at https://github.com/jiantao/Tangram.


2009
Vol 35 (6)
pp. 702-710
Author(s):
A. L. Amosova
A. Yu. Komkov
S. V. Ustyugova
I. Z. Mamedov
Yu. B. Lebedev

2014
Vol 13s4
pp. CIN.S13979
Author(s):
Wan-Ping Lee
Jiantao Wu
Gabor T. Marth
