Discovery of unfixed endogenous retrovirus insertions in diverse human populations

2016 ◽  
Vol 113 (16) ◽  
pp. E2326-E2334 ◽  
Author(s):  
Julia Halo Wildschutte ◽  
Zachary H. Williams ◽  
Meagan Montesion ◽  
Ravi P. Subramanian ◽  
Jeffrey M. Kidd ◽  
...  

Endogenous retroviruses (ERVs) have contributed to more than 8% of the human genome. The majority of these elements lack function due to accumulated mutations or internal recombination resulting in a solitary (solo) LTR, although members of one group of human ERVs (HERVs), HERV-K, were recently active with members that remain nearly intact, a subset of which is present as insertionally polymorphic loci that include approximately full-length (2-LTR) and solo-LTR alleles in addition to the unoccupied site. Several 2-LTR insertions have intact reading frames in some or all genes that are expressed as functional proteins. These properties reflect the activity of HERV-K and suggest the existence of additional unique loci within humans. We sought to determine the extent to which other polymorphic insertions are present in humans, using sequenced genomes from the 1000 Genomes Project and a subset of the Human Genome Diversity Project panel. We report analysis of a total of 36 nonreference polymorphic HERV-K proviruses, including 19 newly reported loci, with insertion frequencies ranging from <0.0005 to >0.75 that varied by population. Targeted screening of individual loci identified three new unfixed 2-LTR proviruses within our set, including an intact provirus present at Xq21.33 in some individuals, with the potential for retained infectivity.

2020 ◽  
Vol 12 (6) ◽  
pp. 779-794 ◽  
Author(s):  
W Scott Watkins ◽  
Julie E Feusier ◽  
Jainy Thomas ◽  
Clement Goubert ◽  
Swapon Mallick ◽  
...  

Abstract Ongoing retrotransposition of Alu, LINE-1, and SINE–VNTR–Alu elements generates diversity and variation among human populations. Previous analyses investigating the population genetics of mobile element insertions (MEIs) have been limited by population ascertainment bias or by relatively small numbers of populations and low sequencing coverage. Here, we use 296 individuals representing 142 global populations from the Simons Genome Diversity Project (SGDP) to discover and characterize MEI diversity from deeply sequenced whole-genome data. We report 5,742 MEIs not originally reported by the 1000 Genomes Project and show that high sampling diversity leads to a 4- to 7-fold increase in MEI discovery rates over the original 1000 Genomes Project data. As a result of negative selection, nonreference polymorphic MEIs are underrepresented within genes, and MEIs within genes are often found in the transcriptional orientation opposite that of the gene. Globally, 80% of Alu subfamilies predate the expansion of modern humans from Africa. Polymorphic MEIs show heterozygosity gradients that decrease from Africa to Eurasia to the Americas, and the number of MEIs found uniquely in a single individual is also distributed in this general pattern. The maximum fraction of MEI diversity partitioned among the seven major SGDP population groups (FST) is 7.4%, similar to, but slightly lower than, previous estimates and likely attributable to the diverse sampling strategy of the SGDP. Finally, we utilize these MEIs to extrapolate the primary Native American shared ancestry component back to Asia and provide new evidence from genome-wide identical-by-descent genetic markers that adds support for a southeastern Siberian origin for most Native Americans.


2000 ◽  
Vol 74 (8) ◽  
pp. 3715-3730 ◽  
Author(s):  
Michael Tristem

ABSTRACT Human endogenous retroviruses (HERVs) were first identified almost 20 years ago, and since then numerous families have been described. It has, however, been difficult to obtain a good estimate of both the total number of independently derived families and their relationship to each other as well as to other members of the family Retroviridae. In this study, I used sequence data derived from over 150 novel HERVs, obtained from the Human Genome Mapping Project database, and a variety of recently identified nonhuman retroviruses to classify the HERVs into 22 independently acquired families. Of these, 17 families were loosely assigned to the class I HERVs, 3 to the class II HERVs and 2 to the class III HERVs. Many of these families have been identified previously, but six are described here for the first time and another four, for which only partial sequence information was previously available, were further characterized. Members of each of the 10 families are defective, and calculation of their integration dates suggested that most of them are likely to have been present within the human lineage since it diverged from the Old World monkeys more than 25 million years ago.


2003 ◽  
Vol 77 (20) ◽  
pp. 11268-11273 ◽  
Author(s):  
Nikolai Klymiuk ◽  
Mathias Müller ◽  
Gottfried Brem ◽  
Bernhard Aigner

ABSTRACT Endogenous retrovirus (ERV) sequences have been found in all mammals. In vitro and in vivo experiments revealed ERV activation and cross-species infection in several species. Sheep (Ovis aries) are used for various biotechnological purposes; however, they have not yet been comprehensively screened for ERV sequences. Therefore, the aim of the study was to classify the ERV sequences in the ovine genome (OERV) by analyzing the retroviral pro-pol sequences. Three OERV β families and nine OERV γ families were revealed. Novel open reading frames (ORF) in the amplified proviral fragment were found in one OERV β family and two OERV γ families. Hybrid OERV produced by putative recombination events were not detected. Quantitative analysis of the OERV sequences in the ovine genome revealed no relevant variations in the endogenous retroviral loads of different breeds. Expression analysis of different tissues from fetal and pregnant sheep detected mRNA from both gammaretrovirus families, showing ORF fragments. Thus, the release of retroviruses from sheep cells cannot be excluded.


2019 ◽  
Vol 37 (1) ◽  
pp. 2-10 ◽  
Author(s):  
Luke Anderson-Trocmé ◽  
Rick Farouni ◽  
Mathieu Bourgey ◽  
Yoichiro Kamatani ◽  
Koichiro Higasa ◽  
...  

Abstract Recent reports have identified differences in the mutational spectra across human populations. Although some of these reports have been replicated in other cohorts, most have been reported only in the 1000 Genomes Project (1kGP) data. While investigating an intriguing putative population stratification within the Japanese population, we identified a previously unreported batch effect leading to spurious mutation calls in the 1kGP data and to the apparent population stratification. Because the 1kGP data are used extensively, we find that the batch effects also lead to incorrect imputation by leading imputation servers and a small number of suspicious GWAS associations. Lower quality data from the early phases of the 1kGP thus continue to contaminate modern studies in hidden ways. It may be time to retire or upgrade such legacy sequencing data.


2004 ◽  
Vol 78 (16) ◽  
pp. 8788-8798 ◽  
Author(s):  
Laurence Lavie ◽  
Patrik Medstrand ◽  
Werner Schempp ◽  
Eckart Meese ◽  
Jens Mayer

ABSTRACT The human genome harbors numerous distinct families of so-called human endogenous retroviruses (HERV) which are remnants of exogenous retroviruses that entered the germ line millions of years ago. We describe here the hitherto little-characterized betaretrovirus HERV-K(HML-5) family (named HERVK22 in Repbase) in greater detail. Out of 139 proviruses, only a few loci represent full-length proviruses, and many lack gag protease and/or env gene regions. We generated a consensus sequence from multiple alignment of 62 HML-5 loci that displays open reading frames for the four major retroviral proteins. Four HML-5 long terminal repeat (LTR) subfamilies were identified that are associated with monophyletic proviral bodies, implying different evolution of HML-5 LTRs and genes. Sequence analysis indicated that the proviruses formed approximately 55 million years ago. Accordingly, HML-5 proviral sequences were detected in Old World and New World primates but not in prosimians. No recent activity is associated with this HERV family. We also conclude that the HML-5 consensus sequence primer binding site is identical to methionine tRNA. Therefore, the family should be designated HERV-M. Our study provides important insights into the structure and evolution of the oldest betaretrovirus in the primate genome known to date.


2005 ◽  
Vol 79 (5) ◽  
pp. 2941-2949 ◽  
Author(s):  
Aline Flockerzi ◽  
Stefan Burkhardt ◽  
Werner Schempp ◽  
Eckart Meese ◽  
Jens Mayer

ABSTRACT The human genome harbors many distinct families of human endogenous retroviruses (HERVs) that stem from exogenous retroviruses that infected the germ line millions of years ago. Many HERV families remain to be investigated. We report in the present study the detailed characterization of the HERV-K14I and HERV-K14CI families as they are represented in the human genome. Most of the 68 HERV-K14I and 23 HERV-K14CI proviruses are severely mutated, frequently displaying uniform deletions of retroviral genes and long terminal repeats (LTRs). Both HERV families entered the germ line ∼39 million years ago, as evidenced by homologous sequences in hominoids and Old World primates and calculation of evolutionary ages based on a molecular clock. Proviruses of both families were formed during a brief period. A majority of HERV-K14CI proviruses on the Y chromosome mimic a higher evolutionary age, showing that LTR-LTR divergence data can indicate false ages. Fully translatable consensus sequences encoding major retroviral proteins were generated. Most HERV-K14I loci lack an env gene and are structurally reminiscent of LTR retrotransposons. A minority of HERV-K14I variants display an env gene. HERV-K14I proviruses are associated with three distinct LTR families, while HERV-K14CI is associated with a single LTR family. Hybrid proviruses consisting of HERV-K14I and HERV-W sequences that appear to have produced provirus progeny in the genome were detected. Several HERV-K14I proviruses harbor TRPC6 mRNA portions, exemplifying mobilization of cellular transcripts by HERVs. Our analysis contributes essential information on two more HERV families and on the biology of HERV sequences in general.


2008 ◽  
Vol 82 (17) ◽  
pp. 8762-8770 ◽  
Author(s):  
Young Nam Lee ◽  
Michael H. Malim ◽  
Paul D. Bieniasz

ABSTRACT Human endogenous retroviruses (HERVs) comprise approximately 8% of the human genome, but all are remnants of ancient retroviral infections and harbor inactivating mutations that render them replication defective. Nevertheless, as viral “fossils,” HERVs may provide insights into ancient retrovirus-host interactions and their evolution. Indeed, one endogenous retrovirus [HERV-K(HML-2)], which has replicated in humans for the past few million years but is now thought to be extinct, was recently reconstituted in a functional form, and infection assays based on it have been established. Here, we show that several human APOBEC3 proteins are intrinsically capable of mutating and inhibiting infection by HERV-K(HML-2) in cell culture. We also present striking evidence that two HERV-K(HML-2) proviruses that are fixed in the modern human genome (HERV-K60 and HERV-KI) were subjected to hypermutation by a cytidine deaminase. Inspection of the spectrum of mutations that are found in HERV-K proviruses in the human genome and HERV-K DNA generated during in vitro replication in the presence of each of the human APOBEC3 proteins unequivocally identifies APOBEC3G as the cytidine deaminase responsible for hypermutation of HERV-K60 and HERV-KI. This is a rare example of the antiretroviral effects of APOBEC3G in the setting of natural human infection, whose consequences have been fossilized in human DNA, and a striking example of inactivation of ancient retroviruses in humans through enzymatic cytidine deamination.


1999 ◽  
Vol 73 (2) ◽  
pp. 1175-1185 ◽  
Author(s):  
Jean-Luc Blond ◽  
Frédéric Besème ◽  
Laurent Duret ◽  
Olivier Bouton ◽  
Frédéric Bedin ◽  
...  

ABSTRACT The multiple sclerosis-associated retrovirus (MSRV) isolated from plasma of MS patients was found to be phylogenetically and experimentally related to human endogenous retroviruses (HERVs). To characterize the MSRV-related HERV family and to test the hypothesis of a replication-competent HERV, we have investigated the expression of MSRV-related sequences in healthy tissues. The expression of MSRV-related transcripts restricted to the placenta led to the isolation of overlapping cDNA clones from a cDNA library. These cDNAs spanned a 7.6-kb region containing gag, pol, and env genes; RU5 and U3R flanking sequences; a polypurine tract; and a primer binding site (PBS). As this PBS showed similarity to avian retrovirus PBSs used by tRNATrp, this new HERV family was named HERV-W. Several genomic elements were identified, one of them containing a complete HERV-W unit, spanning all cDNA clones. Elements of this multicopy family were not replication competent, as gag and pol open reading frames (ORFs) were interrupted by frameshifts and stop codons. A complete ORF putatively coding for an envelope protein was found both on the HERV-W DNA prototype and within an RU5-env-U3R polyadenylated cDNA clone. Placental expression of 8-, 3.1-, and 1.3-kb transcripts was observed, and a putative splicing strategy was described. The apparently tissue-restricted HERV-W long terminal repeat expression is discussed with respect to physiological and pathological contexts.


2019 ◽  
Vol 93 (16) ◽  
Author(s):  
Maria Paola Pisano ◽  
Nicole Grandi ◽  
Marta Cadeddu ◽  
Jonas Blomberg ◽  
Enzo Tramontano

ABSTRACT Eight percent of the human genome is composed of human endogenous retroviruses (HERVs), remnants of ancestral germ line infections by exogenous retroviruses, which have been vertically transmitted as Mendelian characters. The HML-6 group, a member of the class II betaretrovirus-like viruses, includes several proviral loci with an increased transcriptional activity in cancer and at least two elements that are known for retaining an intact open reading frame and for encoding small proteins such as ERVK3-1, which is expressed in various healthy tissues, and HERV-K-MEL, a small Env peptide expressed in samples of cutaneous and ocular melanoma but not in normal tissues. IMPORTANCE We reported the distribution and genetic composition of 66 HML-6 elements. We analyzed the phylogeny of the HML-6 sequences and identified two main clusters. We provided the first description of a Rec domain within the env sequence of 23 HML-6 elements. A Rec domain was also predicted within the ERVK3-1 transcript sequence, revealing its expression in various healthy tissues. Evidence about the context of insertion and colocalization of 19 HML-6 elements with functional human genes is also reported, including the sequence 16p11.2, whose 5′ long terminal repeat overlapped the exon of one transcript variant of a cellular zinc finger upregulated and involved in hepatocellular carcinoma. The present work provides the first complete overview of the HML-6 elements in GRCh37(hg19), describing the structure, phylogeny, and genomic context of insertion of each locus. This information allows a better understanding of the genetics of one of the most expressed HERV groups in the human genome.


2019 ◽  
Author(s):  
Luke Anderson-Trocmé ◽  
Rick Farouni ◽  
Mathieu Bourgey ◽  
Yoichiro Kamatani ◽  
Koichiro Higasa ◽  
...  

Abstract Recent reports have identified differences in the mutational spectra across human populations. While some of these reports have been replicated in other cohorts, most have been reported only in the 1000 Genomes Project (1kGP) data. While investigating an intriguing putative population stratification within the Japanese population, we identified a previously unreported batch effect leading to spurious mutation calls in the 1kGP data and to the apparent population stratification. Because the 1kGP data is used extensively, we find that the batch effects also lead to incorrect imputation by leading imputation servers and a small number of suspicious GWAS associations. Lower-quality data from the early phases of the 1kGP thus continues to contaminate modern studies in hidden ways. It may be time to retire or upgrade such legacy sequencing data.

