Comprehensive genomic analysis reveals dynamic evolution of endogenous retroviruses that code for retroviral-like protein domains

2019 ◽  
Author(s):  
Mahoko Takahashi Ueda ◽  
Kirill Kryukov ◽  
Satomi Mitsuhashi ◽  
Hiroaki Mitsuhashi ◽  
Tadashi Imanishi ◽  
...  

Abstract Endogenous retroviruses (ERVs) are remnants of ancient retroviral infections of mammalian germline cells. A large proportion of ERVs lose their open reading frames (ORFs), while others retain them and become exapted by the host species. However, it remains unclear what proportion of ERVs possess ORFs (ERV-ORFs), become transcribed, and serve as candidates for co-opted genes. Hence, we investigated characteristics of 176,401 ERV-ORFs containing retroviral-like protein domains (gag, pro, pol, and env) in 19 mammalian genomes. The fractions of ERVs possessing ORFs were overall small (∼0.15%) although they varied depending on domain types as well as species. The observed divergence of ERV-ORFs from their consensus sequences suggested that a large proportion of ERV-ORFs either recently or anciently inserted themselves into mammalian genomes. In contrast, very few ERVs lacking ORFs were found to exhibit similar divergence patterns. To identify ERV-ORFs transcribed as proteins, we compared ERV-ORFs with various multi-omics data including transcriptome data, trimethylation at histone H3 lysine 36, and transcription initiation sites from 2,834 cell types, and found 408 and 752 ERV-ORFs, accounting for 2-3% of all ERV-ORFs, with high transcriptional potential in humans and mice, respectively. Moreover, many of these ERV-ORFs with transcriptional potential were lineage-specific sequences exhibiting tissue-specific expression. These results suggest a possibility for the expression of uncharacterized functional genes containing ERV-ORFs hidden within mammalian genomes. Together, our analyses suggest that more ERV-ORFs may be co-opted in a host-species specific manner than we currently know, which are likely to have contributed to mammalian evolution and diversification.

Mobile DNA ◽  
2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Mahoko Takahashi Ueda ◽  
Kirill Kryukov ◽  
Satomi Mitsuhashi ◽  
Hiroaki Mitsuhashi ◽  
Tadashi Imanishi ◽  
...  

Abstract Background Endogenous retroviruses (ERVs) are remnants of ancient retroviral infections of mammalian germline cells. A large proportion of ERVs lose their open reading frames (ORFs), while others retain them and become exapted by the host species. However, it remains unclear what proportion of ERVs possess ORFs (ERV-ORFs), become transcribed, and serve as candidates for co-opted genes. Results We investigated characteristics of 176,401 ERV-ORFs containing retroviral-like protein domains (gag, pro, pol, and env) in 19 mammalian genomes. The fractions of ERVs possessing ORFs were overall small (~0.15%) although they varied depending on domain types as well as species. The observed divergence of ERV-ORFs from their consensus sequences showed bimodal distributions, suggesting that a large proportion of ERV-ORFs either recently, or anciently, inserted themselves into mammalian genomes. In contrast, very few ERVs lacking ORFs were found to exhibit similar divergence patterns. To identify candidates for ERV-derived genes, we estimated the ratio of non-synonymous to synonymous substitution rates (dN/dS) for ERV-ORFs in human and non-human mammalian pairs, and found that approximately 42% of the ERV-ORFs showed dN/dS < 1. Further, using functional genomics data including transcriptome sequencing, we determined that approximately 9.7% of these selected ERV-ORFs exhibited transcriptional potential. Conclusions These results suggest that purifying selection operates on a certain portion of ERV-ORFs, some of which may correspond to uncharacterized functional genes hidden within mammalian genomes. Together, our analyses suggest that more ERV-ORFs may be co-opted in a host-species specific manner than we currently know, which are likely to have contributed to mammalian evolution and diversification.
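
The dN/dS < 1 criterion mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the dN and dS estimates have already been computed by some external tool; the function name, the input layout, and the choice to skip pairs with dS = 0 are all hypothetical and are not taken from the paper.

```python
from typing import Iterable, Tuple


def fraction_under_purifying_selection(pairs: Iterable[Tuple[float, float]]) -> float:
    """Return the fraction of (dN, dS) pairs whose ratio dN/dS is below 1.

    Assumption: pairs with dS == 0 are skipped because the ratio is undefined.
    """
    usable = [(dn, ds) for dn, ds in pairs if ds > 0]
    if not usable:
        return 0.0
    selected = sum(1 for dn, ds in usable if dn / ds < 1)
    return selected / len(usable)


# Made-up illustrative values, not data from the study.
example = [(0.1, 0.5), (0.4, 0.2), (0.3, 0.0), (0.2, 0.4)]
print(fraction_under_purifying_selection(example))
```

A ratio below 1 is conventionally read as a sign of purifying selection, which is how the abstract interprets the ERV-ORFs that pass this filter.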


2011 ◽  
Vol 286 (41) ◽  
pp. 35543-35552 ◽  
Author(s):  
Carla J. Cohen ◽  
Rita Rebollo ◽  
Sonja Babovic ◽  
Elizabeth L. Dai ◽  
Wendy P. Robinson ◽  
...  

The long terminal repeat (LTR) sequences of endogenous retroviruses and retroelements contain promoter elements and are known to form chimeric transcripts with nearby cellular genes. Here we show that an LTR of the THE1D retroelement family has been domesticated as an alternative promoter of human IL2RB, the gene encoding the β subunit of the IL-2 receptor. The LTR promoter confers expression specifically in the placental trophoblast as opposed to its native transcription in the hematopoietic system. Rather than sequence-specific determinants, DNA methylation was found to regulate transcription initiation and splicing efficiency in a tissue-specific manner. Furthermore, we detected the cytoplasmic signaling domain of the IL-2Rβ protein in the placenta, suggesting that IL-2Rβ undergoes preferential proteolytic cleavage in this tissue. These findings implicate novel functions for this cytokine receptor subunit in the villous trophoblast and reveal an intriguing example of ancient LTR exaptation to drive tissue-specific gene expression.


2009 ◽  
Vol 90 (2) ◽  
pp. 334-346 ◽  
Author(s):  
Alejandra Garcia-Maruniak ◽  
Adly M. M. Abd-Alla ◽  
Tamer Z. Salem ◽  
Andrew G. Parker ◽  
Verena-Ulrike Lietze ◽  
...  

Glossina pallidipes and Musca domestica salivary gland hypertrophy viruses (GpSGHV and MdSGHV) replicate in the nucleus of salivary gland cells causing distinct tissue hypertrophy and reduction of host fertility. They share general characteristics with the non-occluded insect nudiviruses, such as being insect-pathogenic, having enveloped, rod-shaped virions, and large circular double-stranded DNA genomes. MdSGHV measures 65×550 nm and contains a 124 279 bp genome (∼44 mol% G+C content) that codes for 108 putative open reading frames (ORFs). GpSGHV, measuring 50×1000 nm, contains a 190 032 bp genome (28 mol% G+C content) with 160 putative ORFs. Comparative genomic analysis demonstrates that 37 MdSGHV ORFs have homology to 42 GpSGHV ORFs, as some MdSGHV ORFs have homology to two different GpSGHV ORFs. Nine genes with known functions (dnapol, ts, pif-1, pif-2, pif-3, mmp, p74, odv-e66 and helicase-2), a homologue of the conserved baculovirus gene Ac81 and at least 13 virion proteins are present in both SGHVs. The amino acid identity ranged from 19 to 39 % among ORFs. An (A/T/G)TAAG motif, similar to the baculovirus late promoter motif, was enriched 100 bp upstream of the ORF transcription initiation sites of both viruses. Six and seven putative microRNA sequences were found in MdSGHV and GpSGHV genomes, respectively. There was genome collinearity between the two SGHVs, but not between the SGHVs and the nudiviruses. Phylogenetic analysis of conserved genes clustered both SGHVs in a single clade separated from the nudiviruses and baculoviruses. Although MdSGHV and GpSGHV are different viruses, their pathology, host range and genome composition indicate that they are related.
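
As a rough illustration of the motif search described above, the sketch below counts (A/T/G)TAAG occurrences in the 100 bp upstream of a given position. It is not the authors' pipeline: the function name, the 0-based coordinate convention, and the forward-strand-only scan are assumptions made for the example.

```python
import re

# Lookahead pattern so that overlapping hits (e.g. ATAAGTAAG) are each counted.
MOTIF = re.compile(r"(?=[ATG]TAAG)")


def count_upstream_motifs(genome: str, site: int, window: int = 100) -> int:
    """Count [ATG]TAAG motifs in the `window` bases before 0-based index `site`.

    Assumption: only the forward strand is scanned, and the window is
    truncated at the start of the sequence.
    """
    start = max(0, site - window)
    upstream = genome[start:site].upper()
    return len(MOTIF.findall(upstream))


# Made-up toy sequence, not viral genome data.
toy = "CCCCATAAGTAAGCC"
print(count_upstream_motifs(toy, site=len(toy)))
```

Judging enrichment, as the abstract does, would additionally require comparing such counts against a background expectation, which this sketch does not attempt.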


2019 ◽  
Author(s):  
Adam G Diehl ◽  
Ningxin Ouyang ◽  
Alan P Boyle

Abstract Background Chromatin looping is exceedingly important to gene regulation and a host of other nuclear processes. Many recent insights into 3D chromatin structure across species and cell types have contributed to our understanding of the principles governing chromatin looping. However, 3D genome evolution and how it relates to Darwinian selection remain largely unexplored. CTCF, an insulator protein found at most loop anchors, has been described as the “master weaver” of mammalian genomes, and variations in CTCF occupancy are known to influence looping divergence. A large fraction of mammalian CTCF binding sites fall within transposable elements (TEs) but their contributions to looping variation are unknown. Here we investigated the effect of TE-driven CTCF binding site expansions on chromatin looping in human and mouse. Results TEs have broadly contributed to CTCF binding and loop boundary specification, primarily forming variable loops across species and cell types and contributing nearly 1/3 of species-specific and cell-specific loops. Conclusions Our results demonstrate that TE activity is a major source of looping variability across species and cell types. Thus, TE-mediated CTCF expansions explain a large fraction of population-level looping variation and may play a role in adaptive evolution.


2013 ◽  
Vol 79 (12) ◽  
pp. 3724-3733 ◽  
Author(s):  
Frank O. Aylward ◽  
Bradon R. McDonald ◽  
Sandra M. Adams ◽  
Alejandra Valenzuela ◽  
Rebeccah A. Schmidt ◽  
...  

ABSTRACT Sphingomonads comprise a physiologically versatile group within the Alphaproteobacteria that includes strains of interest for biotechnology, human health, and environmental nutrient cycling. In this study, we compared 26 sphingomonad genome sequences to gain insight into their ecology, metabolic versatility, and environmental adaptations. Our multilocus phylogenetic and average amino acid identity (AAI) analyses confirm that Sphingomonas, Sphingobium, Sphingopyxis, and Novosphingobium are well-resolved monophyletic groups with the exception of Sphingomonas sp. strain SKA58, which we propose belongs to the genus Sphingobium. Our pan-genomic analysis of sphingomonads reveals numerous species-specific open reading frames (ORFs) but few signatures of genus-specific cores. The organization and coding potential of the sphingomonad genomes appear to be highly variable, and plasmid-mediated gene transfer and chromosome-plasmid recombination, together with prophage- and transposon-mediated rearrangements, appear to play prominent roles in the genome evolution of this group. We find that many of the sphingomonad genomes encode numerous oxygenases and glycoside hydrolases, which are likely responsible for their ability to degrade various recalcitrant aromatic compounds and polysaccharides, respectively. Many of these enzymes are encoded on megaplasmids, suggesting that they may be readily transferred between species. We also identified enzymes putatively used for the catabolism of sulfonate and nitroaromatic compounds in many of the genomes, suggesting that plant-based compounds or chemical contaminants may be sources of nitrogen and sulfur. Many of these sphingomonads appear to be adapted to oligotrophic environments, but several contain genomic features indicative of host associations. Our work provides a basis for understanding the ecological strategies employed by sphingomonads and their role in environmental nutrient cycling.


1986 ◽  
Vol 6 (6) ◽  
pp. 2149-2157 ◽  
Author(s):  
A Heguy ◽  
A West ◽  
R I Richards ◽  
M Karin

The human metallothionein (MT) IB gene (hMT-IB) is located in a region of human DNA containing at least four tandemly arranged MT genes. As deduced from its sequence, hMT-IB is likely to encode a functional protein. However, the predicted amino acid sequence differed from the hMT-I amino acid sequence in four positions. Most remarkable was the presence of an additional cysteine. Like other MT genes, hMT-IB has at least two copies of the metal-responsive element upstream from the transcription initiation site. These elements probably are responsible for the metal responsiveness of the hMT-IB promoter, leading to inducible expression of fused heterologous genes. Unlike the hMT-IIA and hMT-IA genes described previously, which are expressed in many different cell types, a high level of expression of the endogenous hMT-IB gene could be detected only in human hepatoma and renal carcinoma cell lines. Therefore, this is the first MT gene described which exhibits tissue specificity of expression. This specificity is controlled by a cis-acting mechanism involving methylation, since incubation of nonexpressing cells with an inhibitor of DNA methylation led to activation of the hMT-IB gene. In support of this notion, we found that the 5' flanking region of the hMT-IB gene was highly methylated in HeLa cells, a nonexpressing cell type, but it was not methylated in a hepatoma (expressing) cell line.


2021 ◽  
Author(s):  
Jennifer L. Houtz ◽  
Jon G. Sanders ◽  
Anthony Denice ◽  
Andrew H. Moeller

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Laura Santini ◽  
Florian Halbritter ◽  
Fabian Titz-Teixeira ◽  
Toru Suzuki ◽  
Maki Asami ◽  
...  

Abstract In mammalian genomes, differentially methylated regions (DMRs) and histone marks including trimethylation of histone 3 lysine 27 (H3K27me3) at imprinted genes are asymmetrically inherited to control parentally-biased gene expression. However, neither parent-of-origin-specific transcription nor imprints have been comprehensively mapped at the blastocyst stage of preimplantation development. Here, we address this by integrating transcriptomic and epigenomic approaches in mouse preimplantation embryos. We find that seventy-one genes exhibit previously unreported parent-of-origin-specific expression in blastocysts (nBiX: novel blastocyst-imprinted expressed). Uniparental expression of nBiX genes disappears soon after implantation. Micro-whole-genome bisulfite sequencing (µWGBS) of individual uniparental blastocysts detects 859 DMRs. We further find that 16% of nBiX genes are associated with a DMR, whereas most are associated with parentally-biased H3K27me3, suggesting a role for Polycomb-mediated imprinting in blastocysts. nBiX genes are clustered: five clusters contained at least one published imprinted gene, and five clusters exclusively contained nBiX genes. These data suggest that early development undergoes a complex program of stage-specific imprinting involving different tiers of regulation.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Ruklanthi de Alwis ◽  
Li Liang ◽  
Omid Taghavian ◽  
Emma Werner ◽  
Hao Chung The ◽  
...  

Abstract Background Shigella is a major diarrheal pathogen for which there is presently no vaccine. Whole genome sequencing provides the ability to predict and derive novel antigens for use as vaccines. Here, we aimed to identify novel immunogenic Shigella antigens that could serve as Shigella vaccine candidates, either alone, or when conjugated to Shigella O-antigen. Methods Using a reverse vaccinology approach, where genomic analysis informed the Shigella immunome via an antigen microarray, we aimed to identify novel immunogenic Shigella antigens. A core genome analysis of Shigella species, pathogenic and non-pathogenic Escherichia coli, led to the selection of 234 predicted immunogenic Shigella antigens. These antigens were expressed and probed with acute and convalescent serum from microbiologically confirmed Shigella infections. Results Several Shigella antigens displayed IgG and IgA seroconversion, with no difference in sero-reactivity by sex or age. IgG sero-reactivity to key Shigella antigens was observed at birth, indicating transplacental antibody transfer. Six antigens (FepA, EmrK, FhuA, MdtA, NlpB, and CjrA) were identified in in vivo testing as capable of producing binding IgG and complement-mediated bactericidal antibody. Conclusions These findings provide six novel immunogenic Shigella proteins that could serve as candidate vaccine antigens, species-specific carrier proteins, or targeted adjuvants.

