A de novo evolved gene in the house mouse regulates female pregnancy cycles

eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
Chen Xie ◽  
Cemalettin Bekpen ◽  
Sven Künzel ◽  
Maryam Keshavarz ◽  
Rebecca Krebs-Wheaton ◽  
...  

The de novo emergence of new genes has been well documented through genomic analyses. However, a functional analysis, especially of very young protein-coding genes, is still largely lacking. Here, we identify a set of house mouse-specific protein-coding genes and assess their translation by ribosome profiling and mass spectrometry data. We functionally analyze one of them, Gm13030, which is specifically expressed in females in the oviduct. The interruption of the reading frame affects the transcriptional network in the oviducts at a specific stage of the estrous cycle. This includes the upregulation of Dcpp genes, which are known to stimulate the growth of preimplantation embryos. As a consequence, knockout females have their second litters after shorter times and have a higher infanticide rate. Given that Gm13030 shows no signs of positive selection, our findings support the hypothesis that a de novo evolved gene can directly adopt a function without much sequence adaptation.

2015 ◽  
Author(s):  
Lorenzo Calviello ◽  
Neelanjan Mukherjee ◽  
Emanuel Wyler ◽  
Henrik Zauber ◽  
Antje Hirsekorn ◽  
...  

RNA sequencing protocols allow for quantifying gene expression regulation at each individual step, from transcription to protein synthesis. Ribosome Profiling (Ribo-seq) maps the positions of translating ribosomes over the entire transcriptome. Despite its great potential, a rigorous statistical approach to identify translated regions by means of the characteristic three-nucleotide periodicity of Ribo-seq data is not yet available. To fill this gap, we developed RiboTaper, which quantifies the significance of periodic Ribo-seq reads via spectral analysis methods. We applied RiboTaper to newly generated, deep Ribo-seq data in HEK293 cells, to derive an extensive map of translation that covers Open Reading Frame (ORF) annotations for more than 11,000 protein-coding genes. We also find distinct ribosomal signatures for several hundred detected upstream ORFs and ORFs in annotated non-coding genes (ncORFs). Mass spectrometry data confirm that RiboTaper achieves excellent coverage of the cellular proteome and validate dozens of novel peptide products. Collectively, RiboTaper (available at https://ohlerlab.mdc-berlin.de/software/) is a powerful method for comprehensive de novo identification of actively used ORFs in the human genome.
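
The three-nucleotide periodicity mentioned in this abstract can be illustrated with a small sketch. This is not RiboTaper's statistical test; it is a plain discrete Fourier transform under simplifying assumptions, and the function name `periodicity_score` and the toy count profiles are hypothetical. It reports the share of the (mean-removed) signal power that sits at a period of exactly three positions.

```python
import cmath


def periodicity_score(counts):
    """Share of non-constant signal power at a period of 3 positions.

    `counts` is a hypothetical per-nucleotide read-count profile along an ORF.
    Returns a value between 0 and 1 (up to floating-point error).
    """
    # Trim to a multiple of 3 so that frequency 1/3 is an exact DFT bin.
    n = len(counts) - len(counts) % 3
    if n == 0:
        return 0.0
    x = counts[:n]

    # Remove the mean so the constant (DC) component does not count as power.
    mean = sum(x) / n
    centered = [v - mean for v in x]
    total = sum(v * v for v in centered)
    if total == 0:
        return 0.0  # flat profile: no periodic signal at all

    # DFT coefficient at bin k = n/3, i.e. frequency 1/3 cycles per nucleotide.
    k = n // 3
    coef = sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
               for t, v in enumerate(centered))

    # For a real signal, bins k and n-k carry equal power; by Parseval's
    # theorem the total spectral power equals n * total.
    return 2 * abs(coef) ** 2 / (n * total)


if __name__ == "__main__":
    in_frame = [10, 1, 1] * 20   # reads concentrated on one codon position
    alternating = [1, 0] * 30    # period-2 pattern, no codon periodicity
    print(periodicity_score(in_frame))
    print(periodicity_score(alternating))
```

A profile with reads concentrated on one codon position scores close to 1, while a profile without codon periodicity scores close to 0. A real analysis would also need a significance test, which this sketch omits.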


2020 ◽  
Author(s):  
G Loughran ◽  
AV Zhdanov ◽  
MS Mikhaylova ◽  
FN Rozov ◽  
PN Datskevich ◽  
...  

Abstract: While near cognate codons are frequently used for translation initiation in eukaryotes, their efficiencies are usually low (<10% compared to an AUG in optimal context). Here we describe a rare case of highly efficient near cognate initiation. A CUG triplet located in the 5’ leader of POLG mRNA initiates almost as efficiently (~60-70%) as an AUG in optimal context. This CUG directs translation of a conserved 260-triplet-long overlapping ORF, which we call POLGARF (POLG Alternative Reading Frame). Translation of a short upstream ORF 5’ of this CUG governs the ratio between DNA polymerase and POLGARF produced from a single POLG mRNA. Functional investigation of POLGARF points to extracellular signalling. While unprocessed POLGARF resides in the nucleoli together with its interacting partner C1QBP, serum stimulation results in rapid secretion of the POLGARF C-terminal fragment. Phylogenetic analysis shows that POLGARF evolved ~160 million years ago due to an MIR transposition into the 5’ leader sequence of the mammalian POLG gene, which became fixed in placental mammals. The discovery of POLGARF unveils a previously undescribed mechanism of de novo protein-coding gene evolution.

Significance Statement: In this study, we describe a previously unknown mechanism of de novo protein-coding gene evolution. We show that the POLG gene, which encodes the catalytic subunit of mitochondrial DNA polymerase, is in fact a dual coding gene. Ribosome profiling, phylogenetic conservation, and reporter construct analyses all demonstrate that POLG mRNA possesses a conserved CUG codon which serves as a start of translation for an exceptionally long overlapping open reading frame (260 codons in human) present in all placental mammals. We called the protein encoded in this alternative reading frame POLGARF. We provide evidence that the evolution of POLGARF was initiated by the insertion of an MIR transposable element of the SINE family.


2019 ◽  
Vol 37 (4) ◽  
pp. 1148-1164
Author(s):  
Liam Abrahams ◽  
Laurence D Hurst

Abstract: Although the constraints on a gene’s sequence are often assumed to reflect the functioning of that gene, here we propose transfer selection, a constraint operating on one class of genes transferred to another, mediated by shared binding factors. We show that such transfer can explain an otherwise paradoxical depletion of stop codons in long intergenic noncoding RNAs (lincRNAs). Serine/arginine-rich proteins direct the splicing machinery by binding exonic splice enhancers (ESEs) in immature mRNA. As coding exons cannot contain stop codons in one reading frame, stop codons should be rare within ESEs. We confirm that the stop codon density (SCD) in ESE motifs is low, even accounting for nucleotide biases. Given that serine/arginine-rich proteins binding ESEs also facilitate lincRNA splicing, a low SCD could transfer to lincRNAs. As predicted, multiexon lincRNA exons are depleted in stop codons, a result not explained by open reading frame (ORF) contamination. Consistent with transfer selection, stop codon depletion in lincRNAs is most acute in exonic regions with the highest ESE density, disappears when ESEs are masked, is consistent with stop codon usage skews in ESEs, and is diminished in both single-exon lincRNAs and introns. Owing to low SCD, the maximum lengths of pseudo-ORFs frequently exceed null expectations. This has implications for ORF annotation and the evolution of de novo protein-coding genes from lincRNAs. We conclude that not all constraints operating on genes need be explained by the functioning of the gene but may instead be transferred owing to shared binding factors.
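
To make the notion of stop codon density concrete, here is a minimal sketch. The exact definition used by the authors is not given in the abstract, so this assumes one plausible reading: the fraction of overlapping trinucleotide windows, across all three frames, that spell TAA, TAG, or TGA. The function name `stop_codon_density` and the example sequences are hypothetical.

```python
# Assumed definition: stop trinucleotides counted in every overlapping
# 3-nt window (all frames), divided by the number of windows.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_codon_density(seq):
    """Fraction of overlapping 3-nt windows in `seq` that are stop codons."""
    # Normalise case and accept RNA input by mapping U to T.
    seq = seq.upper().replace("U", "T")
    windows = len(seq) - 2
    if windows <= 0:
        return 0.0  # too short to contain any trinucleotide
    hits = sum(seq[i:i + 3] in STOP_CODONS for i in range(windows))
    return hits / windows


if __name__ == "__main__":
    # Windows: ATG, TGT, GTA, TAA, AAG, AGG, GGG -> one stop in seven.
    print(stop_codon_density("ATGTAAGGG"))
```

Comparing this density between a set of motifs and nucleotide-matched control sequences would be one way to ask whether stop codons are depleted, though the null model used in the paper may differ.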


2019 ◽  
Author(s):  
Chen Xie ◽  
Cemalettin Bekpen ◽  
Sven Künzel ◽  
Maryam Keshavarz ◽  
Rebecca Krebs-Wheaton ◽  
...  

Abstract: The de novo emergence of new transcripts has been well documented through genomic analyses. However, a functional analysis, especially of very young protein-coding genes, is still largely lacking. Here we focus on three loci that have evolved from previously intergenic sequences in the house mouse (Mus musculus) and are not present in its closest relatives. We have obtained knockouts and analyzed their phenotypes, including a deep transcriptomic analysis, based on a dedicated power analysis. We show that the transcriptional networks are significantly disturbed in the knockouts and that all three genes have effects on phenotypes that are related to their expression patterns. This includes behavioral effects, skeletal differences and the regulation of the reproduction cycle in females. Substitution analysis suggests that all three genes have directly obtained an activity, without new adaptive substitutions. Our findings support the hypothesis that de novo genes can quickly adopt functions without extensive adaptation.

Impact statement: New protein-coding genes emerging out of non-coding sequences can become directly functional without signatures of adaptive protein changes.


2017 ◽  
Author(s):  
Zhengtao Xiao ◽  
Rongyao Huang ◽  
Yuling Chen ◽  
Haiteng Deng ◽  
Xuerui Yang

Abstract: By capturing and sequencing the RNA fragments protected by translating ribosomes, ribosome profiling sketches the landscape of translation at subcodon resolution. We developed a new method, RiboCode, which uses ribosome profiling data to assess the translation of each RNA transcript genome-wide. As shown by multiple tests with simulated data and cell type-specific QTI-seq and mass spectrometry data, RiboCode exhibits superior efficiency, sensitivity, and accuracy for de novo annotation of the translatome, which covers various types of novel ORFs in the previously annotated coding and non-coding regions and overlapping ORFs. Finally, to showcase its application, we applied RiboCode to a published ribosome profiling dataset and assembled the context-dependent translatomes of yeast under normal conditions, heat shock, and oxidative stress. Comparisons among these translatomes revealed stress-activated novel upstream and downstream ORFs, some of which are associated with potential translational dysregulations of the main protein-coding ORFs in response to the stress signals.


2015 ◽  
Vol 370 (1678) ◽  
pp. 20140332 ◽  
Author(s):  
Aoife McLysaght ◽  
Daniele Guerzoni

The origin of novel protein-coding genes de novo was once considered so improbable as to be impossible. In less than a decade, and especially in the last five years, this view has been overturned by extensive evidence from diverse eukaryotic lineages. There is now evidence that this mechanism has contributed a significant number of genes to genomes of organisms as diverse as Saccharomyces, Drosophila, Plasmodium, Arabidopsis and human. From simple beginnings, these genes have in some instances acquired complex structure, regulated expression and important functional roles. New genes are often thought of as dispensable late additions; however, some recent de novo genes in human can play a role in disease. Rather than an extremely rare occurrence, it is now evident that there is a relatively constant trickle of proto-genes released into the testing ground of natural selection. It is currently unknown whether de novo genes arise primarily through an ‘RNA-first’ or ‘ORF-first’ pathway. Either way, evolutionary tinkering with this pool of genetic potential may have been a significant player in the origins of lineage-specific traits and adaptations.


2019 ◽  
Author(s):  
Ryan Bracewell ◽  
Anita Tran ◽  
Kamalakar Chatla ◽  
Doris Bachtrog

Abstract: The Drosophila obscura species group is one of the most studied clades of Drosophila and harbors multiple distinct karyotypes. Here we present a de novo genome assembly and annotation of D. bifasciata, a species which represents an important subgroup for which no high-quality chromosome-level genome assembly currently exists. We combined long-read sequencing (Nanopore) and Hi-C scaffolding to achieve a highly contiguous genome assembly approximately 193 Mb in size, with repetitive elements constituting 30.1% of the total length. Drosophila bifasciata harbors four large metacentric chromosomes and the small dot, and our assembly contains each chromosome in a single scaffold, including the highly repetitive pericentromeres, which were largely composed of Jockey and Gypsy transposable elements. We annotated a total of 12,821 protein-coding genes, and comparisons of synteny with D. athabasca orthologs show that the large metacentric pericentromeric regions of multiple chromosomes are conserved between these species. Importantly, Muller A (X chromosome) was found to be metacentric in D. bifasciata, and the pericentromeric region appears homologous to the pericentromeric region of the fused Muller A-AD (XL and XR) of pseudoobscura/affinis subgroup species. Our finding suggests that a metacentric ancestral X fused to a telocentric Muller D and created the large neo-X (Muller A-AD) chromosome ∼15 MYA. We also confirm the fusion of Muller C and D in D. bifasciata and show that it likely involved a centromere-centromere fusion.


2021 ◽  
Author(s):  
VISHNU PRASOODANAN P K ◽  
Shruti S. Menon ◽  
Rituja Saxena ◽  
Prashant Waiker ◽  
Vineet K Sharma

Discovery of novel thermophiles has shown promising applications in the field of biotechnology. Because of their thermal stability, they can survive harsh industrial processes, which makes them important to characterize and study. Members of Anoxybacillus are alkaline-tolerant thermophiles and have been extensively isolated from manure, dairy-processing plants, and geothermal hot springs. This article reports the assembled data of an aerobic bacterium, Anoxybacillus sp. strain MB8, isolated from the Tattapani hot springs in Central India, whose 16S rRNA gene shares an identity of 97% (99% coverage) with Anoxybacillus kamchatkensis strain G10. The de novo assembly of the genome of Anoxybacillus sp. strain MB8 comprises 2,898,780 bp (in 190 contigs) with a GC content of 41.8%, and its annotation includes 2,976 protein-coding genes, 1 rRNA operon, 73 tRNAs, 1 tm-RNA and 10 CRISPR arrays. The predicted protein-coding genes have been classified into 21 eggNOG categories. The KEGG Automated Annotation Server (KAAS) analysis indicated the presence of an assimilatory sulfate reduction pathway, a nitrate-reducing pathway, and genes for glycoside hydrolases (GHs) and glycoside transferases (GTs). GHs and GTs have widespread applications in the baking and food industry for bread manufacturing, and in the paper, detergent and cosmetic industries. Hence, Anoxybacillus sp. strain MB8 holds the potential to be screened and characterized for such commercially relevant enzymes.

