Tracing the De Novo Origin of Protein-Coding Genes in Yeast

ABSTRACT De novo genes are very important for evolutionary innovation. However, how these genes originate and spread remains largely unknown. To better understand this, we rigorously searched for de novo genes in Saccharomyces cerevisiae S288C and examined their spread and fixation in the population. Here, we identified 84 de novo genes in S. cerevisiae S288C since the divergence with their sister groups. Transcriptome and ribosome profiling data revealed at least 8 (10%) and 28 (33%) de novo genes being expressed and translated only under specific conditions, respectively. DNA microarray data, based on 2-fold change, showed that 87% of the de novo genes are regulated during various biological processes, such as nutrient utilization and sporulation. Our comparative and evolutionary analyses further revealed that some factors, including single nucleotide polymorphism (SNP)/indel mutation, high GC content, and DNA shuffling, contribute to the birth of de novo genes, while domestication and natural selection drive the spread and fixation of these genes. Finally, we also provide evidence suggesting the possible parallel origin of a de novo gene between S. cerevisiae and Saccharomyces paradoxus. Together, our study provides several new insights into the origin and spread of de novo genes. IMPORTANCE Emergence of de novo genes has occurred in many lineages during evolution, but the birth, spread, and function of these genes remain unresolved. Here we have searched for de novo genes from Saccharomyces cerevisiae S288C using rigorous methods, which reduced the effects of bad annotation and genomic gaps on the identification of de novo genes. Through this analysis, we have found 84 new genes originating de novo from previously noncoding regions, 87% of which are very likely involved in various biological processes. 
We noticed that 10% and 33% of de novo genes were expressed and translated, respectively, only under specific conditions; therefore, verification of de novo genes through transcriptome and ribosome profiling, especially from limited expression data, may underestimate the number of bona fide new genes. We further show that SNP/indel mutation, high GC content, and DNA shuffling could be involved in the birth of de novo genes, while domestication and natural selection drive the spread and fixation of these genes. Finally, we provide evidence suggesting the possible parallel origin of a new gene.
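The percentages quoted in this abstract can be checked against the reported counts with a few lines of Python. This is only an arithmetic sketch: the counts 84, 8, and 28 come from the abstract, while the conversion of 87% into a gene count assumes that the percentage is taken over all 84 de novo genes, which the abstract does not state explicitly.

```python
# Counts reported in the abstract.
TOTAL_DE_NOVO = 84          # de novo genes identified in S. cerevisiae S288C
CONDITION_EXPRESSED = 8     # expressed only under specific conditions
CONDITION_TRANSLATED = 28   # translated only under specific conditions


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


def count_from_percent(pct: float, whole: int) -> int:
    """Approximate gene count implied by a percentage of `whole`.

    Assumption: the percentage refers to the full set of `whole` genes.
    """
    return round(pct / 100 * whole)


if __name__ == "__main__":
    print(percent(CONDITION_EXPRESSED, TOTAL_DE_NOVO))
    print(percent(CONDITION_TRANSLATED, TOTAL_DE_NOVO))
    print(count_from_percent(87, TOTAL_DE_NOVO))
```

The first two values reproduce the rounded 10% and 33% figures; the third gives the approximate number of regulated genes under the stated assumption.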


Exploiting the Diversity of Saccharomycotina Yeasts To Engineer Biotin-Independent Growth of Saccharomyces cerevisiae

Applied and Environmental Microbiology ◽

10.1128/aem.00270-20 ◽

2020 ◽

Vol 86 (12) ◽

Cited By ~ 1

Author(s):

Anna K. Wronska ◽

Meinske P. Haak ◽

Ellen Geraats ◽

Eva Bruins Slot ◽

Marcel van den Broek ◽

...

Keyword(s):

Saccharomyces Cerevisiae ◽

Large Scale ◽

De Novo ◽

Laboratory Strain ◽

Optimal Growth ◽

Industrial Applications ◽

Fast Growth ◽

Growth Media ◽

Free Medium ◽


ABSTRACT Biotin, an important cofactor for carboxylases, is essential for all kingdoms of life. Since native biotin synthesis does not always suffice for fast growth and product formation, microbial cultivation in research and industry often requires supplementation of biotin. De novo biotin biosynthesis in yeasts is not fully understood, which hinders attempts to optimize the pathway in these industrially relevant microorganisms. Previous work based on laboratory evolution of Saccharomyces cerevisiae for biotin prototrophy identified Bio1, whose catalytic function remains unresolved, as a bottleneck in biotin synthesis. This study aimed at eliminating this bottleneck in the S. cerevisiae laboratory strain CEN.PK113-7D. A screening of 35 Saccharomycotina yeasts identified six species that grew fast without biotin supplementation. Overexpression of the S. cerevisiae BIO1 (ScBIO1) ortholog isolated from one of these biotin prototrophs, Cyberlindnera fabianii, enabled fast growth of strain CEN.PK113-7D in biotin-free medium. Similar results were obtained by single overexpression of C. fabianii BIO1 (CfBIO1) in other laboratory and industrial S. cerevisiae strains. However, biotin prototrophy was restricted to aerobic conditions, probably reflecting the involvement of oxygen in the reaction catalyzed by the putative oxidoreductase CfBio1. In aerobic cultures on biotin-free medium, S. cerevisiae strains expressing CfBio1 showed a decreased susceptibility to contamination by biotin-auxotrophic S. cerevisiae. This study illustrates how the vast Saccharomycotina genomic resources may be used to improve physiological characteristics of industrially relevant S. cerevisiae. IMPORTANCE The reported metabolic engineering strategy to enable optimal growth in the absence of biotin is of direct relevance for large-scale industrial applications of S. cerevisiae. 
Important benefits of biotin prototrophy include cost reduction during the preparation of chemically defined industrial growth media as well as a lower susceptibility of biotin-prototrophic strains to contamination by auxotrophic microorganisms. The observed oxygen dependency of biotin synthesis by the engineered strains is relevant for further studies on the elucidation of fungal biotin biosynthesis pathways.


A de novo evolved gene in the house mouse regulates female pregnancy cycles

eLife ◽

10.7554/elife.44392 ◽

2019 ◽

Vol 8 ◽

Cited By ~ 4

Author(s):

Chen Xie ◽

Cemalettin Bekpen ◽

Sven Künzel ◽

Maryam Keshavarz ◽

Rebecca Krebs-Wheaton ◽

...

Keyword(s):

House Mouse ◽

De Novo ◽

Specific Protein ◽

Ribosome Profiling ◽

Mass Spectrometry Data ◽

Preimplantation Embryos ◽

Protein Coding ◽

Reading Frame ◽

Protein Coding Genes ◽

New Genes

The de novo emergence of new genes has been well documented through genomic analyses. However, a functional analysis, especially of very young protein-coding genes, is still largely lacking. Here, we identify a set of house mouse-specific protein-coding genes and assess their translation by ribosome profiling and mass spectrometry data. We functionally analyze one of them, Gm13030, which is specifically expressed in females in the oviduct. The interruption of the reading frame affects the transcriptional network in the oviducts at a specific stage of the estrous cycle. This includes the upregulation of Dcpp genes, which are known to stimulate the growth of preimplantation embryos. As a consequence, knockout females have their second litters after shorter times and have a higher infanticide rate. Given that Gm13030 shows no signs of positive selection, our findings support the hypothesis that a de novo evolved gene can directly adopt a function without much sequence adaptation.


Leveraging Genetic-Background Effects in Saccharomyces cerevisiae To Improve Lignocellulosic Hydrolysate Tolerance

Applied and Environmental Microbiology ◽

10.1128/aem.01603-16 ◽

2016 ◽

Vol 82 (19) ◽

pp. 5838-5849 ◽

Cited By ~ 13

Author(s):

Maria Sardi ◽

Nikolay Rovinskiy ◽

Yaoping Zhang ◽

Audrey P. Gasch

Keyword(s):

Saccharomyces Cerevisiae ◽

Stress Tolerance ◽

Genetic Background ◽

Biofuel Production ◽

Gene Overexpression ◽

Lignocellulosic Hydrolysate ◽


Cellular Targets ◽

Specific Effects ◽

New Genes

ABSTRACT A major obstacle to sustainable lignocellulosic biofuel production is microbe inhibition by the combinatorial stresses in pretreated plant hydrolysate. Chemical biomass pretreatment releases a suite of toxins that interact with other stressors, including high osmolarity and temperature, which together can have poorly understood synergistic effects on cells. Improving tolerance in industrial strains has been hindered, in part because the mechanisms of tolerance reported in the literature often fail to recapitulate in other strain backgrounds. Here, we explored and then exploited variations in stress tolerance, toxin-induced transcriptomic responses, and fitness effects of gene overexpression in different Saccharomyces cerevisiae (yeast) strains to identify genes and processes linked to tolerance of hydrolysate stressors. Using six different S. cerevisiae strains that together maximized phenotypic and genetic diversity, first we explored transcriptomic differences between resistant and sensitive strains to identify common and strain-specific responses. This comparative analysis implicated primary cellular targets of hydrolysate toxins, secondary effects of defective defense strategies, and mechanisms of tolerance. Dissecting the responses to individual hydrolysate components across strains pointed to synergistic interactions between osmolarity, pH, hydrolysate toxins, and nutrient composition. By characterizing the effects of high-copy gene overexpression in three different strains, we revealed the breadth of the background-specific effects of gene fitness contributions in synthetic hydrolysate. Our approach identified new genes for engineering improved stress tolerance in diverse strains while illuminating the effects of genetic background on molecular mechanisms. IMPORTANCE Recent studies on natural variation within Saccharomyces cerevisiae have uncovered substantial phenotypic diversity. 
Here, we took advantage of this diversity, using it as a tool to infer the effects of combinatorial stress found in lignocellulosic hydrolysate. By comparing sensitive and tolerant strains, we implicated primary cellular targets of hydrolysate toxins and elucidated the physiological states of cells when exposed to this stress. We also explored the strain-specific effects of gene overexpression to further identify strain-specific responses to hydrolysate stresses and to identify genes that improve hydrolysate tolerance independent of strain background. This study underscores the importance of studying multiple strains to understand the effects of hydrolysate stress and provides a method to find genes that improve tolerance across strain backgrounds.


De Novo Assembly of the Pneumocystis jirovecii Genome from a Single Bronchoalveolar Lavage Fluid Specimen from a Patient

mBio ◽

10.1128/mbio.00428-12 ◽

2012 ◽

Vol 4 (1) ◽

Cited By ~ 68

Author(s):

Ousmane H. Cissé ◽

Marco Pagni ◽

Philippe M. Hauser

Keyword(s):

Bronchoalveolar Lavage ◽

Genome Sequence ◽

Bronchoalveolar Lavage Fluid ◽

De Novo ◽

Gc Content ◽

Lavage Fluid ◽

New Drugs ◽

Pneumocystis Jirovecii ◽


ABSTRACT Pneumocystis jirovecii is a fungus that causes severe pneumonia in immunocompromised patients. However, its study is hindered by the lack of an in vitro culture method. We report here the genome of P. jirovecii that was obtained from a single bronchoalveolar lavage fluid specimen from a patient. The major challenge was the in silico sorting of the reads from a mixture representing the different organisms of the lung microbiome. This genome lacks virulence factors and most amino acid biosynthesis enzymes and presents reduced GC content and size. Together with epidemiological observations, these features suggest that P. jirovecii is an obligate parasite specialized in the colonization of human lungs, which causes disease only in immune-deficient individuals. This genome sequence will boost research on this deadly pathogen. IMPORTANCE Pneumocystis pneumonia is a major cause of mortality in patients with impaired immune systems. The availability of the P. jirovecii genome sequence allows new analyses to be performed which open avenues to solve critical issues for this deadly human disease. The most important ones are (i) identification of nutritional supplements for development of culture in vitro, which is still lacking 100 years after discovery of the pathogen; (ii) identification of new targets for development of new drugs, given the paucity of present treatments and emerging resistance; and (iii) identification of targets for development of vaccines.


Secretion of Quinolinic Acid, an Intermediate in the Kynurenine Pathway, for Utilization in NAD+ Biosynthesis in the Yeast Saccharomyces cerevisiae

Eukaryotic Cell ◽

10.1128/ec.00339-12 ◽

2013 ◽

Vol 12 (5) ◽

pp. 648-653 ◽

Cited By ~ 24

Author(s):

Kazuto Ohashi ◽

Shigeyuki Kawai ◽

Kousaku Murata

Keyword(s):

Saccharomyces Cerevisiae ◽

Nicotinic Acid ◽

Quinolinic Acid ◽

De Novo ◽

Kynurenine Pathway ◽

Salvage Pathway ◽

Yeast Saccharomyces Cerevisiae ◽


High Affinity ◽

Transcriptional Induction

ABSTRACT NAD+ is synthesized from tryptophan either via the kynurenine (de novo) pathway or via the salvage pathway by reutilizing intermediates such as nicotinic acid or nicotinamide ribose. Quinolinic acid is an intermediate in the kynurenine pathway. We have discovered that the budding yeast Saccharomyces cerevisiae secretes quinolinic acid into the medium and also utilizes extracellular quinolinic acid as a novel NAD+ precursor. We provide evidence that extracellular quinolinic acid enters the cell via Tna1, a high-affinity nicotinic acid permease, and thereby helps to increase the intracellular concentration of NAD+. Transcription of genes involved in the kynurenine pathway and Tna1 was increased, responding to a low intracellular NAD+ concentration, in cells bearing mutations of these genes; this transcriptional induction was suppressed by supplementation with quinolinic acid or nicotinic acid. Our data thus shed new light on the significance of quinolinic acid, which had previously been recognized only as an intermediate in the kynurenine pathway.


The Dysferlin Domain-Only Protein, Spo73, Is Required for Prospore Membrane Extension in Saccharomyces cerevisiae

mSphere ◽

10.1128/msphere.00038-15 ◽

2015 ◽

Vol 1 (1) ◽

Cited By ~ 1

Author(s):

Yuuya Okumura ◽

Tsuyoshi S. Nakamura ◽

Takayuki Tanaka ◽

Ichiro Inoue ◽

Yasuyuki Suda ◽

...

Keyword(s):

Saccharomyces Cerevisiae ◽

De Novo ◽

Developmental Process ◽

Spore Wall ◽

Membrane Formation ◽


Prospore Membrane ◽

Proper Size ◽

Membrane Extension ◽

Extension Analysis

ABSTRACT Prospore membrane formation consists of de novo double-membrane formation, which occurs during the developmental process of sporulation in Saccharomyces cerevisiae. Membranes are formed into their proper size and shape, and thus, prospore membrane formation has been studied as a general model of membrane formation. We identified SPO73, previously shown to be required for spore wall formation, as an additional gene involved in prospore membrane extension. Genetic and cell biological analyses suggested that Spo73 functions on the prospore membrane with other factors in prospore membrane extension, counteracting the bending force of the prospore membrane. Spo73 is the first dysferlin domain-only protein ever analyzed. The dysferlin domain is conserved from yeast to mammals and is found in dysferlin proteins, which are involved in dysferlinopathy, although the precise function of the domain is unknown. Continued analysis of Spo73 will contribute to our understanding of the function of dysferlin domains and dysferlinopathy. Sporulation of Saccharomyces cerevisiae is a developmental process in which an ascus containing four haploid spores forms from a diploid cell. During this process, newly formed membrane structures called prospore membranes extend along the nuclear envelope and engulf and package daughter nuclei along with cytosol and organelles to form precursors of spores. Proteins involved in prospore membrane extension, Vps13 and Spo71, have recently been reported; however, the overall mechanism of membrane extension remains unclear. Here, we identified Spo73 as an additional factor involved in prospore membrane extension. Analysis of a spo73∆ mutant revealed that it shows defects similar to those of a spo71∆ mutant during prospore membrane formation. Spo73 localizes to the prospore membrane, and this localization is independent of Spo71 and Vps13. 
In contrast, a Spo73 protein carrying mutations in a surface basic patch mislocalizes to the cytoplasm and overexpression of Spo71 can partially rescue localization to the prospore membrane. Similar to spo71∆ mutants, spo73∆ mutants display genetic interactions with the mutations in the SMA2 and SPO1 genes involved in prospore membrane bending. Further, our bioinformatic analysis revealed that Spo73 is a dysferlin domain-only protein. Thus, these results suggest that a dysferlin domain-only protein, Spo73, functions with a dual pleckstrin homology domain protein, Spo71, in prospore membrane extension. Analysis of Spo73 will provide insights into the conserved function of dysferlin domains, which is related to dysferlinopathy.


Deep sequencing of natural and experimental populations of Drosophila melanogaster reveals biases in the spectrum of new mutations

10.1101/095182 ◽

2016 ◽

Author(s):

Zoe June Assaf ◽

Susanne Tilk ◽

Jane Park ◽

Mark L. Siegal ◽

Dmitri A. Petrov

Keyword(s):

Drosophila Melanogaster ◽

Natural Selection ◽

De Novo ◽

Natural Populations ◽

Gc Content ◽

Mutation Accumulation ◽

Sequencing Error ◽

Low Frequencies ◽

Nucleotide Mutation ◽

New Mutations

Abstract Mutations provide the raw material of evolution, and thus our ability to study evolution depends fundamentally on whether we have precise measurements of mutational rates and patterns. Here we explore the rates and patterns of mutations using i) de novo mutations from Drosophila melanogaster mutation accumulation lines and ii) polymorphisms segregating at extremely low frequencies. The first, mutation accumulation (MA) lines, are the product of maintaining flies in tiny populations for many generations, therefore rendering natural selection ineffective and allowing new mutations to accrue in the genome. In addition to generating a novel dataset of sequenced MA lines, we perform a meta-analysis of all published MA studies in D. melanogaster, which allows more precise estimates of mutational patterns across the genome. In the second half of this work, we identify polymorphisms segregating at extremely low frequencies using several publicly available population genomic data sets from natural populations of D. melanogaster. Extremely rare polymorphisms are difficult to detect with high confidence due to the problem of distinguishing them from sequencing error; however, a dataset of true rare polymorphisms would allow the quantification of mutational patterns. This is because rare polymorphisms, much like de novo mutations, are on average younger and also relatively unaffected by the filter of natural selection. We identify a high quality set of ~70,000 rare polymorphisms, fully validated with resequencing, and use this dataset to measure mutational patterns in the genome. This includes identifying a high rate of multi-nucleotide mutation events at both short (~5bp) and long (~1kb) genomic distances, showing that mutation drives GC content lower in already GC-poor regions, and finding that the context-dependency of the mutation spectrum predicts long-term evolutionary patterns at four-fold synonymous sites. 
We also show that de novo mutations from independent mutation accumulation experiments display similar patterns of single nucleotide mutation, and match well the patterns of mutation found in natural populations.


Regulation of Amino Acid Transport in Saccharomyces cerevisiae

Microbiology and Molecular Biology Reviews ◽

10.1128/mmbr.00024-19 ◽

2019 ◽

Vol 83 (4) ◽

Cited By ~ 5

Author(s):

Frans Bianchi ◽

Joury S. van’t Klooster ◽

Stephanie J. Ruiz ◽

Bert Poolman

Keyword(s):

Saccharomyces Cerevisiae ◽

Amino Acids ◽

Amino Acid ◽

De Novo ◽

Amino Acid Transporters ◽


Growth And Survival ◽

Amino Acid Homeostasis ◽

Underlying Mechanisms ◽

Amino Acid Sensing

SUMMARY We review the mechanisms responsible for amino acid homeostasis in Saccharomyces cerevisiae and other fungi. Amino acid homeostasis is essential for cell growth and survival. Hence, the de novo synthesis reactions, metabolic conversions, and transport of amino acids are tightly regulated. Regulation varies from nitrogen pool sensing to control by individual amino acids and takes place at the gene (transcription), protein (posttranslational modification and allostery), and vesicle (trafficking and endocytosis) levels. The pools of amino acids are controlled via import, export, and compartmentalization. In yeast, the majority of the amino acid transporters belong to the APC (amino acid-polyamine-organocation) superfamily, and the proteins couple the uphill transport of amino acids to the electrochemical proton gradient. Although high-resolution structures of yeast amino acid transporters are not available, homology models have been successfully exploited to determine and engineer the catalytic and regulatory functions of the proteins. This has led to a further understanding of the underlying mechanisms of amino acid sensing and subsequent downregulation of transport. Advances in optical microscopy have revealed a new level of regulation of yeast amino acid transporters, which involves membrane domain partitioning. The significance and the interrelationships of the latest discoveries on amino acid homeostasis are put in context.


Readthrough Errors Purge Deleterious Cryptic Sequences, Facilitating the Birth of Coding Sequences

Molecular Biology and Evolution ◽

10.1093/molbev/msaa046 ◽

2020 ◽

Vol 37 (6) ◽

pp. 1761-1774 ◽

Cited By ~ 2

Author(s):

Luke J Kosinski ◽

Joanna Masel

Keyword(s):

Saccharomyces Cerevisiae ◽

De Novo ◽

Stop Codon ◽

Spillover Effects ◽

Structural Disorder ◽

Ribosome Profiling ◽

Noncoding Dna ◽

Protein Coding ◽

Selection Hypothesis ◽

Noncoding Sequences

Abstract De novo protein-coding innovations sometimes emerge from ancestrally noncoding DNA, despite the expectation that translating random sequences is overwhelmingly likely to be deleterious. The “preadapting selection” hypothesis claims that emergence is facilitated by prior, low-level translation of noncoding sequences via molecular errors. It predicts that selection on polypeptides translated only in error is strong enough to matter and is strongest when erroneous expression is high. To test this hypothesis, we examined noncoding sequences located downstream of stop codons (i.e., those potentially translated by readthrough errors) in Saccharomyces cerevisiae genes. We identified a class of “fragile” proteins under strong selection to reduce readthrough, which are unlikely substrates for co-option. Among the remainder, sequences showing evidence of readthrough translation, as assessed by ribosome profiling, encoded C-terminal extensions with higher intrinsic structural disorder, supporting the preadapting selection hypothesis. The cryptic sequences beyond the stop codon, rather than spillover effects from the regular C-termini, are primarily responsible for the higher disorder. Results are robust to controlling for the fact that stronger selection also reduces the length of C-terminal extensions. These findings indicate that selection acts on 3′ UTRs in Saccharomyces cerevisiae to purge potentially deleterious variants of cryptic polypeptides, acting more strongly in genes that experience more readthrough errors.


Differences between the de novo proteome and its non-functional precursor can result from neutral constraints on its birth process, not necessarily from natural selection alone

10.1101/289330 ◽

2018 ◽

Cited By ~ 4

Author(s):

Lou Nielly-Thibault ◽

Christian R Landry

Keyword(s):

Natural Selection ◽

De Novo ◽

Random Sequence ◽

Gc Content ◽

Intrinsic Disorder ◽

Raw Material ◽

Published Data ◽

Novel Proteins ◽

Evolutionary Forces ◽

The Mean

ABSTRACT Proteins are among the most important constituents of biological systems. Because all proteins ultimately evolved from previously non-coding DNA, the properties of these non-coding sequences and how they shape the birth of novel proteins are also expected to influence the organization of biological networks. When trying to explain and predict the properties of novel proteins, it is of particular importance to distinguish the contributions of natural selection and other evolutionary forces. Studies in the field typically use non-coding DNA and GC-content-based random-sequence models to generate random expectations for the properties of novel functional proteins. Deviations from these expectations have been interpreted as the result of natural selection. However, interpreting such deviations requires a yet-unattained understanding of the raw material of de novo gene birth and its relation to novel functional proteins. We mathematically show how the importance of the “junk” polypeptides that make up this raw material goes beyond their average properties and their filtering by natural selection. We find that the mean of any property among novel functional proteins also depends on its variance among junk polypeptides and its correlation with their rate of evolutionary turnover. In order to exemplify the use of our general theoretical results, we combine them with a simple model that predicts the means and variances of the properties of junk polypeptides from the genomic GC content alone. Under this model, we predict the effect of GC content on the mean length and mean intrinsic disorder of novel functional proteins as a function of evolutionary parameters. We use these predictions to formulate new evolutionary interpretations of published data on the length and intrinsic disorder of novel functional proteins. 
This work provides a theoretical framework that can serve as a guide for the prediction and interpretation of past and future results in the study of novel proteins and their properties under various evolutionary models. Our results provide the foundation for a better understanding of the properties of cellular networks through the evolutionary origin of their components.
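To give a feel for what a GC-content-based random-sequence model can predict, the following Python sketch computes the chance that a random codon is a stop codon and the resulting mean length of a random open reading frame. It is a textbook-style illustration and not the authors' model. It assumes independent nucleotides, equal A and T frequencies, equal G and C frequencies, and the standard stop codons TAA, TAG, and TGA; the function names are hypothetical.

```python
def stop_codon_probability(gc: float) -> float:
    """Probability that a random codon is TAA, TAG, or TGA.

    Assumption: nucleotides are independent, with P(G) = P(C) = gc / 2
    and P(A) = P(T) = (1 - gc) / 2.
    """
    if not 0.0 <= gc < 1.0:
        raise ValueError("gc must be in [0, 1)")
    at = (1.0 - gc) / 2.0   # probability of A (and, equally, of T)
    g = gc / 2.0            # probability of G
    p_taa = at * at * at
    p_tag = at * at * g
    p_tga = at * g * at
    return p_taa + p_tag + p_tga


def mean_codons_to_stop(gc: float) -> float:
    """Mean number of codons read up to and including the first stop.

    Under the independence assumption this is the mean of a geometric
    distribution with success probability stop_codon_probability(gc).
    """
    return 1.0 / stop_codon_probability(gc)


if __name__ == "__main__":
    for gc in (0.3, 0.4, 0.5, 0.6):
        print(f"GC={gc:.1f}  P(stop)={stop_codon_probability(gc):.4f}  "
              f"mean codons={mean_codons_to_stop(gc):.1f}")
```

Because all three stop codons are AT-rich, this toy model yields longer random reading frames at higher GC content, which is one simple way in which GC content alone can shift the expected length of junk polypeptides.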
