Heterogeneity of the GFP fitness landscape and data-driven protein design

2021 ◽  
Author(s):  
Louisa Gonzalez Somermeyer ◽  
Aubin Fleiss ◽  
Alexander S Mishin ◽  
Nina G Bozhanova ◽  
Anna A. Igolkina ◽  
...  

Studies of protein fitness landscapes reveal biophysical constraints guiding protein evolution and empower prediction of functional proteins. However, generalisation of these findings is limited by the scarcity of systematic data on fitness landscapes of proteins with a defined evolutionary relationship. We characterized the fitness peaks of four orthologous fluorescent proteins with a broad range of sequence divergence. While two of the four studied fitness peaks were sharp, the other two were considerably flatter, being almost entirely free of epistatic interactions. Counterintuitively, mutationally robust proteins, characterized by a flat fitness peak, were not optimal templates for machine-learning-driven protein design – instead, predictions were more accurate for fragile proteins with epistatic landscapes. Our work offers insights for the practical application of fitness landscape heterogeneity in protein engineering.

2018 ◽  
Author(s):  
Christelle Fraïsse ◽  
John J. Welch

Abstract: Fitness interactions between mutations can influence a population’s evolution in many different ways. While epistatic effects are difficult to measure precisely, important information about the overall distribution is captured by the mean and variance of log fitnesses for individuals carrying different numbers of mutations. We derive predictions for these quantities from simple fitness landscapes, based on models of optimizing selection on quantitative traits. We also explore extensions to the models, including modular pleiotropy, variable effect sizes, mutational bias, and maladaptation of the wild-type. We illustrate our approach by reanalysing a large data set of mutant effects in a yeast snoRNA. Though characterized by some strong epistatic interactions, these data give a good overall fit to the non-epistatic null model, suggesting that epistasis might have little effect on the evolutionary dynamics in this system. We also show how the amount of epistasis depends on both the underlying fitness landscape and the distribution of mutations, and so it is expected to vary in consistent ways between new mutations, standing variation, and fixed mutations.


2017 ◽  
Author(s):  
Daniel M. Weinreich ◽  
Yinghong Lan ◽  
Jacob Jaffe ◽  
Robert B. Heckendorn

Abstract: The effect of a mutation on the organism often depends on what other mutations are already present in its genome. Geneticists refer to such mutational interactions as epistasis. Pairwise epistatic effects have been recognized for over a century, and their evolutionary implications have received theoretical attention for nearly as long. However, pairwise epistatic interactions themselves can vary with genomic background. This is called higher-order epistasis, and its consequences for evolution are much less well understood. Here, we assess the influence that higher-order epistasis has on the topography of 16 published, biological fitness landscapes. We find that, on average, the effect of epistatic interactions on the fitness landscape declines with their order, and suggest that notable exceptions to this trend may deserve experimental scrutiny. We explore whether natural selection may have contributed to this finding, and conclude by highlighting opportunities for further work dissecting the influence that epistasis of all orders has on the efficiency of natural selection.
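The distinction drawn above between pairwise and higher-order epistasis can be made concrete with a small sketch. It assumes a biallelic landscape stored as a dictionary of log fitnesses. The function name `pairwise_epistasis` and the toy values are hypothetical and are not taken from the paper or from any of the 16 landscapes it analyses.

```python
from itertools import product

def pairwise_epistasis(f, i, j, background):
    """Pairwise epistasis between loci i and j on a fixed background.

    f maps genotype tuples of 0/1 to log fitness; the result is
    f(11) - f(10) - f(01) + f(00) with all other loci held as in `background`.
    """
    def with_alleles(a, b):
        g = list(background)
        g[i], g[j] = a, b
        return f[tuple(g)]
    return (with_alleles(1, 1) - with_alleles(1, 0)
            - with_alleles(0, 1) + with_alleles(0, 0))

# Hypothetical 3-locus log-fitness landscape (toy values, assumption).
landscape = {g: 0.0 for g in product((0, 1), repeat=3)}
landscape.update({
    (1, 0, 0): 0.1, (0, 1, 0): 0.2, (1, 1, 0): 0.5,
    (0, 0, 1): 0.0, (1, 0, 1): 0.1, (0, 1, 1): 0.2, (1, 1, 1): 0.1,
})

# Same pair of loci, measured with locus 2 absent and then present.
eps_bg0 = pairwise_epistasis(landscape, 0, 1, (0, 0, 0))
eps_bg1 = pairwise_epistasis(landscape, 0, 1, (0, 0, 1))

# A non-zero difference means the pairwise term depends on the background,
# which is what the abstract calls higher-order (here third-order) epistasis.
third_order = eps_bg1 - eps_bg0
```

In this toy landscape the interaction between loci 0 and 1 is positive without the third mutation and negative with it, so the third-order term is non-zero.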


2016 ◽  
Author(s):  
Claudia Bank ◽  
Sebastian Matuszewski ◽  
Ryan T. Hietpas ◽  
Jeffrey D. Jensen

Abstract: The study of fitness landscapes, which aims at mapping genotypes to fitness, is receiving ever-increasing attention. Novel experimental approaches combined with NGS methods enable accurate and extensive studies of the fitness effects of mutations – allowing us to test theoretical predictions and improve our understanding of the shape of the true underlying fitness landscape, and its implications for the predictability and repeatability of evolution. Here, we present a uniquely large multi-allelic fitness landscape comprised of 640 engineered mutants that represent all possible combinations of 13 amino-acid changing mutations at six sites in the heat-shock protein Hsp90 in Saccharomyces cerevisiae under elevated salinity. Despite a prevalent pattern of negative epistasis in the landscape, we find that the global fitness peak is reached via four positively epistatic mutations. Combining traditional and extending recently proposed theoretical and statistical approaches, we quantify features of the global multi-allelic fitness landscape. Using subsets of the data, we demonstrate that extrapolation beyond a known part of the landscape is difficult owing to both local ruggedness and amino-acid specific epistatic hotspots, and that inference is additionally confounded by the non-random choice of mutations for experimental fitness landscapes.

Author Summary: The study of fitness landscapes is fundamentally concerned with understanding the relative roles of stochastic and deterministic processes in adaptive evolution. Here, the authors present a uniquely large and complete multi-allelic intragenic fitness landscape of 640 systematically engineered mutations in yeast Hsp90. Using a combination of traditional and recently proposed theoretical approaches, they study the accessibility of the global fitness peak, and the potential for predictability of the fitness landscape topography. They report local ruggedness of the landscape and the existence of epistatic hotspot mutations, which together make extrapolation and hence predictability inherently difficult, if mutation-specific information is not considered.


Author(s):  
Nina Vyatkina

Data-Driven Learning (DDL), or a corpus-based method of language teaching and learning, has been developing rapidly since the turn of the century and has been shown to be effective and efficient. Nevertheless, DDL is still not widely used in regular classrooms for a number of reasons. One of them is that few workable pedagogical frameworks have been suggested for integrating DDL into language courses and curricula. This chapter describes an exemplar of a practical application of such a pedagogical framework to a high-intermediate university-level German as a foreign language course with a significant DDL component. The Design-Based Research approach is used as the main methodological framework. The chapter concludes with a discussion of wider pedagogical implications.


2016 ◽  
Vol 113 (11) ◽  
pp. E1470-E1478 ◽  
Author(s):  
João V. Rodrigues ◽  
Shimon Bershtein ◽  
Anna Li ◽  
Elena R. Lozovsky ◽  
Daniel L. Hartl ◽  
...  

Fitness landscapes of drug resistance constitute powerful tools to elucidate mutational pathways of antibiotic escape. Here, we developed a predictive biophysics-based fitness landscape of trimethoprim (TMP) resistance for Escherichia coli dihydrofolate reductase (DHFR). We investigated the activity, binding, folding stability, and intracellular abundance for a complete set of combinatorial DHFR mutants constructed from three key resistance mutations and extended this analysis to DHFR originating from Chlamydia muridarum and Listeria grayi. We found that the acquisition of TMP resistance via decreased drug affinity is limited by a trade-off in catalytic efficiency. Protein stability is concurrently affected by the resistant mutants, which precludes a precise description of fitness from a single molecular trait. Application of the kinetic flux theory provided an accurate model to predict resistance phenotypes (IC50) quantitatively from a unique combination of the in vitro protein molecular properties. Further, we found that a controlled modulation of the GroEL/ES chaperonins and Lon protease levels affects the intracellular steady-state concentration of DHFR in a mutation-specific manner, whereas IC50 is changed proportionally, as indeed predicted by the model. This unveils a molecular rationale for the pleiotropic role of the protein quality control machinery on the evolution of antibiotic resistance, which, as we illustrate here, may drastically confound the evolutionary outcome. These results provide a comprehensive quantitative genotype–phenotype map for the essential enzyme that serves as an important target of antibiotic and anticancer therapies.


1997 ◽  
Vol 5 (3) ◽  
pp. 241-275 ◽  
Author(s):  
Peter F. Stadler ◽  
Günter P. Wagner

A new mathematical representation is proposed for the configuration space structure induced by recombination, which we call “P-structure.” It consists of a mapping of pairs of objects to the power set of all objects in the search space. The mapping assigns to each pair of parental “genotypes” the set of all recombinant genotypes obtainable from the parental ones. It is shown that this construction allows a Fourier decomposition of fitness landscapes into a superposition of “elementary landscapes.” This decomposition is analogous to the Fourier decomposition of fitness landscapes on mutation spaces. The elementary landscapes are obtained as eigenfunctions of a Laplacian operator defined for P-structures. For binary string recombination, the elementary landscapes are exactly the p-spin functions (Walsh functions), that is, the same as the elementary landscapes of the string point mutation spaces (i.e., the hypercube). This supports the notion of a strong homomorphism between string mutation and recombination spaces. However, the effective nearest neighbor correlations on these elementary landscapes differ between mutation and recombination and among different recombination operators. On average, the nearest neighbor correlation is higher for one-point recombination than for uniform recombination. For one-point recombination, the correlations are higher for elementary landscapes with fewer interacting sites as well as for sites that have closer linkage, confirming the qualitative predictions of the Schema Theorem. We conclude that the algebraic approach to fitness landscape analysis can be extended to recombination spaces and provides an effective way to analyze the relative hardness of a landscape for a given recombination operator.
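The Walsh-function decomposition mentioned above can be sketched for the simpler mutation-space case (the hypercube). This is a minimal illustration under stated assumptions: it does not construct the P-structure or its Laplacian, and the helper names `walsh_coefficients` and `power_by_order` and the toy landscape are hypothetical.

```python
from itertools import product

def walsh_coefficients(f, n):
    """Walsh coefficients of a landscape f defined on binary strings of length n.

    f maps tuples of 0/1 to fitness. Each coefficient is indexed by a tuple s
    of 0/1 marking which sites take part in that elementary landscape.
    """
    genotypes = list(product((0, 1), repeat=n))
    coeffs = {}
    for s in genotypes:
        total = 0.0
        for g in genotypes:
            # Walsh function value: (-1) raised to the overlap of s and g.
            sign = (-1) ** sum(si * gi for si, gi in zip(s, g))
            total += sign * f[g]
        coeffs[s] = total / len(genotypes)
    return coeffs

def power_by_order(coeffs):
    """Sum of squared coefficients grouped by number of interacting sites."""
    power = {}
    for s, c in coeffs.items():
        order = sum(s)
        power[order] = power.get(order, 0.0) + c * c
    return power

# Hypothetical purely additive 2-site landscape: f(g) = 1.0*g0 + 2.0*g1.
toy = {g: 1.0 * g[0] + 2.0 * g[1] for g in product((0, 1), repeat=2)}
coeffs = walsh_coefficients(toy, 2)
power = power_by_order(coeffs)
```

Because the toy landscape is additive, the coefficient for the two-site term vanishes and all of the non-constant signal sits at order one. A landscape with interacting sites would put weight on the higher orders.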


2019 ◽  
Vol 15 (4) ◽  
pp. 20180881 ◽  
Author(s):  
Christelle Fraïsse ◽  
John J. Welch

Fitness interactions between mutations can influence a population’s evolution in many different ways. While epistatic effects are difficult to measure precisely, important information is captured by the mean and variance of log fitnesses for individuals carrying different numbers of mutations. We derive predictions for these quantities from a class of simple fitness landscapes, based on models of optimizing selection on quantitative traits. We also explore extensions to the models, including modular pleiotropy, variable effect sizes, mutational bias and maladaptation of the wild type. We illustrate our approach by reanalysing a large dataset of mutant effects in a yeast snoRNA (small nucleolar RNA). Though characterized by some large epistatic effects, these data give a good overall fit to the non-epistatic null model, suggesting that epistasis might have limited influence on the evolutionary dynamics in this system. We also show how the amount of epistasis depends on both the underlying fitness landscape and the distribution of mutations, and so is expected to vary in consistent ways between new mutations, standing variation and fixed mutations.
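The summary statistics named above, the mean and variance of log fitness for individuals carrying a given number of mutations, can be computed with a few lines of code. This is only a sketch of the bookkeeping, not the authors' analysis. The function name `log_fitness_moments`, the `(n_mutations, fitness)` record layout, and the toy values are assumptions.

```python
import math
from collections import defaultdict
from statistics import mean, pvariance

def log_fitness_moments(records):
    """Group (n_mutations, fitness) records by mutation count.

    Returns {n_mutations: (mean log fitness, population variance of log fitness)}.
    Fitness values must be strictly positive so that the logarithm is defined.
    """
    by_count = defaultdict(list)
    for n_mut, fitness in records:
        by_count[n_mut].append(math.log(fitness))
    return {k: (mean(v), pvariance(v)) for k, v in sorted(by_count.items())}

# Hypothetical measurements: wild type, two single mutants, two double mutants.
records = [(0, 1.0), (1, 0.9), (1, 0.8), (2, 0.7), (2, 0.5)]
moments = log_fitness_moments(records)
```

Comparing how these two quantities change with the number of mutations against the predictions of a chosen landscape model is the kind of fit the abstract describes. Which model to compare against is left open here.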


2017 ◽  
Author(s):  
Manasi A. Pethe ◽  
Aliza B. Rubenstein ◽  
Dmitri Zorine ◽  
Sagar D. Khare

Biophysical interactions between proteins and peptides are key determinants of genotype-fitness landscapes, but an understanding of how molecular structure and residue-level energetics at protein-peptide interfaces shape functional landscapes remains elusive. Combining information from yeast-based library screening, next-generation sequencing and structure-based modeling, we report comprehensive sequence-energetics-function mapping of the specificity landscape of the Hepatitis C Virus (HCV) NS3/4A protease, whose function — site-specific cleavages of the viral polyprotein — is a key determinant of viral fitness. We elucidate the cleavability of 3.2 million substrate variants by the HCV protease and find extensive clustering of cleavable and uncleavable motifs in sequence space indicating mutational robustness, and thereby providing a plausible molecular mechanism to buffer the effects of low replicative fidelity of this RNA virus. Specificity landscapes of known drug-resistant variants are similarly clustered. Our results highlight the key and constraining role of molecular-level energetics in shaping plateau-like fitness landscapes from quasispecies theory.

