CATS (Coordinates of Atoms by Taylor Series): protein design with backbone flexibility in all locally feasible directions

Positive multistate protein design

Bioinformatics, 2019, Vol 36 (1), pp. 122-130. DOI: 10.1093/bioinformatics/btz497

Author(s): Jelena Vucinic, David Simoncini, Manon Ruffini, Sophie Barbe, Thomas Schiex

Keyword(s): Protein Design, Critical Role, Average Energy, Computational Design, Amino Acid Sequences, Supplementary Information, Backbone Flexibility, Identify Amino Acid, Design Software, Protein Redesign

Abstract

Motivation: Structure-based computational protein design (CPD) plays a critical role in advancing the field of protein engineering. Using an all-atom energy function, CPD tries to identify amino acid sequences that fold into a target structure and ultimately perform a desired function. The usual approach considers a single rigid backbone as a target, which ignores backbone flexibility. Multistate design (MSD) instead considers several backbone states simultaneously, which defines challenging computational problems.

Results: We introduce efficient reductions of positive MSD problems to Cost Function Networks with two different fitness definitions and implement them in the Pompd (Positive Multistate Protein design) software. Pompd is able to identify guaranteed optimal sequences of positive multistate full protein redesign problems and exhaustively enumerate suboptimal sequences close to the MSD optimum. Applied to nuclear magnetic resonance and back-rubbed X-ray structures, we observe that the average energy fitness provides the best sequence recovery. Our method outperforms state-of-the-art guaranteed computational design approaches by orders of magnitude and can solve MSD problems of sizes previously unreachable with guaranteed algorithms.

Availability and implementation: https://forgemia.inra.fr/thomas.schiex/pompd, as documented open-source software.

Supplementary information: Supplementary data are available at Bioinformatics online.
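
To make the "average energy" fitness mentioned in this abstract concrete, here is a minimal Python sketch. It is not Pompd's method: Pompd reduces the problem to Cost Function Networks, whereas this sketch simply enumerates every sequence. The energy tables, the two-letter alphabet, and all function names are hypothetical, and the toy energy has only per-position terms (no pairwise terms, no side-chain conformations).

```python
from itertools import product
from statistics import mean

# Hypothetical toy energies: one table per backbone state, one dict per
# design position, mapping amino acid -> energy on that state.
STATES = [
    [{"A": -1.0, "L": -2.0}, {"A": -0.5, "L": 1.0}],
    [{"A": -1.5, "L": 0.5}, {"A": -1.0, "L": -0.5}],
]


def state_energy(sequence, state):
    """Energy of a sequence on one backbone state (sum of per-position terms)."""
    return sum(position[aa] for position, aa in zip(state, sequence))


def average_energy_fitness(sequence, states):
    """Fitness of a sequence: its energy averaged over all backbone states."""
    return mean(state_energy(sequence, state) for state in states)


def best_sequence(states, alphabet):
    """Brute-force search for the sequence with the lowest average energy."""
    n_positions = len(states[0])
    return min(
        product(alphabet, repeat=n_positions),
        key=lambda seq: average_energy_fitness(seq, states),
    )


multistate_best = best_sequence(STATES, "AL")
single_state_best = best_sequence(STATES[:1], "AL")
print(multistate_best, single_state_best)
```

In this toy data the sequence that is optimal on the first backbone alone differs from the one that is optimal on average over both backbones, which is the situation multistate design is meant to handle.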

Computational protein design with backbone plasticity

Biochemical Society Transactions, 2016, Vol 44 (5), pp. 1523-1529. DOI: 10.1042/bst20160155. Cited by: ~13

Author(s): James T. MacDonald, Paul S. Freemont

Keyword(s): Protein Design, De Novo, Protein Structures, Search Space, Computational Protein Design, Artificial Enzymes, Backbone Flexibility, Artificial Proteins, Naturally Occurring, Backbone Structure

The computational algorithms used in the design of artificial proteins have become increasingly sophisticated in recent years, producing a series of remarkable successes. The most dramatic of these is the de novo design of artificial enzymes. The majority of these designs have reused naturally occurring protein structures as ‘scaffolds’ onto which novel functionality can be grafted without having to redesign the backbone structure. The incorporation of backbone flexibility into protein design is a much more computationally challenging problem due to the greatly increased search space, but promises to remove the limitations of reusing natural protein scaffolds. In this review, we outline the principles of computational protein design methods and discuss recent efforts to consider backbone plasticity in the design process.

Backbone flexibility in computational protein design

Current Opinion in Biotechnology, 2009, Vol 20 (4), pp. 420-428. DOI: 10.1016/j.copbio.2009.07.006. Cited by: ~79

Author(s): Daniel J Mandell, Tanja Kortemme

Keyword(s): Protein Design, Computational Protein Design, Backbone Flexibility

Amino-acid site variability among natural and designed proteins

2013. DOI: 10.7287/peerj.preprints.74v1

Author(s): Eleisha L. Jackson, Noah Ollikainen, Arthur W. Covert III, Tanja Kortemme, Claus O. Wilke

Keyword(s): Amino Acid, Protein Design, Protein Sequences, Structural Constraints, Scoring Functions, Solvent Exposure, Backbone Flexibility, Hydrophobic Residues, Designed Proteins, Site Variability

Computational protein design attempts to create protein sequences that fold stably into pre-specified structures. Here we compare alignments of designed proteins to alignments of natural proteins and assess how closely designed sequences recapitulate patterns of sequence variation found in natural protein sequences. We design proteins using RosettaDesign, and we evaluate both fixed-backbone designs and variable-backbone designs with different amounts of backbone flexibility. We find that proteins designed with a fixed backbone tend to underestimate the amount of site variability observed in natural proteins while proteins designed with an intermediate amount of backbone flexibility result in more realistic site variability. Further, the correlation between solvent exposure and site variability in designed proteins is lower than that in natural proteins. This finding suggests that site variability is too uniform across different solvent exposure states (i.e., buried residues are too variable or exposed residues too conserved). When comparing the amino acid frequencies in the designed proteins with those in natural proteins we find that in the designed proteins hydrophobic residues are underrepresented in the core. From these results we conclude that intermediate backbone flexibility during design results in more accurate protein design and that either scoring functions or backbone sampling methods require further improvement to accurately replicate structural constraints on site variability.
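
The comparison described in this abstract can be illustrated with a small Python sketch. The abstract does not say how site variability is measured, so the sketch assumes Shannon entropy per alignment column and a Pearson correlation against relative solvent accessibility. The alignment, the accessibility values, and the function names are all hypothetical.

```python
from collections import Counter
from math import log, sqrt


def site_entropy(column):
    """Shannon entropy (in nats) of the amino acids in one alignment column."""
    total = len(column)
    return -sum((n / total) * log(n / total) for n in Counter(column).values())


def alignment_entropies(alignment):
    """Per-site entropy for a list of equal-length aligned sequences."""
    return [site_entropy(column) for column in zip(*alignment)]


def pearson(xs, ys):
    """Pearson correlation; assumes neither input is constant."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical alignment of four sequences and per-site solvent exposure.
alignment = ["ALKE", "ALRD", "AIKE", "ALQK"]
solvent_exposure = [0.05, 0.10, 0.60, 0.80]

entropies = alignment_entropies(alignment)
correlation = pearson(solvent_exposure, entropies)
print(entropies, correlation)
```

Running the same calculation on an alignment of designed sequences and on an alignment of natural homologs would give two correlations to compare, which is the kind of contrast the abstract reports.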

Overcoming the Key Challenges in De Novo Protein Design: Enhancing Computational Efficiency and Incorporating True Backbone Flexibility

Applied Optimization - Mathematical Modelling of Biosystems, 2008, pp. 133-183. DOI: 10.1007/978-3-540-76784-8_4. Cited by: ~1

Author(s): Christodoulos A. Floudas, Ho Ki Fung, Dimitrios Morikis, Martin S. Taylor, Li Zhang

Keyword(s): Protein Design, Computational Efficiency, De Novo, Backbone Flexibility, De Novo Protein Design

Coupling backbone flexibility and amino acid sequence selection in protein design

Protein Science, 1997, Vol 6 (8), pp. 1701-1707. DOI: 10.1002/pro.5560060810. Cited by: ~77

Author(s): Alyce Su, Stephen L. Mayo

Keyword(s): Amino Acid, Amino Acid Sequence, Protein Design, Backbone Flexibility, Sequence Selection

Comparison of Rosetta flexible-backbone computational protein design methods on binding interactions

2019. DOI: 10.1101/674291

Author(s): Amanda L. Loshbaugh, Tanja Kortemme

Keyword(s): Protein Design, Computational Design, Design Methods, Backbone Flexibility, Functional Protein, Binding Interactions, Sequence Design, Sequence Profiles, Sampling Trajectory, Powerful Strategy

Abstract: Computational design of binding sites in proteins remains difficult, in part due to limitations in our current ability to sample backbone conformations that enable precise and accurate geometric positioning of side chains during sequence design. Here we present a benchmark framework for comparison between flexible-backbone design methods applied to binding interactions. We quantify the ability of different flexible-backbone design methods in the widely used protein design software Rosetta to recapitulate observed protein sequence profiles assumed to represent functional protein/protein and protein/small molecule binding interactions. The CoupledMoves method, which combines backbone flexibility and sequence exploration into a single acceptance step during the sampling trajectory, better recapitulates observed sequence profiles than the BackrubEnsemble and FastDesign methods, which handle backbone flexibility and sequence design in separate acceptance steps. Flexible-backbone design with the CoupledMoves method is a powerful strategy for reducing sequence space to generate targeted libraries for experimental screening and selection.
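
The difference between a single acceptance step and separate acceptance steps can be sketched with a toy Metropolis sampler in Python. This is an illustration only: it uses no Rosetta code, it is not how CoupledMoves, BackrubEnsemble, or FastDesign are implemented, and the energy function, the deterministic "moves", and every name below are hypothetical.

```python
import math
import random


def metropolis_accept(delta_e, kT, rng):
    """Metropolis criterion: always accept downhill, sometimes accept uphill."""
    return delta_e <= 0 or rng.random() < math.exp(-delta_e / kT)


def coupled_step(state, energy, move_backbone, move_sequence, kT, rng):
    """Change backbone and sequence together, then accept or reject both at once."""
    backbone, sequence = state
    trial = (move_backbone(backbone), move_sequence(sequence))
    if metropolis_accept(energy(*trial) - energy(*state), kT, rng):
        return trial
    return state


def separate_steps(state, energy, move_backbone, move_sequence, kT, rng):
    """Accept or reject a backbone move first, then a sequence move."""
    backbone, sequence = state
    trial = (move_backbone(backbone), sequence)
    if metropolis_accept(energy(*trial) - energy(*state), kT, rng):
        state = trial
    backbone, sequence = state
    trial = (backbone, move_sequence(sequence))
    if metropolis_accept(energy(*trial) - energy(*state), kT, rng):
        state = trial
    return state


def toy_energy(backbone, sequence):
    """Hypothetical energy: a mismatch between backbone and sequence is penalized."""
    return (backbone - sequence) ** 2 - 0.1 * sequence


def shift(x):
    """Deterministic stand-in for a random backbone or sequence proposal."""
    return x + 1


start = (0, 0)
coupled = coupled_step(start, toy_energy, shift, shift, kT=0.01, rng=random.Random(0))
separate = separate_steps(start, toy_energy, shift, shift, kT=0.01, rng=random.Random(0))
print(coupled, separate)
```

In this toy landscape each individual move is uphill and is rejected at low temperature, while the combined move is downhill and is accepted. That is one intuition for why evaluating backbone and sequence changes together can reach states that alternating acceptance steps miss.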

Dead-end elimination with perturbations (DEEPer): A provable protein design algorithm with continuous sidechain and backbone flexibility

Proteins: Structure, Function, and Bioinformatics, 2012, Vol 81 (1), pp. 18-39. DOI: 10.1002/prot.24150. Cited by: ~67

Author(s): Mark A. Hallen, Daniel A. Keedy, Bruce R. Donald

Keyword(s): Protein Design, Backbone Flexibility, Design Algorithm, Dead End

PLUG (Pruning of Local Unrealistic Geometries) removes restrictions on biophysical modeling for protein design

2018. DOI: 10.1101/368522

Author(s): Mark A. Hallen

Keyword(s): Protein Design, Conformational Space, Biophysical Model, Biophysical Modeling, Backbone Flexibility, Energy Functions, Design Algorithm, Dead End, Pruning Algorithms, Design Calculations

Abstract: Protein design algorithms must search an enormous conformational space to identify favorable conformations. As a result, those that perform this search with guarantees of accuracy generally start with a conformational pruning step, such as dead-end elimination (DEE). However, the mathematical assumptions of DEE-based pruning algorithms have until now severely restricted the biophysical model that can feasibly be used in protein design. To lift these restrictions, I propose to prune local unrealistic geometries (PLUG) using a linear programming-based method. PLUG’s biophysical model consists only of well-known lower bounds on interatomic distances. PLUG is intended as pre-processing for energy-based protein design calculations, whose biophysical model need not support DEE pruning. Based on 96 test cases, PLUG is at least as effective at pruning as DEE for larger protein designs, the type that most require pruning. When combined with the LUTE protein design algorithm, PLUG greatly facilitates designs that account for continuous entropy, large multistate designs with continuous flexibility, and designs with extensive continuous backbone flexibility and advanced non-pairwise energy functions. Many of these designs are tractable only with PLUG, either for empirical reasons (LUTE’s machine learning step achieves an accurate fit only after PLUG pruning) or for theoretical reasons (many energy functions are fundamentally incompatible with DEE).