Function-guided protein design by deep manifold sampling

A wide variety of protein and peptidomimetic design tasks require matching functional three-dimensional motifs to potential oligomeric scaffolds. Enzyme design, for example, aims to graft active-site patterns, typically consisting of 3 to 15 residues, onto new protein surfaces. Identifying proteins capable of scaffolding such an active-site graft requires costly searches for protein folds that position side chains correctly to host the desired active site. Other biodesign tasks that require simpler, fast, exact geometric searches of potential side chain positioning include mimicking binding hotspots, designing metal binding clusters, and designing modular hydrogen binding networks for specificity. In these applications the speed and scaling of geometric search limit downstream design to small patterns. Here we present an adaptive algorithm for searching for side chain take-off angles compatible with an arbitrarily specified functional pattern, which offers substantive performance improvements over previous methods. We demonstrate this method in both genetically encoded (protein) and synthetic (peptidomimetic) design scenarios. Examples of using this method with the Rosetta framework for protein design are provided, but our implementation is compatible with multiple protein design frameworks and is freely available as a set of Python scripts (https://github.com/JiangTian/adaptive-geometric-search-for-protein-design).

Amino-acid site variability among natural and designed proteins

10.7287/peerj.preprints.74 ◽ 2013

Author(s): Eleisha L. Jackson ◽ Noah Ollikainen ◽ Arthur W. Covert III ◽ Tanja Kortemme ◽ Claus O. Wilke

Keyword(s): Amino Acid ◽ Protein Design ◽ Protein Sequences ◽ Structural Constraints ◽ Scoring Functions ◽ Solvent Exposure ◽ Backbone Flexibility ◽ Hydrophobic Residues ◽ Designed Proteins ◽ Site Variability

Computational protein design attempts to create protein sequences that fold stably into pre-specified structures. Here we compare alignments of designed proteins to alignments of natural proteins and assess how closely designed sequences recapitulate patterns of sequence variation found in natural protein sequences. We design proteins using RosettaDesign, and we evaluate both fixed-backbone designs and variable-backbone designs with different amounts of backbone flexibility. We find that proteins designed with a fixed backbone tend to underestimate the amount of site variability observed in natural proteins while proteins designed with an intermediate amount of backbone flexibility result in more realistic site variability. Further, the correlation between solvent exposure and site variability in designed proteins is lower than that in natural proteins. This finding suggests that site variability is too uniform across different solvent exposure states (i.e., buried residues are too variable or exposed residues too conserved). When comparing the amino acid frequencies in the designed proteins with those in natural proteins we find that in the designed proteins hydrophobic residues are underrepresented in the core. From these results we conclude that intermediate backbone flexibility during design results in more accurate protein design and that either scoring functions or backbone sampling methods require further improvement to accurately replicate structural constraints on site variability.
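
As a rough illustration of the comparison described in this abstract, the sketch below computes a per-site variability score for an alignment and correlates it with solvent exposure. It assumes Shannon entropy as the variability measure and Pearson correlation as the statistic; the abstract does not state which measures the authors used, and the toy alignment, the exposure values, and all function names are hypothetical.

```python
import math
from collections import Counter


def site_entropy(column):
    """Shannon entropy (in nats) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def site_variability(alignment):
    """Per-site entropy for a list of equal-length aligned sequences."""
    return [site_entropy(col) for col in zip(*alignment)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical toy data: four aligned sequences and a made-up relative
# solvent exposure value for each of the four sites.
alignment = ["ACDE", "ACDF", "ACEG", "ATEH"]
exposure = [0.05, 0.20, 0.50, 0.90]

variability = site_variability(alignment)
r = pearson(variability, exposure)
```

Running the same two functions on a natural alignment and on a designed alignment of the same fold would give two correlation values that can be compared directly, which is the kind of contrast the abstract reports.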

Test time augmentation by regular shifting for deep denoising autoencoder networks

10.1109/ijcnn52387.2021.9534044 ◽ 2021

Author(s): Jose A. Rodriguez-Rodriguez ◽ Miguel A. Molina-Cabello ◽ Rafaela Benitez-Rochel ◽ Ezequiel Lopez-Rubio

Keyword(s): Test Time ◽ Denoising Autoencoder

Unevolved De Novo Proteins Have Innate Tendencies to Bind Transition Metals

Life ◽ 10.3390/life9010008 ◽ 2019 ◽ Vol 9 (1) ◽ pp. 8 ◽ Cited By ~ 4

Author(s): Michael S. Wang ◽ Kenric J. Hoegler ◽ Michael H. Hecht

Keyword(s): Amino Acid ◽ Transition Metals ◽ Metal Binding ◽ Combinatorial Library ◽ De Novo ◽ Protein Sequences ◽ Amino Acid Sequences ◽ Ancestral Sequences ◽ Wide Range ◽ Catalytic Functions

Life as we know it would not exist without the ability of protein sequences to bind metal ions. Transition metals, in particular, play essential roles in a wide range of structural and catalytic functions. The ubiquitous occurrence of metalloproteins in all organisms leads one to ask whether metal binding is an evolved trait that occurred only rarely in ancestral sequences, or alternatively, whether it is an innate property of amino acid sequences, occurring frequently in unevolved sequence space. To address this question, we studied 52 proteins from a combinatorial library of novel sequences designed to fold into 4-helix bundles. Although these sequences were neither designed nor evolved to bind metals, the majority of them have innate tendencies to bind the transition metals copper, cobalt, and zinc with high nanomolar to low-micromolar affinity.

Tight and specific lanthanide binding in a de novo TIM barrel with a large internal cavity designed by symmetric domain fusion

Proceedings of the National Academy of Sciences ◽ 10.1073/pnas.2008535117 ◽ 2020 ◽ Vol 117 (48) ◽ pp. 30362-30369

Author(s): Shane J. Caldwell ◽ Ian C. Haydon ◽ Nikoletta Piperidou ◽ Po-Ssu Huang ◽ Matthew J. Bick ◽ ...

Keyword(s): Protein Design ◽ Metal Binding ◽ De Novo ◽ Enzymatic Reaction ◽ Symmetric Domain ◽ Lanthanide Ions ◽ Internal Cavity ◽ Tim Barrel ◽ Reaction Chambers ◽ Modular Platform

De novo protein design has succeeded in generating a large variety of globular proteins, but the construction of protein scaffolds with cavities that could accommodate large signaling molecules, cofactors, and substrates remains an outstanding challenge. The long, often flexible loops that form such cavities in many natural proteins are difficult to precisely program and thus challenging for computational protein design. Here we describe an alternative approach to this problem. We fused two stable proteins with C2 symmetry—a de novo designed dimeric ferredoxin fold and a de novo designed TIM barrel—such that their symmetry axes are aligned to create scaffolds with large cavities that can serve as binding pockets or enzymatic reaction chambers. The crystal structures of two such designs confirm the presence of a 420 cubic Ångström chamber defined by the top of the designed TIM barrel and the bottom of the ferredoxin dimer. We functionalized the scaffold by installing a metal-binding site consisting of four glutamate residues close to the symmetry axis. The protein binds lanthanide ions with very high affinity as demonstrated by tryptophan-enhanced terbium luminescence. This approach can be extended to other metals and cofactors, making this scaffold a modular platform for the design of binding proteins and biocatalysts.

Using Protein Design To Dissect the Effect of Charged Residues on Metal Binding and Protein Stability

Biochemistry ◽ 10.1021/bi052508q ◽ 2006 ◽ Vol 45 (18) ◽ pp. 5848-5856 ◽ Cited By ~ 12

Author(s): Anna Wilkins Maniccia ◽ Wei Yang ◽ Shun-yi Li ◽ Julian A. Johnson ◽ Jenny J. Yang

Keyword(s): Protein Stability ◽ Protein Design ◽ Metal Binding ◽ Charged Residues

Secure machine learning against adversarial samples at test time

EURASIP Journal on Information Security ◽ 10.1186/s13635-021-00125-2 ◽ 2022 ◽ Vol 2022 (1)

Author(s): Jing Lin ◽ Laurent L. Njilla ◽ Kaiqi Xiong

Keyword(s): Machine Learning ◽ Parallel Implementation ◽ Standard Test ◽ Test Time ◽ Test Accuracy ◽ Learning Approaches ◽ Complex Models ◽ Projected Gradient Descent ◽ Adversarial Example ◽ Human Eyes

Deep neural networks (DNNs) are widely used to handle many difficult tasks, such as image classification and malware detection, and achieve outstanding performance. However, recent studies on adversarial examples, in which malicious perturbations that are indistinguishable to human eyes are added to the original samples to mislead machine learning approaches, show that machine learning models are vulnerable to security attacks. Though various adversarial retraining techniques have been developed in the past few years, none of them is scalable. In this paper, we propose a new iterative adversarial retraining approach to robustify the model and to reduce the effectiveness of adversarial inputs on DNN models. The proposed method retrains the model with both Gaussian noise augmentation and adversarial generation techniques for better generalization. Furthermore, an ensemble model is used during the testing phase to increase the robust test accuracy. The results from our extensive experiments demonstrate that the proposed approach increases the robustness of the DNN model against various adversarial attacks, specifically the fast gradient sign attack, the Carlini and Wagner (C&W) attack, the Projected Gradient Descent (PGD) attack, and the DeepFool attack. To be precise, the robust classifier obtained by our proposed approach can maintain a performance accuracy of 99% on average on the standard test set. Moreover, we empirically evaluate the runtime of two of the most effective adversarial attacks, i.e., the C&W attack and the BIM attack, and find that the C&W attack can utilize the GPU for faster adversarial example generation than the BIM attack can. For this reason, we further develop a parallel implementation of the proposed approach. This parallel implementation makes the proposed approach scalable for large datasets and complex models.
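
To make two of the ingredients named in this abstract concrete, here is a minimal sketch of a fast gradient sign perturbation and of Gaussian noise augmentation. It is not the authors' implementation: it assumes a tiny logistic model in place of a DNN so that the input gradient can be written by hand, and the weights, inputs, and function names are all hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability of class 1 under a logistic model (stand-in for a DNN)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(g):
    """Return -1, 0, or 1 according to the sign of g."""
    return (g > 0) - (g < 0)


def fgsm(w, b, x, y, eps):
    """Fast gradient sign perturbation of input x with true label y (0 or 1).

    For the logistic (cross-entropy) loss, d(loss)/dx_i = (p - y) * w_i,
    so each feature moves by eps in the direction that increases the loss.
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


def gaussian_augment(x, sigma, rng):
    """Add zero-mean Gaussian noise to every feature of x."""
    return [xi + rng.gauss(0.0, sigma) for xi in x]


# Hypothetical model and sample, chosen only for illustration.
w, b = [2.0, -1.0], 0.0
x, y = [0.3, 0.1], 1

x_adv = fgsm(w, b, x, y, eps=0.5)
x_noisy = gaussian_augment(x, sigma=0.1, rng=random.Random(0))
```

In a retraining loop of the kind the abstract describes, samples such as `x_adv` and `x_noisy` would be added to the training set with their original labels before the model is fitted again; that loop and the test-time ensemble are omitted here.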

Using AlphaFold for Rapid and Accurate Fixed Backbone Protein Design

10.1101/2021.08.24.457549 ◽ 2021 ◽ Cited By ~ 1

Author(s): Lewis Moffat ◽ Joe G. Greener ◽ David T. Jones

Keyword(s): Protein Structure ◽ Ab Initio ◽ Protein Structure Prediction ◽ Protein Design ◽ Structure Prediction ◽ Predictive Power ◽ Protein Sequences ◽ Supervised Methods ◽ New Generation ◽ Novel Protein

The prediction of protein structure and the design of novel protein sequences and structures have long been intertwined. The recently released AlphaFold has heralded a new generation of accurate protein structure prediction, but the extent to which this affects protein design stands yet unexplored. Here we develop a rapid and effective approach for fixed backbone computational protein design, leveraging the predictive power of AlphaFold. For several designs we demonstrate that not only are the AlphaFold predicted structures in agreement with the desired backbones, but they are also supported by the structure predictions of other supervised methods as well as ab initio folding. These results suggest that AlphaFold, and methods like it, are able to facilitate the development of a new range of novel and accurate protein design methodologies.

Protein design: novel metal-binding sites

Trends in Biochemical Sciences ◽ 10.1016/s0968-0004(00)89044-1 ◽ 1995 ◽ Vol 20 (7) ◽ pp. 280-285 ◽ Cited By ~ 104

Author(s): Lynne Regan

Keyword(s): Protein Design ◽ Metal Binding ◽ Binding Sites ◽ Metal Binding Sites
