Protein Stability Prediction: A Poisson-Boltzmann Approach

Engineering proteins to have desired properties by mutating amino acids at specific sites is commonplace. Such engineered proteins must be stable to function. Experimental methods for determining stability are too laborious at the throughput needed to scan the protein sequence space thoroughly. To this end, many machine-learning-based methods have been developed to predict thermodynamic stability changes upon mutation. These methods have been evaluated for symmetric consistency by testing with hypothetical reverse mutations. In this work, we propose transitive data augmentation, an evaluation of transitive consistency, and a new machine-learning-based method, the first of its kind, that incorporates both symmetric and transitive properties into the architecture. Our method, called SCONES, is an interpretable neural network that estimates a residue's contribution towards protein stability (ΔG) in its local structural environment. The difference between the independently predicted contributions of the reference and mutant residues in a missense mutation is reported as ΔΔG. We show that this self-consistent machine learning architecture is immune to many common biases in datasets, relies less on data than existing methods, and is robust to overfitting.
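
The consistency properties described above follow directly from reporting the stability change as a difference of two independently predicted contributions. The following is a minimal Python sketch of that idea, not the SCONES implementation: the `contribution` function, the lookup table, its values, and the sign convention (mutant minus reference) are all assumptions made for illustration.

```python
# Toy per-residue stability contributions (kcal/mol) in one fixed structural
# environment. Values are illustrative placeholders, not SCONES outputs.
CONTRIBUTION = {"ALA": -0.5, "LEU": -1.5, "GLY": 0.25}


def contribution(residue: str) -> float:
    """Hypothetical stand-in for a model that scores one residue in its environment."""
    return CONTRIBUTION[residue]


def predicted_change(reference: str, mutant: str) -> float:
    """Stability change as the difference of two independent predictions.

    The sign convention (mutant minus reference) is an assumption.
    """
    return contribution(mutant) - contribution(reference)


forward = predicted_change("ALA", "LEU")
reverse = predicted_change("LEU", "ALA")
via_gly = predicted_change("ALA", "GLY") + predicted_change("GLY", "LEU")

# Symmetric consistency: the reverse mutation has the opposite sign.
assert forward == -reverse
# Transitive consistency: ALA->GLY followed by GLY->LEU equals ALA->LEU.
assert abs(via_gly - forward) < 1e-12
```

Because both properties hold for any choice of `contribution`, they are built into the architecture rather than learned from augmented data.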

Protein stability engineering insights revealed by domain-wide comprehensive mutagenesis

Proceedings of the National Academy of Sciences ◽ 10.1073/pnas.1903888116 ◽ 2019 ◽ Vol 116 (33) ◽ pp. 16367-16377 ◽ Cited By ~ 22

Author(s): Alex Nisthal ◽ Connie Y. Wang ◽ Marie L. Ary ◽ Stephen L. Mayo

Keyword(s): Protein Stability ◽ Amino Acid Type ◽ Stability Prediction ◽ Single Mutant ◽ Prediction Tools ◽ Prediction Algorithms ◽ Automated Method ◽ Acid Type ◽ Stability Data ◽ Neutral Effect

The accurate prediction of protein stability upon sequence mutation is an important but unsolved challenge in protein engineering. Large mutational datasets are required to train computational predictors, but traditional methods for collecting stability data are either low-throughput or measure protein stability indirectly. Here, we develop an automated method to generate thermodynamic stability data for nearly every single mutant in a small 56-residue protein. Analysis reveals that most single mutants have a neutral effect on stability, mutational sensitivity is largely governed by residue burial, and unexpectedly, hydrophobics are the best tolerated amino acid type. Correlating the output of various stability-prediction algorithms against our data shows that nearly all perform better on boundary and surface positions than for those in the core and are better at predicting large-to-small mutations than small-to-large ones. We show that the most stable variants in the single-mutant landscape are better identified using combinations of 2 prediction algorithms and including more algorithms can provide diminishing returns. In most cases, poor in silico predictions were tied to compositional differences between the data being analyzed and the datasets used to train the algorithm. Finally, we find that strategies to extract stabilities from high-throughput fitness data such as deep mutational scanning are promising and that data produced by these methods may be applicable toward training future stability-prediction tools.
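
The abstract does not say how the two prediction algorithms were combined. As one plausible illustration, the Python sketch below averages the z-scores of two predictors and ranks variants by that consensus. The function names, the variant labels, the scores, and the convention that a higher score means more stabilizing are all assumptions, not details taken from the paper.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize scores so predictors on different scales are comparable.

    Assumes the scores are not all identical (non-zero standard deviation).
    """
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def consensus_rank(variants, scores_a, scores_b):
    """Rank variants by the mean z-score of two predictors, best first.

    Assumes a higher score means more stabilizing for both predictors.
    """
    combined = [(a + b) / 2 for a, b in zip(zscores(scores_a), zscores(scores_b))]
    order = sorted(range(len(variants)), key=lambda i: combined[i], reverse=True)
    return [variants[i] for i in order]


# Made-up variant labels and scores, for illustration only.
variants = ["v1", "v2", "v3", "v4"]
predictor_a = [0.2, 1.0, -0.5, 0.1]
predictor_b = [10.0, 30.0, -5.0, 40.0]
ranking = consensus_rank(variants, predictor_a, predictor_b)
```

Standardizing first keeps a predictor with a wide numeric range from dominating the average; rank averaging would be a reasonable alternative.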

Symmetry Principles in Optimization Problems: an application to Protein Stability Prediction★

IFAC-PapersOnLine ◽ 10.1016/j.ifacol.2015.05.068 ◽ 2015 ◽ Vol 48 (1) ◽ pp. 458-463 ◽ Cited By ~ 12

Author(s): Fabrizio Pucci ◽ Katrien Bernaerts ◽ Fabian Teheux ◽ Dimitri Gilis ◽ Marianne Rooman

Keyword(s): Protein Stability ◽ Optimization Problems ◽ Stability Prediction ◽ Symmetry Principles

MAESTROweb: a web server for structure-based protein stability prediction

Bioinformatics ◽ 10.1093/bioinformatics/btv769 ◽ 2016 ◽ Vol 32 (9) ◽ pp. 1414-1416 ◽ Cited By ~ 36

Author(s): Josef Laimer ◽ Julia Hiebl-Flach ◽ Daniel Lengauer ◽ Peter Lackner

Keyword(s): Protein Stability ◽ Web Server ◽ Stability Prediction

Modeling the Influence of Salt on the Hydrophobic Effect and Protein Fold Stability

Communications in Computational Physics ◽ 10.4208/cicp.290711.121011s ◽ 2013 ◽ Vol 13 (1) ◽ pp. 90-106 ◽ Cited By ~ 17

Author(s): Mihir S. Date ◽ Brian N. Dominy

Keyword(s): Protein Stability ◽ Hydrophobic Effect ◽ Quantitative Agreement ◽ Cold Shock Protein ◽ Protein Fold ◽ NaCl Concentration ◽ Dependent Protein ◽ Poisson Boltzmann ◽ Fold Stability ◽ The Impact

Salt influences protein stability through electrostatic mechanisms as well as through nonpolar Hofmeister effects. In the present work, a continuum solvation based model is developed to explore the impact of salt on protein stability. This model relies on a traditional Poisson-Boltzmann (PB) term to describe the polar or electrostatic effects of salt, and a surface area dependent term containing a salt concentration dependent microscopic surface tension function to capture the non-polar Hofmeister effects. The model is first validated against a series of cold-shock protein variants whose salt-dependent protein fold stability profiles have been previously determined experimentally. The approach is then applied to HIV-1 protease in order to explain an experimentally observed enhancement in stability and activity at high (1 M) NaCl concentration. The inclusion of the salt-dependent non-polar term brings the model into quantitative agreement with experiment, and provides the basis for further studies into the impact of ionic strength on protein structure, function, and evolution.
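
The two-term structure of the model described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the abstract does not give the form of the surface tension function, so a linear dependence on salt concentration is assumed here, the PB term is treated as a number supplied by an external solver, and all function names, parameters, and values are hypothetical.

```python
def surface_tension(salt_molar, gamma0, slope):
    """Assumed linear form: gamma(c) = gamma0 + slope * c, in kcal/mol/Å^2."""
    return gamma0 + slope * salt_molar


def solvation_free_energy(dg_pb, sasa, salt_molar, gamma0, slope):
    """Polar PB term plus salt-dependent nonpolar surface-area term (kcal/mol).

    dg_pb is assumed to come from a separate Poisson-Boltzmann calculation.
    sasa is the solvent-accessible surface area in Å^2.
    """
    return dg_pb + surface_tension(salt_molar, gamma0, slope) * sasa


def nonpolar_salt_shift(sasa_folded, sasa_unfolded, salt_molar, slope):
    """Salt-induced change in the nonpolar part of the folding free energy,
    relative to zero salt, under the linear surface tension assumption."""
    return slope * salt_molar * (sasa_folded - sasa_unfolded)


# Placeholder inputs chosen only to show the sign of the effect.
shift = nonpolar_salt_shift(
    sasa_folded=5000.0, sasa_unfolded=9000.0, salt_molar=1.0, slope=0.001
)
```

With a positive slope and a folded state that buries surface area, the shift is negative, so under these assumptions the nonpolar term favors the folded state more strongly as salt concentration rises.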

Improved mutation tagging with gene identifiers applied to membrane protein stability prediction

BMC Bioinformatics ◽ 10.1186/1471-2105-10-s8-s3 ◽ 2009 ◽ Vol 10 (Suppl 8) ◽ pp. S3 ◽ Cited By ~ 10

Author(s): Rainer Winnenburg ◽ Conrad Plake ◽ Michael Schroeder

Keyword(s): Membrane Protein ◽ Protein Stability ◽ Stability Prediction ◽ Membrane Protein Stability

Grading amino acid properties increased accuracies of single point mutation on protein stability prediction

BMC Bioinformatics ◽ 10.1186/1471-2105-13-44 ◽ 2012 ◽ Vol 13 (1) ◽ Cited By ~ 6

Author(s): Jianguo Liu ◽ Xianjiang Kang

Keyword(s): Amino Acid ◽ Protein Stability ◽ Point Mutation ◽ Single Point ◽ Single Point Mutation ◽ Stability Prediction ◽ Acid Properties ◽ Amino Acid Properties

A Two Stage Model on Prediction of Protein Stability Changes in Case of Uncertainty using Fuzzy K-Means Clustering and Fuzzy Artificial Neural Networks

International Journal of Recent Technology and Engineering - 2 ◽ 10.35940/ijrte.b1123.0782s319 ◽ 2019 ◽ Vol 8 (2S3) ◽ pp. 666-671

Keyword(s): Protein Stability ◽ Basic Research ◽ Industrial Applications ◽ Stage Model ◽ Prediction System ◽ Stability Prediction ◽ Protein Thermostability ◽ Accuracy Rate ◽ Artificial Neural

In both industrial applications and basic research, manipulating protein stability is essential for understanding the principles that govern protein thermostability. This has made stability prediction a hotspot in data-mining-based protein engineering. Many studies address the prediction of protein stability, but they fall short in data preprocessing, in dealing with duplicate records in the dataset, and in handling the uncertainty present in the data. The main aim of this paper is to improve the quality of the protein stability dataset and to increase the accuracy rate of the prediction system. For deduplication, fuzzy K-means (FKM) based clustering is applied to cluster and match duplicate records and remove them. To handle uncertainty, a Fuzzy Artificial Neural Network (FANN) is used to predict protein stability. Simulation results demonstrate the efficiency of FKM-FANN, which yields excellent results compared with existing methods.
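
The clustering step used for deduplication can be illustrated with a generic fuzzy K-means (fuzzy c-means) routine. The Python sketch below is a textbook version on one-dimensional data, not the paper's implementation: the feature values, the initial centers, the fuzzifier `m`, and the function names are assumptions chosen for illustration.

```python
def memberships(point, centers, m=2.0):
    """Fuzzy membership of a 1-D point in each cluster (fuzzifier m > 1)."""
    dists = [abs(point - c) for c in centers]
    if 0.0 in dists:
        # Point sits exactly on a center: assign it fully to that cluster.
        hit = dists.index(0.0)
        return [1.0 if k == hit else 0.0 for k in range(len(centers))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_k / d_j) ** power for d_j in dists) for d_k in dists]


def fuzzy_kmeans(points, centers, m=2.0, iterations=20):
    """Alternate membership and center updates for a fixed number of steps."""
    centers = list(centers)
    for _ in range(iterations):
        u = [memberships(p, centers, m) for p in points]
        centers = [
            sum(u[i][k] ** m * points[i] for i in range(len(points)))
            / sum(u[i][k] ** m for i in range(len(points)))
            for k in range(len(centers))
        ]
    return centers, [memberships(p, centers, m) for p in points]


# Made-up 1-D feature values; near-identical values mimic duplicate records.
records = [1.00, 1.01, 0.99, 5.00, 5.02]
centers, u = fuzzy_kmeans(records, centers=[0.0, 6.0])
# Hard label = cluster with the highest membership.
labels = [row.index(max(row)) for row in u]
```

Records that share a cluster with high membership are candidates for duplicate matching; a real pipeline would cluster on several features and compare records within each cluster before removing any.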

Protein stability engineering insights revealed by domain-wide comprehensive mutagenesis

10.1101/484949 ◽ 2018 ◽ Cited By ~ 3

Author(s): Alex Nisthal ◽ Connie Y. Wang ◽ Marie L. Ary ◽ Stephen L. Mayo

Keyword(s): Amino Acid ◽ Protein Stability ◽ High Throughput ◽ Thermodynamic Stability ◽ Amino Acid Type ◽ Stability Prediction ◽ Single Mutant ◽ Prediction Algorithms ◽ Automated Method ◽ Stability Data

The accurate prediction of protein stability upon sequence mutation is an important but unsolved challenge in protein engineering. Large mutational datasets are required to train computational predictors, but traditional methods for collecting stability data are either low-throughput or measure protein stability indirectly. Here, we develop an automated method to generate thermodynamic stability data for nearly every single mutant in a small 56-residue protein. Analysis reveals that most single mutants have a neutral effect on stability, mutational sensitivity is largely governed by residue burial, and unexpectedly, hydrophobics are the best tolerated amino acid type. Correlating the output of various stability prediction algorithms against our data shows that nearly all perform better on boundary and surface positions than for those in the core, and are better at predicting large to small mutations than small to large ones. We show that the most stable variants in the single mutant landscape are better identified using combinations of two prediction algorithms, and that including more algorithms can provide diminishing returns. In most cases, poor in silico predictions were tied to compositional differences between the data being analyzed and the datasets used to train the algorithm. Finally, we find that strategies to extract stabilities from high-throughput fitness data such as deep mutational scanning are promising and that data produced by these methods may be applicable toward training future stability prediction tools.

Significance Statement: Using liquid-handling automation, we constructed and measured the thermodynamic stability of almost every single mutant of protein G (Gβ1), a small domain. This self-consistent dataset is the largest of its kind and offers unique opportunities on two fronts: (i) insight into protein domain properties such as positional sensitivity and incorporated amino acid tolerance, and (ii) service as a validation set for future efforts in protein stability prediction. As Gβ1 is a model system for protein folding and design, and its single mutant landscape has been measured by deep mutational scanning, we expect our dataset to serve as a reference for studies aimed at extracting stability information from fitness data or developing novel high-throughput stability assays.
