Applications of AlphaFold beyond Protein Structure Prediction

2021
Author(s):
Yuan Zhang
Peizhao Li
Feng Pan
Hongfu Liu
Pengyu Hong
...

The solution of the half-century-old protein structure prediction problem by DeepMind's AlphaFold is certainly one of the greatest breakthroughs in biology in the twenty-first century. This breakthrough paved the way for tackling some previously highly challenging or even infeasible problems in structural biology. In this study, we propose strategies to use AlphaFold to address several fundamental problems: (1) protein engineering by predicting the experimentally measured stability changes using the representations extracted from AlphaFold models; (2) estimating the designability of a given protein structure by combining a protein design method (e.g. ProDCoNN), sequential Monte Carlo, and AlphaFold, where the designability of a protein structure is defined as the number of sequences that encode that structure; (3) predicting protein stabilities using natural sequences and designed sequences as training data, and representations extracted from AlphaFold models as input features; and (4) understanding the sequence-structure relationship of proteins by computational mutagenesis and testing the foldability of the mutants with AlphaFold. We found that the representations extracted from AlphaFold models can be used to predict the experimentally measured stability changes accurately. For the first time, we have estimated the designability of a few real proteins. For example, the designability of chain A of FLT3 ligand (PDB ID: 1ETE) with 134 residues was estimated as 3.12 ± 2.14E85.

Author(s):
Jinbo Xu
Matthew Mcpartlon
Jin Li

We describe our latest study of deep convolutional residual neural networks (ResNet) for protein structure prediction, including deeper and wider ResNets, the efficacy of different input features, and improved 3D model building methods. Our ResNet can predict correct folds (TMscore>0.5) for 26 out of 32 CASP13 FM (template-free-modeling) targets and L/5 long-range contacts for these targets with precision over 80%, a significant improvement over the CASP13 results. Although co-evolution analysis plays an important role in the most successful structure prediction methods, we show that when co-evolution is not used, our ResNet can still predict correct folds for 18 of the 32 CASP13 FM targets, including several large ones. This marks a significant improvement over the top co-evolution-based, non-deep learning methods at CASP13, and over other non-coevolution-based deep learning models, such as the popular recurrent geometric network (RGN). With only the primary sequence, our ResNet can also predict correct folds for all 21 human-designed proteins we tested. In contrast, RGN predicts correct folds for only 3 human-designed proteins and for none of the CASP13 FM targets. In addition, we find that ResNet may fare better on the human-designed proteins when trained without co-evolution information than with it. These results suggest that ResNet does not simply denoise co-evolution signals, but instead is able to learn important sequence-structure relationships from experimental structures. This has important implications for protein design and engineering, especially when evolutionary information is not available.
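The "L/5 long-range contacts" precision quoted above can be illustrated with a minimal sketch. It assumes the common CASP-style convention that a long-range pair has a sequence separation of at least 24 residues and that the top L/5 highest-scoring predicted pairs are evaluated, where L is the sequence length. The function name, the nested-list inputs, and the toy data are hypothetical and are not taken from the authors' code.

```python
def top_l5_long_range_precision(pred_prob, true_contact, min_sep=24):
    """Precision of the top L/5 predicted long-range contacts.

    pred_prob:    L x L nested list of predicted contact probabilities.
    true_contact: L x L nested list of booleans (assumed given, e.g. from
                  a distance cutoff applied to the experimental structure).
    min_sep:      minimum sequence separation for "long-range" (assumption: 24).
    """
    length = len(pred_prob)
    # Collect each residue pair (i < j) once, keeping only long-range pairs.
    pairs = [
        (pred_prob[i][j], i, j)
        for i in range(length)
        for j in range(i + min_sep, length)
    ]
    # Rank by predicted probability, highest first.
    pairs.sort(key=lambda t: t[0], reverse=True)
    top = pairs[: max(1, length // 5)]
    if not top:
        return 0.0
    hits = sum(1 for _, i, j in top if true_contact[i][j])
    return hits / len(top)


# Toy example: a 30-residue chain, so the top 30 // 5 = 6 pairs are scored.
L = 30
pred = [[0.0] * L for _ in range(L)]
truth = [[False] * L for _ in range(L)]
scored = {(0, 24): 0.9, (0, 25): 0.8, (1, 25): 0.7,
          (1, 26): 0.6, (2, 27): 0.5, (3, 28): 0.4}
for (i, j), p in scored.items():
    pred[i][j] = p
pred[0][5] = 0.99  # short-range pair, ignored by the metric
for i, j in [(0, 24), (0, 25), (1, 25)]:
    truth[i][j] = True

precision = top_l5_long_range_precision(pred, truth)
print(precision)
```

In the toy data, three of the six top-ranked long-range pairs are true contacts, and the high-scoring short-range pair does not affect the result.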


2019
Author(s):
Rebecca F. Alford
Patrick J. Fleming
Karen G. Fleming
Jeffrey J. Gray

Abstract: Protein design is a powerful tool for elucidating mechanisms of function and engineering new therapeutics and nanotechnologies. While soluble protein design has advanced, membrane protein design remains challenging due to difficulties in modeling the lipid bilayer. In this work, we developed an implicit approach that captures the anisotropic structure, shape of water-filled pores, and nanoscale dimensions of membranes with different lipid compositions. The model improves performance in computational benchmarks against experimental targets including prediction of protein orientations in the bilayer, ΔΔG calculations, native structure discrimination, and native sequence recovery. When applied to de novo protein design, this approach designs sequences with an amino acid distribution near the native amino acid distribution in membrane proteins, overcoming a critical flaw in previous membrane models that were prone to generating leucine-rich designs. Further, the proteins designed in the new membrane model exhibit native-like features including interfacial aromatic side chains, hydrophobic lengths compatible with bilayer thickness, and polar pores. Our method advances high-resolution membrane protein structure prediction and design toward tackling key biological questions and engineering challenges.

Significance Statement: Membrane proteins participate in many life processes including transport, signaling, and catalysis. They constitute over 30% of all proteins and are targets for over 60% of pharmaceuticals. Computational design tools for membrane proteins will transform the interrogation of basic science questions such as membrane protein thermodynamics and the pipeline for engineering new therapeutics and nanotechnologies. Existing tools are either too expensive to compute or rely on manual design strategies. In this work, we developed a fast and accurate method for membrane protein design. The tool is available to the public and will accelerate the experimental design pipeline for membrane proteins.


2019
Author(s):
Matthew Conover
Max Staples
Dong Si
Miao Sun
Renzhi Cao

Abstract: Quality Assessment (QA) plays an important role in protein structure prediction. Traditional protein QA methods rely on searching databases or comparing against other models to make predictions, and these approaches often fail. We propose a novel protein single-model QA method built on a new representation that converts raw atom information into a series of carbon-alpha (Cα) atoms with side-chain information, defined by their dihedral angles and bond lengths to the prior residue. An LSTM network is used to predict the quality by treating each amino acid as a time-step and considering the final value returned by the LSTM cells. To the best of our knowledge, this is the first time anyone has attempted to use an LSTM model on the QA problem; furthermore, we use a new representation which has not been studied for QA. In addition to angles, we make use of sequence properties such as secondary structure at each time-step, without using any database. Our model achieves an overall correlation of 0.651 on the CASP12 testing dataset. Our experiment points out new directions for the QA problem, and our method could be widely used for the protein structure prediction problem. The software is freely available at GitHub: https://github.com/caorenzhi/AngularQA
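The angle-and-length representation described above can be illustrated with a minimal sketch: for each Cα atom, compute the distance to the previous Cα and the signed pseudo-dihedral angle defined by four consecutive Cα atoms. This is only one plausible reading of the abstract, not the AngularQA implementation; the function names, the tuple layout, and the toy coordinates are hypothetical, and side-chain information is omitted.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    # atan2 form: numerically stable and gives the sign directly.
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def ca_trace_features(ca_coords):
    """Per-residue (distance to prior C-alpha, pseudo-dihedral) features.

    Entries are None where too few preceding residues exist.
    """
    features = []
    for i, p in enumerate(ca_coords):
        dist = math.dist(p, ca_coords[i - 1]) if i >= 1 else None
        angle = dihedral_deg(*ca_coords[i - 3:i + 1]) if i >= 3 else None
        features.append((dist, angle))
    return features


# Toy coordinates (arbitrary units, not a real protein).
coords = [(1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)]
features = ca_trace_features(coords)
print(features)
```

Each tuple would correspond to one time-step of a sequence model such as the LSTM mentioned in the abstract; per-residue sequence properties would be appended to it in a fuller pipeline.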


Author(s):
Lewis Moffat
Joe G. Greener
David T. Jones

Abstract: The prediction of protein structure and the design of novel protein sequences and structures have long been intertwined. The recently released AlphaFold has heralded a new generation of accurate protein structure prediction, but the extent to which this affects protein design remains unexplored. Here we develop a rapid and effective approach for fixed-backbone computational protein design, leveraging the predictive power of AlphaFold. For several designs we demonstrate that not only are the AlphaFold predicted structures in agreement with the desired backbones, but they are also supported by the structure predictions of other supervised methods as well as ab initio folding. These results suggest that AlphaFold, and methods like it, are able to facilitate the development of a new range of novel and accurate protein design methodologies.

