AlphaDesign: A de novo protein design framework based on AlphaFold

2021 ◽  
Author(s):  
Michael Jendrusch ◽  
Jan O. Korbel ◽  
S. Kashif Sadiq

De novo protein design is a longstanding fundamental goal of synthetic biology, but it has been hindered by the difficulty of reliably predicting accurate high-resolution protein structures from sequence. Recent advances in the accuracy of protein structure prediction methods, such as AlphaFold (AF), have facilitated proteome-scale structural predictions of monomeric proteins. Here we develop AlphaDesign, a computational framework for de novo protein design that embeds AF as an oracle within an optimisable design process. Our framework enables rapid prediction of completely novel protein monomers starting from random sequences. These are shown to adopt a diverse array of folds within the known protein space. A recent and unexpected utility of AF, the prediction of protein complex structures, further allows our framework to design higher-order complexes. Subsequently, a range of predictions is made for monomers, homodimers, heterodimers, and higher-order homo-oligomers from trimers to hexamers. Our analyses also show potential for designing proteins that bind to a pre-specified target protein. The structural integrity of predicted structures is validated by standard ab initio folding and structural analysis methods, and more extensively by rigorous all-atom molecular dynamics simulations with analysis of the corresponding structural flexibility and of intramonomer and interfacial amino-acid contacts. These analyses demonstrate widespread maintenance of structural integrity and suggest that our framework allows for fairly accurate protein design. Strikingly, our approach also reveals the capacity of AF to predict proteins that switch conformation upon complex formation, such as switches from α-helices to β-sheets during amyloid filament formation. Correspondingly, when AF is integrated into our design framework, the approach yields de novo designs of a subset of proteins that switch conformation between monomeric and oligomeric states.

2018 ◽  
Author(s):  
Jianfu Zhou ◽  
Alexandra E. Panaitiu ◽  
Gevorg Grigoryan

The ability to routinely design functional proteins, in a targeted manner, would have enormous implications for biomedical research and therapeutic development. Computational protein design (CPD) offers the potential to fulfill this need, and though recent years have brought considerable progress in the field, major limitations remain. Current state-of-the-art approaches to CPD aim to capture the determinants of structure from physical principles. While this has led to many successful designs, it has strong limitations associated with inaccuracies in physical modeling, such that a robust general solution to CPD has yet to be found. Here we propose a fundamentally novel design framework, one based on identifying and applying patterns of sequence-structure compatibility found in known proteins, rather than approximating them from models of inter-atomic interactions. Specifically, we systematically decompose the target structure to be designed into structural building blocks we call TERMs (tertiary motifs) and use rapid structure search against the Protein Data Bank (PDB) to identify sequence patterns associated with each TERM from known protein structures that contain it. These results are then combined to produce a sequence-level pseudo-energy model that can score any sequence for compatibility with the target structure. This model can then be used to extract the optimal-scoring sequence via combinatorial optimization or otherwise sample the sequence space predicted to be well compatible with folding to the target.
Here we carry out extensive computational analyses, showing that our method, which we dub dTERMen (design with TERM energies): 1) produces native-like sequences given native crystallographic or NMR backbones, 2) produces sequence-structure compatibility scores that correlate with thermodynamic stability, 3) is able to predict experimental success of designed sequences generated with other methods, and 4) designs sequences that are found to fold to the desired target by structure prediction more frequently than sequences designed with an atomistic method. As an experimental validation of dTERMen, we perform a total surface redesign of Red Fluorescent Protein mCherry, marking a total of 64 residues as variable. The single sequence identified as optimal by dTERMen harbors 48 mutations relative to mCherry, but nevertheless folds, is monomeric in solution, exhibits stability to chemical denaturation similar to that of mCherry, and even preserves the fluorescence property. Our results strongly argue that the PDB is now sufficiently large to enable proteins to be designed by using only examples of structural motifs from unrelated proteins. This is highly significant, given that the structural database will only continue to grow, and signals the possibility of a whole host of novel data-driven CPD methods. Because such methods are likely to have orthogonal strengths relative to existing techniques, they could represent an important step towards removing remaining barriers to robust CPD.
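
As a rough illustration of what a sequence-level pseudo-energy model could look like, the sketch below scores a sequence as a sum of per-position and pairwise terms and then finds the lowest-scoring sequence by brute-force enumeration. The decomposition into self and pair terms, the two-letter alphabet, and every numeric value are assumptions made for this example only; they are not taken from dTERMen, and exhaustive enumeration merely stands in for a real combinatorial optimizer.

```python
from itertools import product

# Hypothetical toy alphabet and parameters (NOT dTERMen values).
ALPHABET = "AL"

# self_energy[i][aa]: assumed per-position contribution of amino acid aa.
self_energy = [
    {"A": 0.0, "L": -1.0},
    {"A": -0.5, "L": 0.2},
    {"A": 0.3, "L": -0.4},
]

# pair_energy[(i, j)][(aa_i, aa_j)]: assumed coupling between two positions.
# Pairs of amino acids that are not listed contribute 0.
pair_energy = {
    (0, 2): {("L", "L"): -0.8, ("A", "A"): 0.1},
}


def score(seq):
    """Return the pseudo-energy of seq (lower means more compatible)."""
    total = sum(self_energy[i][aa] for i, aa in enumerate(seq))
    for (i, j), table in pair_energy.items():
        total += table.get((seq[i], seq[j]), 0.0)
    return total


def best_sequence():
    """Enumerate every sequence and return the one with the lowest score."""
    candidates = ("".join(p) for p in product(ALPHABET, repeat=len(self_energy)))
    return min(candidates, key=score)


if __name__ == "__main__":
    winner = best_sequence()
    print(winner, round(score(winner), 3))
```

Enumeration is only feasible for a toy case like this one; with 20 amino acids and dozens of variable positions the search space is far too large, which is why the abstract refers to combinatorial optimization or sampling instead.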


Author(s):  
Hyeonuk Woo ◽  
Sang-Jun Park ◽  
Yeol Kyo Choi ◽  
Taeyong Park ◽  
Maham Tanveer ◽  
...  

This technical study describes all-atom modeling and simulation of a fully-glycosylated full-length SARS-CoV-2 spike (S) protein in a viral membrane. First, starting from PDB:6VSB and 6VXX, full-length S protein structures were modeled using template-based modeling, de-novo protein structure prediction, and loop modeling techniques in the GALAXY modeling suite. Then, using the recently-determined most occupied glycoforms, 22 N-glycans and 1 O-glycan of each monomer were modeled using Glycan Reader & Modeler in CHARMM-GUI. These fully-glycosylated full-length S protein model structures were assessed and further refined against the low-resolution data in their respective experimental maps using ISOLDE. We then used CHARMM-GUI Membrane Builder to place the S proteins in a viral membrane and performed all-atom molecular dynamics simulations. All structures are available in the CHARMM-GUI COVID-19 Archive (http://www.charmm-gui.org/docs/archive/covid19), so researchers can use these models to carry out innovative and novel modeling and simulation research for the prevention and treatment of COVID-19.


Author(s):  
Ivan Anishchenko ◽  
Tamuka M. Chidyausiku ◽  
Sergey Ovchinnikov ◽  
Samuel J. Pellock ◽  
David Baker

There has been considerable recent progress in protein structure prediction using deep neural networks to infer distance constraints from amino acid residue co-evolution (refs. 1–3). We investigated whether the information captured by such networks is sufficiently rich to generate new folded proteins with sequences unrelated to those of the naturally occurring proteins used in training the models. We generated random amino acid sequences and input them into the trRosetta structure prediction network to predict starting distance maps, which as expected are quite featureless. We then carried out Monte Carlo sampling in amino acid sequence space, optimizing the contrast (KL-divergence) between the distance distributions predicted by the network and the background distribution. Optimization from different random starting points resulted in a wide range of proteins with diverse sequences and all-alpha, all-beta sheet, and mixed alpha-beta structures. We obtained synthetic genes encoding 129 of these network-hallucinated sequences, expressed and purified the proteins in E. coli, and found that 27 folded to monomeric stable structures with circular dichroism spectra consistent with the hallucinated structures. Thus deep networks trained to predict native protein structures from their sequences can be inverted to design new proteins, and such networks and methods should contribute, alongside traditional physically based models, to the de novo design of proteins with new functions.
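
The sampling procedure described above can be sketched in a few lines: start from a random sequence, propose single-residue changes, and accept them with a Metropolis rule that favours a larger KL-divergence from a background distribution. In this sketch `toy_predict` is a hypothetical placeholder that stands in for a structure-prediction network (it is not trRosetta and has no physical meaning), and the uniform background, bin count, temperature, and step count are all assumed values chosen only to keep the example self-contained.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def toy_predict(seq, bins=4):
    """Hypothetical stand-in for a network: map a sequence to a distribution
    over `bins` distance bins using simple residue counts."""
    counts = [1.0] * bins
    for aa in seq:
        counts[AMINO_ACIDS.index(aa) % bins] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def hallucinate(length=20, steps=500, temperature=0.05, seed=0):
    """Metropolis Monte Carlo in sequence space, maximising the KL-divergence
    between the toy prediction and a uniform background (assumed)."""
    rng = random.Random(seed)
    background = [0.25] * 4
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    current = kl_divergence(toy_predict(seq), background)
    for _ in range(steps):
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)  # propose a point mutation
        proposed = kl_divergence(toy_predict(seq), background)
        delta = proposed - current
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current = proposed
        else:
            seq[pos] = old  # reject: restore the previous residue
    return "".join(seq), current


if __name__ == "__main__":
    sequence, contrast = hallucinate()
    print(sequence, round(contrast, 4))
```

Passing a different `seed` gives a different random starting sequence, which loosely mirrors the abstract's point that optimization from different starting points yields diverse results.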


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Shin Irumagawa ◽  
Kaito Kobayashi ◽  
Yutaka Saito ◽  
Takeshi Miyata ◽  
Mitsuo Umetsu ◽  
...  

The stability of proteins is an important factor for industrial and medical applications. Improving protein stability is one of the main subjects in protein engineering. In a previous study, we improved the stability of a four-helix bundle dimeric de novo protein (WA20) by five mutations. The stabilised mutant (H26L/G28S/N34L/V71L/E78L, SUWA) showed an extremely high denaturation midpoint temperature (Tm). Although SUWA is a remarkably hyperstable protein, in protein design and engineering, it is an attractive challenge to rationally explore more stable mutants. In this study, we predicted stabilising mutations of WA20 by in silico saturation mutagenesis and molecular dynamics simulation, and experimentally confirmed three stabilising mutations of WA20 (N22A, N22E, and H86K). The stability of a double mutant (N22A/H86K, rationally optimised WA20, ROWA) was greatly improved compared with WA20 (ΔTm = 10.6 °C). The model structures suggested that N22A enhances the stability of the α-helices, and that N22E and H86K contribute to salt-bridge formation for protein stabilisation. These mutations were also added to SUWA and improved its Tm. Remarkably, the most stable mutant of SUWA (N22E/H86K, rationally optimised SUWA, ROSA) showed the highest Tm (129.0 °C). These new thermostable mutants will be useful as a component of protein nanobuilding blocks to construct supramolecular protein complexes.
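
The enumeration step of in silico saturation mutagenesis, generating every single-residue substitution and labelling it in the same style as N22A or H86K, can be sketched as follows. The three-residue input is a hypothetical toy sequence, not WA20, and the sketch covers only the enumeration; how each mutant is then scored for stability is not specified here and is left out.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def saturation_mutants(sequence, first_index=1):
    """Yield (label, mutant_sequence) for every single-residue substitution.

    Labels follow the wild-type / position / substitute convention,
    e.g. "N2A" means N at position 2 replaced by A.
    """
    for offset, wild_type in enumerate(sequence):
        for substitute in AMINO_ACIDS:
            if substitute == wild_type:
                continue  # skip the identity "mutation"
            label = f"{wild_type}{offset + first_index}{substitute}"
            mutant = sequence[:offset] + substitute + sequence[offset + 1:]
            yield label, mutant


if __name__ == "__main__":
    toy_sequence = "MNH"  # hypothetical example, not a real protein
    mutants = dict(saturation_mutants(toy_sequence))
    print(len(mutants), mutants["N2A"])
```

A sequence of length n produces 19 × n single mutants, so the candidate list grows linearly with sequence length, whereas double and higher-order combinations grow much faster.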


2019 ◽  
Author(s):  
Rebecca F. Alford ◽  
Patrick J. Fleming ◽  
Karen G. Fleming ◽  
Jeffrey J. Gray

Protein design is a powerful tool for elucidating mechanisms of function and engineering new therapeutics and nanotechnologies. While soluble protein design has advanced, membrane protein design remains challenging due to difficulties in modeling the lipid bilayer. In this work, we developed an implicit approach that captures the anisotropic structure, shape of water-filled pores, and nanoscale dimensions of membranes with different lipid compositions. The model improves performance in computational benchmarks against experimental targets including prediction of protein orientations in the bilayer, ΔΔG calculations, native structure discrimination, and native sequence recovery. When applied to de novo protein design, this approach designs sequences with an amino acid distribution near the native amino acid distribution in membrane proteins, overcoming a critical flaw in previous membrane models that were prone to generating leucine-rich designs. Further, the proteins designed in the new membrane model exhibit native-like features including interfacial aromatic side chains, hydrophobic lengths compatible with bilayer thickness, and polar pores. Our method advances high-resolution membrane protein structure prediction and design toward tackling key biological questions and engineering challenges.

Significance Statement: Membrane proteins participate in many life processes including transport, signaling, and catalysis. They constitute over 30% of all proteins and are targets for over 60% of pharmaceuticals. Computational design tools for membrane proteins will transform the interrogation of basic science questions such as membrane protein thermodynamics and the pipeline for engineering new therapeutics and nanotechnologies. Existing tools are either too expensive to compute or rely on manual design strategies. In this work, we developed a fast and accurate method for membrane protein design. The tool is available to the public and will accelerate the experimental design pipeline for membrane proteins.


2020 ◽  
Vol 7 (8) ◽  
pp. 1410-1412
Author(s):  
Weijie Zhao ◽  
Chu Wang

Search ‘de novo protein design’ on Google and you will find the name David Baker in every result on the first page. Professor David Baker at the University of Washington and other scientists are opening up a new world of fantastic proteins. Proteins are the direct executors of most biological functions, and their structure and function are fully determined by their primary sequence. Baker's group developed the Rosetta software suite that enabled the computational prediction and design of protein structures. Being able to design proteins from scratch means being able to design executors for diverse purposes and to benefit society in multiple ways. Recently, NSR interviewed Prof. Baker on this fast-developing field and his personal experiences.


2014 ◽  
Vol 10 (4) ◽  
Author(s):  
Jaume Bonet ◽  
Andras Fiser ◽  
Baldo Oliva ◽  
Narcis Fernandez-Fuentes

Protein structures are made up of periodic and aperiodic structural elements (i.e., α-helices, β-strands and loops). Despite the apparent lack of regular structure, loops have specific conformations and play a central role in the folding, dynamics, and function of proteins. In this article, we review our previous work on protein loops as local supersecondary structural motifs, or Smotifs. We reexamine our work on the structural classification of loops (ArchDB) and its application to loop structure prediction (ArchPRED), including the assessment of the limits of knowledge-based loop structure prediction methods. We conclude by focusing on the modular nature of proteins, on how the concept of Smotifs provides a convenient and practical approach to decompose proteins into strings of concatenated Smotifs, and on how this can be used in computational protein design and protein structure prediction.


2015 ◽  
Vol 33 ◽  
pp. 16-26 ◽  
Author(s):  
Derek N Woolfson ◽  
Gail J Bartlett ◽  
Antony J Burton ◽  
Jack W Heal ◽  
Ai Niitsu ◽  
...  

2016 ◽  
Vol 44 (5) ◽  
pp. 1523-1529 ◽  
Author(s):  
James T. MacDonald ◽  
Paul S. Freemont

The computational algorithms used in the design of artificial proteins have become increasingly sophisticated in recent years, producing a series of remarkable successes. The most dramatic of these is the de novo design of artificial enzymes. The majority of these designs have reused naturally occurring protein structures as ‘scaffolds’ onto which novel functionality can be grafted without having to redesign the backbone structure. The incorporation of backbone flexibility into protein design is a much more computationally challenging problem due to the greatly increased search space, but promises to remove the limitations of reusing natural protein scaffolds. In this review, we outline the principles of computational protein design methods and discuss recent efforts to consider backbone plasticity in the design process.


2021 ◽  
Vol 8 ◽  
Author(s):  
Charles Christoffer ◽  
Vijay Bharadwaj ◽  
Ryan Luu ◽  
Daisuke Kihara

Protein-protein docking is a useful tool for modeling the structures of protein complexes that have yet to be experimentally determined. Understanding the structures of protein complexes is a key component for formulating hypotheses in biophysics regarding the functional mechanisms of complexes. Protein-protein docking is an established technique for cases where the structures of the subunits have been determined. While the number of known structures deposited in the Protein Data Bank is increasing, there are still many cases where the structures of individual proteins that users want to dock are not determined yet. Here, we have integrated the AttentiveDist method for protein structure prediction into our LZerD webserver for protein-protein docking, which enables users to simply submit protein sequences and obtain full-complex atomic models, without having to supply any structure themselves. We have further extended the LZerD docking interface with a symmetrical homodimer mode. The LZerD server is available at https://lzerd.kiharalab.org/.

