scholarly journals Structure-based protein design with deep learning

2021 ◽  
Vol 65 ◽  
pp. 136-144
Author(s):  
Sergey Ovchinnikov ◽  
Po-Ssu Huang
Keyword(s):  
Antibodies ◽  
2020 ◽  
Vol 9 (2) ◽  
pp. 12 ◽  
Author(s):  
Jordan Graves ◽  
Jacob Byerly ◽  
Eduardo Priego ◽  
Naren Makkapati ◽  
S. Vince Parish ◽  
...  

Driven by its successes across domains such as computer vision and natural language processing, deep learning has recently entered the field of biology by aiding in cellular image classification, finding genomic connections, and advancing drug discovery. In drug discovery and protein engineering, a major goal is to design a molecule that will perform a useful function as a therapeutic drug. Typically, the focus has been on small molecules, but new approaches have been developed to apply these same principles of deep learning to biologics, such as antibodies. Here we give a brief background of deep learning as it applies to antibody drug development, and an in-depth explanation of several deep learning algorithms that have been proposed to solve aspects of both protein design in general, and antibody design in particular.


Molecules ◽  
2020 ◽  
Vol 25 (14) ◽  
pp. 3250 ◽  
Author(s):  
Eugene Lin ◽  
Chieh-Hsin Lin ◽  
Hsien-Yuan Lane

A growing body of evidence now suggests that artificial intelligence and machine learning techniques can serve as an indispensable foundation for the process of drug design and discovery. In light of latest advancements in computing technologies, deep learning algorithms are being created during the development of clinically useful drugs for treatment of a number of diseases. In this review, we focus on the latest developments for three particular arenas in drug design and discovery research using deep learning approaches, such as generative adversarial network (GAN) frameworks. Firstly, we review drug design and discovery studies that leverage various GAN techniques to assess one main application such as molecular de novo design in drug design and discovery. In addition, we describe various GAN models to fulfill the dimension reduction task of single-cell data in the preclinical stage of the drug development pipeline. Furthermore, we depict several studies in de novo peptide and protein design using GAN frameworks. Moreover, we outline the limitations in regard to the previous drug design and discovery studies using GAN models. Finally, we present a discussion of directions and challenges for future research.


Author(s):  
Namrata Anand-Achim ◽  
Raphael R. Eguchi ◽  
Alexander Derry ◽  
Russ B. Altman ◽  
Po-Ssu Huang

AbstractThe primary challenge of fixed-backbone protein design is to find a distribution of sequences that fold to the backbone of interest. This task is central to nearly all protein engineering problems, as achieving a particular backbone conformation is often a prerequisite for hosting specific functions. In this study, we investigate the capability of a deep neural network to learn the requisite patterns needed to design sequences. The trained model serves as a potential function defined over the space of amino acid identities and rotamer states, conditioned on the local chemical environment at each residue. While most deep learning based methods for sequence design only produce amino acid sequences, our method generates full-atom structural models, which can be evaluated using established sequence quality metrics. Under these metrics we are able to produce realistic and variable designs with quality comparable to the state-of-the-art. Additionally, we experimentally test designs for a de novo TIM-barrel structure and find designs that fold, demonstrating the algorithm’s generalizability to novel structures. Overall, our results demonstrate that a deep learning model can match state-of-the-art energy functions for guiding protein design.SignificanceProtein design tasks typically depend on carefully modeled and parameterized heuristic energy functions. In this study, we propose a novel machine learning method for fixed-backbone protein sequence design, using a learned neural network potential to not only design the sequence of amino acids but also select their side-chain configurations, or rotamers. Factoring through a structural representation of the protein, the network generates designs on par with the state-of-the-art, despite having been entirely learned from data. These results indicate an exciting future for protein design driven by machine learning.


Author(s):  
Jinbo Xu ◽  
Matthew Mcpartlon ◽  
Jin Li

We describe our latest study of the deep convolutional residual neural networks (ResNet) for protein structure prediction, including deeper and wider ResNets, the efficacy of different input features, and improved 3D model building methods. Our ResNet can predict correct folds (TMscore>0.5) for 26 out of 32 CASP13 FM (template-free-modeling) targets and L/5 long-range contacts for these targets with precision over 80%, a significant improvement over the CASP13 results. Although co-evolution analysis plays an important role in the most successful structure prediction methods, we show that when co-evolution is not used, our ResNet can still predict correct folds for 18 of the 32 CASP13 FM targets including several large ones. This marks a significant improvement over the top co-evolution-based, non-deep learning methods at CASP13, and other non-coevolution-based deep learning models, such as the popular recurrent geometric network (RGN). With only primary sequence, our ResNet can also predict correct folds for all 21 human-designed proteins we tested. In contrast, RGN predicts correct folds for only 3 human-designed proteins and zero CASP13 FM target. In addition, we find that ResNet may fare better for the human-designed proteins when trained without co-evolution information than with co-evolution. These results suggest that ResNet does not simply denoise co-evolution signals, but instead is able to learn important sequence-structure relationship from experimental structures. This has important implications on protein design and engineering especially when evolutionary information is not available.


2018 ◽  
Author(s):  
Mohammed AlQuraishi

AbstractPredicting protein structure from sequence is a central challenge of biochemistry. Co‐evolution methods show promise, but an explicit sequence‐to‐structure map remains elusive. Advances in deep learning that replace complex, human‐designed pipelines with differentiable models optimized end‐to‐end suggest the potential benefits of similarly reformulating structure prediction. Here we report the first end‐to‐end differentiable model of protein structure. The model couples local and global protein structure via geometric units that optimize global geometry without violating local covalent chemistry. We test our model using two challenging tasks: predicting novel folds without co‐evolutionary data and predicting known folds without structural templates. In the first task the model achieves state‐of‐the‐art accuracy and in the second it comes within 1‐2Å; competing methods using co‐evolution and experimental templates have been refined over many years and it is likely that the differentiable approach has substantial room for further improvement, with applications ranging from drug discovery to protein design.


2018 ◽  
Vol 8 (1) ◽  
Author(s):  
Jingxue Wang ◽  
Huali Cao ◽  
John Z. H. Zhang ◽  
Yifei Qi

2021 ◽  
Vol 22 (21) ◽  
pp. 11741
Author(s):  
Marianne Defresne ◽  
Sophie Barbe ◽  
Thomas Schiex

Computational Protein Design (CPD) has produced impressive results for engineering new proteins, resulting in a wide variety of applications. In the past few years, various efforts have aimed at replacing or improving existing design methods using Deep Learning technology to leverage the amount of publicly available protein data. Deep Learning (DL) is a very powerful tool to extract patterns from raw data, provided that data are formatted as mathematical objects and the architecture processing them is well suited to the targeted problem. In the case of protein data, specific representations are needed for both the amino acid sequence and the protein structure in order to capture respectively 1D and 3D information. As no consensus has been reached about the most suitable representations, this review describes the representations used so far, discusses their strengths and weaknesses, and details their associated DL architecture for design and related tasks.


Author(s):  
Stellan Ohlsson
Keyword(s):  

Nature ◽  
2009 ◽  
Author(s):  
Erika Check Hayden
Keyword(s):  

Sign in / Sign up

Export Citation Format

Share Document