Protein Design with Deep Learning

2021 ◽  
Vol 22 (21) ◽  
pp. 11741
Author(s):  
Marianne Defresne ◽  
Sophie Barbe ◽  
Thomas Schiex

Computational Protein Design (CPD) has produced impressive results for engineering new proteins, resulting in a wide variety of applications. In the past few years, various efforts have aimed at replacing or improving existing design methods using Deep Learning technology to leverage the amount of publicly available protein data. Deep Learning (DL) is a very powerful tool to extract patterns from raw data, provided that data are formatted as mathematical objects and the architecture processing them is well suited to the targeted problem. In the case of protein data, specific representations are needed for both the amino acid sequence and the protein structure in order to capture respectively 1D and 3D information. As no consensus has been reached about the most suitable representations, this review describes the representations used so far, discusses their strengths and weaknesses, and details their associated DL architecture for design and related tasks.
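As a minimal illustration of what "formatting protein data as mathematical objects" can mean, the sketch below builds one simple sequence representation (one-hot encoding) and one simple structure representation (a pairwise distance matrix). These are common choices, not necessarily the ones the review favors; the function names, the 20-letter alphabet ordering, and the use of one coordinate per residue are assumptions made for the example.

```python
import math

# Assumed ordering of the 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(sequence):
    """Encode a sequence as one 20-dimensional indicator vector per residue (1D information)."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    encoded = []
    for aa in sequence:
        row = [0] * len(AMINO_ACIDS)
        row[index[aa]] = 1  # raises KeyError for non-standard residues
        encoded.append(row)
    return encoded


def distance_matrix(coords):
    """Encode a structure as pairwise Euclidean distances between residues (3D information).

    `coords` holds one (x, y, z) point per residue, e.g. the C-alpha atom (an assumption).
    """
    return [[math.dist(a, b) for b in coords] for a in coords]


if __name__ == "__main__":
    print(one_hot("AC")[1][:3])                           # indicator for 'C'
    print(distance_matrix([(0, 0, 0), (3, 4, 0)])[0][1])  # distance between two residues
```

A distance matrix is invariant to rotation and translation of the structure, which is one reason it is a frequent input choice; a one-hot encoding carries no notion of similarity between amino acids.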

1981 ◽  
Vol 37 (a1) ◽  
pp. C18-C18
Author(s):  
M. J. E. Sternberg ◽  
F. E. Cohen ◽  
W. R. Taylor

2019 ◽  
Author(s):  
Rebecca F. Alford ◽  
Patrick J. Fleming ◽  
Karen G. Fleming ◽  
Jeffrey J. Gray

Abstract: Protein design is a powerful tool for elucidating mechanisms of function and engineering new therapeutics and nanotechnologies. While soluble protein design has advanced, membrane protein design remains challenging due to difficulties in modeling the lipid bilayer. In this work, we developed an implicit approach that captures the anisotropic structure, shape of water-filled pores, and nanoscale dimensions of membranes with different lipid compositions. The model improves performance in computational benchmarks against experimental targets including prediction of protein orientations in the bilayer, ΔΔG calculations, native structure discrimination, and native sequence recovery. When applied to de novo protein design, this approach designs sequences with an amino acid distribution near the native amino acid distribution in membrane proteins, overcoming a critical flaw in previous membrane models that were prone to generating leucine-rich designs. Further, the proteins designed in the new membrane model exhibit native-like features including interfacial aromatic side chains, hydrophobic lengths compatible with bilayer thickness, and polar pores. Our method advances high-resolution membrane protein structure prediction and design toward tackling key biological questions and engineering challenges.

Significance Statement: Membrane proteins participate in many life processes including transport, signaling, and catalysis. They constitute over 30% of all proteins and are targets for over 60% of pharmaceuticals. Computational design tools for membrane proteins will transform the interrogation of basic science questions such as membrane protein thermodynamics and the pipeline for engineering new therapeutics and nanotechnologies. Existing tools are either too expensive to compute or rely on manual design strategies. In this work, we developed a fast and accurate method for membrane protein design. The tool is available to the public and will accelerate the experimental design pipeline for membrane proteins.
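Two of the quantities mentioned in this abstract, native sequence recovery and amino acid distribution, can be illustrated with a short sketch. It assumes the usual reading of sequence recovery as the fraction of aligned positions where the designed residue matches the native one; the authors' exact benchmark protocol is not given here, and the function names and example sequences are hypothetical.

```python
from collections import Counter


def sequence_recovery(native, designed):
    """Fraction of positions where the designed residue equals the native residue.

    Assumes the two sequences are already aligned and of equal, non-zero length.
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(n == d for n, d in zip(native, designed))
    return matches / len(native)


def composition(sequence):
    """Relative frequency of each amino acid in a sequence."""
    counts = Counter(sequence)
    return {aa: count / len(sequence) for aa, count in counts.items()}


if __name__ == "__main__":
    native = "ACDE"    # hypothetical native sequence
    designed = "ACDF"  # hypothetical designed sequence
    print(sequence_recovery(native, designed))
    # A leucine-rich design shows up as an inflated 'L' frequency.
    print(composition("LLLA").get("L", 0.0))
```

Comparing `composition` of designed sequences against that of native membrane proteins is one simple way to detect the leucine bias described above.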


2016 ◽  
Author(s):  
Eleisha L. Jackson ◽  
Stephanie J. Spielman ◽  
Claus O. Wilke

Abstract: Proteins evolve through two primary mechanisms: substitution, where mutations alter a protein’s amino-acid sequence, and insertions and deletions (indels), where amino acids are either added to or removed from the sequence. Protein structure has been shown to influence the rate at which substitutions accumulate across sites in proteins, but whether structure similarly constrains the occurrence of indels has not been rigorously studied. Here, we investigate the extent to which structural properties known to covary with protein evolutionary rates might also predict protein tolerance to indels. Specifically, we analyze a publicly available dataset of single–amino-acid deletion mutations in enhanced green fluorescent protein (eGFP) to assess how well the functional effect of deletions can be predicted from protein structure. We find that weighted contact number (WCN), which measures how densely packed a residue is within the protein’s three-dimensional structure, provides the best single predictor for whether eGFP will tolerate a given deletion. We additionally find that using protein design to explicitly model deletions results in improved predictions of functional status when combined with other structural predictors. Our work suggests that structure plays a fundamental role in constraining deletions at sites in proteins, and further that similar biophysical constraints influence both substitutions and deletions. This study therefore provides a solid foundation for future work to examine how protein structure influences tolerance of more complex indel events, such as insertions or large deletions.
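The abstract describes WCN only qualitatively. A minimal sketch follows, assuming the commonly used definition in which the WCN of residue i is the sum of inverse squared distances to every other residue, WCN_i = Σ_{j≠i} 1/r_ij². Which atom represents each residue (C-alpha or side-chain centroid) is a choice the abstract does not state; the function name and toy coordinates are hypothetical.

```python
import math


def weighted_contact_number(coords):
    """Return the WCN of each residue: the sum over all other residues of 1 / r_ij**2.

    `coords` holds one (x, y, z) point per residue; no two residues may share a point.
    """
    wcn = []
    for i, a in enumerate(coords):
        total = 0.0
        for j, b in enumerate(coords):
            if i != j:
                total += 1.0 / (math.dist(a, b) ** 2)
        wcn.append(total)
    return wcn


if __name__ == "__main__":
    # Three residues on a line, 1 unit apart: the middle one is the most densely packed.
    print(weighted_contact_number([(0, 0, 0), (1, 0, 0), (2, 0, 0)]))
```

Under this definition, buried residues with many close neighbors receive high values, which matches the abstract's description of WCN as a measure of packing density.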


Author(s):  
Hoseok Choi ◽  
Seokbeen Lim ◽  
Kyeongran Min ◽  
Kyoung-ha Ahn ◽  
Kyoung-Min Lee ◽  
...  

Abstract. Objective: With advances in neural networks, Explainable AI (XAI) is being studied to ensure that artificial intelligence models can be explained. There have been some attempts to apply neural networks to neuroscientific studies to explain neurophysiological information with high machine learning performance. However, most of those studies have simply visualized features extracted from XAI and seem to lack an active neuroscientific interpretation of those features. In this study, we have tried to actively explain the high-dimensional learning features contained in the neurophysiological information extracted from XAI by comparing them with previously reported neuroscientific results. Approach: We designed a deep neural network classifier using 3D information (3D DNN) and a 3D class activation map (3D CAM) to visualize high-dimensional classification features. We used those tools to classify monkey electrocorticogram (ECoG) data obtained from a unimanual and bimanual movement experiment. Main results: The 3D DNN showed better classification accuracy than other machine learning techniques, such as a 2D DNN. Unexpectedly, the activation weight in the 3D CAM analysis was high in the ipsilateral motor and somatosensory cortex regions, whereas the gamma-band power was activated in the contralateral areas during unimanual movement, which suggests that the brain signal acquired from the motor cortex contains information about both contralateral movement and ipsilateral movement. Moreover, the hand-movement classification system used critical temporal information at movement onset and offset when classifying bimanual movements. Significance: As far as we know, this is the first study to use high-dimensional neurophysiological information (spatial, spectral, and temporal) with the deep learning method, reconstruct those features, and explain how the neural network works. We expect that our methods can be widely applied and used in neuroscience and electrophysiology research from the point of view of the explainability of XAI as well as its performance.
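The abstract does not say how the 3D CAM is computed. The sketch below assumes the standard class activation map formulation, in which the map for a class is the weighted sum of the last convolutional layer's feature maps, using that class's weights from the final linear layer. Each 3D feature map is flattened into a list of voxel activations for brevity; the function name and toy values are hypothetical.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of feature maps for one class.

    feature_maps:  list of K maps, each a flat list of voxel activations of equal length.
    class_weights: list of K weights linking each feature map to the chosen class.
    """
    if len(feature_maps) != len(class_weights):
        raise ValueError("need exactly one weight per feature map")
    n_voxels = len(feature_maps[0])
    cam = [0.0] * n_voxels
    for fmap, weight in zip(feature_maps, class_weights):
        for v in range(n_voxels):
            cam[v] += weight * fmap[v]
    return cam


if __name__ == "__main__":
    # Two hypothetical feature maps over two voxels, with weights for one class.
    print(class_activation_map([[1.0, 0.0], [0.0, 2.0]], [0.5, 1.0]))
```

Voxels with large values in the resulting map are those that contributed most to the class score. In the study's setting, the voxel axes would presumably correspond to the spatial, spectral, and temporal dimensions of the ECoG input.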

