Modeling mutational effects on biochemical phenotypes using convolutional neural networks: application to SARS-CoV-2

2021 ◽  
Author(s):  
Bo Wang ◽  
Eric R. Gamazon

ABSTRACT: Biochemical phenotypes are major indexes for protein structure and function characterization. They are determined, at least in part, by the intrinsic physicochemical properties of amino acids and may be reflected in the protein three-dimensional structure. Modeling mutational effects on biochemical phenotypes is a critical step for understanding protein function and disease mechanism as well as enabling drug discovery. Deep Mutational Scanning (DMS) experiments have been performed on SARS-CoV-2’s spike receptor binding domain and the human ACE2 zinc-binding peptidase domain (both central players in viral infection and evolution and in antibody evasion), quantifying how mutations impact binding affinity and protein expression. Here, we modeled biochemical phenotypes from massively parallel assays, using convolutional neural networks trained on protein sequence mutations in the virus and human host. We found that neural networks are significantly predictive of binding affinity, protein expression, and antibody escape, learning complex interactions and higher-order features that are difficult to capture with conventional methods from structural biology. Integrating the intrinsic physicochemical properties of amino acids, including hydrophobicity, solvent-accessible surface area, and long-range non-bonded energy per atom, significantly improved prediction (empirical p<0.01), although the sequence data alone were sufficient to yield reasonably good prediction. We observed concordance of the DMS data and our neural network predictions with an independent study on intermolecular interactions from molecular dynamics (multiple 500 ns or 1 μs all-atom) simulations of the spike protein-ACE2 interface, with critical implications for the use of deep learning to dissect molecular mechanisms. The mutation- or genetically-determined component of a biochemical phenotype estimated from the neural networks has improved causal inference properties relative to the original phenotype and can facilitate crucial insights into disease pathophysiology and therapeutic design.
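As a rough illustration of how a mutated sequence might be combined with a physicochemical property before being passed to a convolutional network, the sketch below builds a per-residue feature matrix: a 20-way one-hot encoding plus one hydrophobicity channel. The abstract does not state which hydrophobicity scale or encoding the authors used; the Kyte-Doolittle scale, the function names `apply_mutation` and `encode`, and the three-residue toy sequence are all assumptions made for this example.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values (an assumed choice of scale; the
# abstract only says "hydrophobicity").
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def apply_mutation(seq, mutation):
    """Apply a substitution such as 'G2A' (wild type, 1-based position, mutant)."""
    wild_type, position, mutant = mutation[0], int(mutation[1:-1]), mutation[-1]
    if seq[position - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at position {position}")
    return seq[:position - 1] + mutant + seq[position:]


def encode(seq):
    """Return one row per residue: 20 one-hot values plus one hydropathy value."""
    rows = []
    for residue in seq:
        one_hot = [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]
        rows.append(one_hot + [KD_HYDROPATHY[residue]])
    return rows


# Hypothetical toy sequence, not taken from the spike protein or ACE2.
mutant_seq = apply_mutation("NGY", "G2A")
features = encode(mutant_seq)
```

Each row of `features` has 21 entries, so a sequence of length n becomes an n × 21 matrix. Further properties named in the abstract, such as solvent-accessible surface area, could be appended as extra columns in the same way.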

2021 ◽  
Vol 17 ◽  
pp. 439-460
Author(s):  
Vladimir Kubyshkin ◽  
Rebecca Davis ◽  
Nediljko Budisa

Due to the heterocyclic structure and distinct conformational profile, proline is unique in the repertoire of the 20 amino acids coded into proteins. Here, we summarize the biochemical work on the replacement of proline with (4R)- and (4S)-fluoroproline as well as 4,4-difluoroproline in proteins done mainly in the last two decades. We first recapitulate the complex position and biochemical fate of proline in the biochemistry of a cell, discuss the physicochemical properties of fluoroprolines, and overview the attempts to use these amino acids as proline replacements in studies of protein production and folding. Fluorinated proline replacements are able to elevate the protein expression speed and yields and improve the thermodynamic and kinetic folding profiles of individual proteins. In this context, fluoroprolines can be viewed as useful tools in the biotechnological toolbox. As a prospect, we envision that proteome-wide proline-to-fluoroproline substitutions could be possible. We suggest a hypothetical scenario for the use of laboratory evolutionary methods with fluoroprolines as a suitable vehicle to introduce fluorine into living cells. This approach may enable creation of synthetic cells endowed with artificial biodiversity, containing fluorine as a bioelement.


2017 ◽  
Author(s):  
Jianjun Hu ◽  
Zhonghao Liu

Abstract: Convolutional neural networks (CNNs) have been shown to outperform conventional methods in DNA-protein binding specificity prediction. However, whether this success can be transferred to protein-peptide binding affinity prediction depends on appropriate design of the CNN architecture, which calls for a thorough understanding of how to match the architecture to the problem. Here we propose DeepMHC, a deep CNN-based protein-peptide binding prediction algorithm that achieves better performance in MHC-I peptide binding affinity prediction than conventional algorithms. Our model takes only raw binding peptide sequences as input, without needing any human-designed features or other physicochemical or evolutionary information about the amino acids. Our CNN models are shown to be able to learn non-linear relationships among the amino acid positions of the peptides and to achieve highly competitive performance on most of the IEDB benchmark datasets with a single model architecture and without using any consensus or composite ensemble classifier models. By systematically exploring the best CNN architecture, we identified critical design considerations in CNN architecture development for peptide-MHC binding prediction.
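To make concrete what "raw peptide sequences as input" to a convolutional layer can look like, here is a minimal pure-Python sketch of a single 1D convolution filter sliding over a one-hot encoded peptide, followed by ReLU and global max pooling. It is not the DeepMHC architecture, which the abstract does not specify; the filter weights, the width-2 motif, and the example 9-mer are hypothetical choices for illustration only.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(peptide):
    """Encode a peptide as a list of 20-dimensional one-hot rows."""
    return [[1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]
            for residue in peptide]


def conv1d_relu(x, kernel, bias=0.0):
    """Slide one filter over the positions (no padding) and apply ReLU."""
    width = len(kernel)
    outputs = []
    for start in range(len(x) - width + 1):
        total = bias
        for offset in range(width):
            for channel, weight in enumerate(kernel[offset]):
                total += weight * x[start + offset][channel]
        outputs.append(max(0.0, total))
    return outputs


def global_max_pool(values):
    """Keep the strongest filter response across all positions."""
    return max(values)


# Hypothetical hand-set filter that fires only when 'T' is followed by 'V'.
kernel = [[0.0] * 20, [0.0] * 20]
kernel[0][AMINO_ACIDS.index("T")] = 1.0
kernel[1][AMINO_ACIDS.index("V")] = 1.0

peptide = "SLYNTVATL"  # illustrative 9-mer
feature_map = conv1d_relu(one_hot(peptide), kernel, bias=-1.0)
motif_score = global_max_pool(feature_map)
```

With a bias of -1, the filter responds only where both positions match, which is a small example of a non-linear relationship between adjacent residues. In a trained network the filter weights would be learned from binding data rather than set by hand.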


2020 ◽  
Vol 2020 (10) ◽  
pp. 28-1-28-7 ◽  
Author(s):  
Kazuki Endo ◽  
Masayuki Tanaka ◽  
Masatoshi Okutomi

Classification of degraded images is very important in practice because images are usually degraded by compression, noise, blurring, etc. Nevertheless, most of the research in image classification only focuses on clean images without any degradation. Some papers have already proposed deep convolutional neural networks composed of an image restoration network and a classification network to classify degraded images. This paper proposes an alternative approach in which we use a degraded image and an additional degradation parameter for classification. The proposed classification network has two inputs which are the degraded image and the degradation parameter. The estimation network of degradation parameters is also incorporated if degradation parameters of degraded images are unknown. The experimental results showed that the proposed method outperforms a straightforward approach where the classification network is trained with degraded images only.
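The two-input idea can be sketched as follows, assuming the degradation parameter is a single scalar that is concatenated with image features before a final linear layer and softmax. The abstract does not say how the two inputs are fused, so concatenation, the function name `classify_with_degradation`, and all numeric values below are assumptions for illustration; in practice the image features would come from convolutional layers rather than being given directly.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def classify_with_degradation(image_features, degradation_param, weights, biases):
    """Fuse image features with a scalar degradation parameter, then classify."""
    # Assumed fusion strategy: append the parameter as one extra input feature.
    fused = list(image_features) + [float(degradation_param)]
    logits = []
    for row, bias in zip(weights, biases):
        logits.append(sum(w * x for w, x in zip(row, fused)) + bias)
    return softmax(logits)


# Hypothetical toy values: 3 image features, 2 classes, 4 weights per class
# (the last weight in each row multiplies the degradation parameter).
features = [0.2, -0.5, 1.0]
weights = [[0.5, -0.3, 0.8, 1.0],
           [-0.4, 0.6, 0.1, -1.0]]
biases = [0.0, 0.1]

probs_mild = classify_with_degradation(features, 0.1, weights, biases)
probs_heavy = classify_with_degradation(features, 0.9, weights, biases)
```

The same image features produce different class probabilities at different degradation levels, which is the behavior the second input is meant to allow. When the degradation level is unknown, the abstract indicates that a separate estimation network supplies the parameter.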

