Deep convolutional neural networks for predicting the quality of single protein structural models

2019 ◽  
Author(s):  
Jie Hou ◽  
Renzhi Cao ◽  
Jianlin Cheng

Abstract: Predicting the global quality and the local (residue-specific) quality of a single protein structural model is important for protein structure prediction and its applications. In this work, we developed a deep one-dimensional convolutional neural network (1DCNN) that predicts the absolute local quality of a single protein model, as well as two 1DCNNs that predict local and global quality simultaneously through a novel multi-task learning framework. The networks accept sequential and structural features (i.e., amino acid sequence, agreement of secondary structure and solvent accessibility, residue disorder properties and Rosetta energies) of a protein model of any size as input to predict its quality, unlike existing methods that use a fixed number of hand-crafted features as input. Our three methods (InteractQA-net, JointQA-net and LocalQA-net) were trained on the structural models of the single-domain protein targets of CASP8, CASP9 and CASP10 and evaluated on the models of CASP11 and CASP12 targets. The results show that the performance of our deep learning methods is comparable to that of state-of-the-art quality assessment methods. Our study also demonstrates that combining local and global quality predictions improves the accuracy of global quality prediction. The source code and executables of our methods are available at: https://github.com/multicom-toolbox/DeepCovQA
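The idea of scoring a protein model of any size with a 1D convolution can be illustrated with a minimal sketch. This is not the authors' architecture: the single convolutional layer, the toy weights, the two-feature input and the mean pooling used for the global score are all assumptions made for illustration only.

```python
import math

def conv1d_same(features, kernel, bias=0.0):
    """1D convolution along the residue axis with zero padding.

    features: list of L per-residue feature vectors, each of length F.
    kernel:   list of K weight vectors (K odd), each of length F.
    Returns L raw scores, so a chain of any length L is accepted.
    """
    half = len(kernel) // 2
    length = len(features)
    out = []
    for i in range(length):
        total = bias
        for k, weights in enumerate(kernel):
            j = i + k - half
            if 0 <= j < length:  # positions outside the chain act as zero padding
                total += sum(w * x for w, x in zip(weights, features[j]))
        out.append(total)
    return out

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict_quality(features, kernel, bias=0.0):
    """Return (local_scores, global_score) for one structural model."""
    local = [sigmoid(s) for s in conv1d_same(features, kernel, bias)]
    # Assumed pooling: the global score is the mean of the local scores.
    global_score = sum(local) / len(local)
    return local, global_score

# Hypothetical weights and two toy models of different lengths.
kernel = [[0.5, -0.2], [1.0, 0.3], [0.5, -0.2]]
short_model = [[0.1, 0.9], [0.4, 0.2], [0.7, 0.5]]
long_model = short_model + [[0.3, 0.3], [0.8, 0.1]]

local_short, global_short = predict_quality(short_model, kernel)
local_long, global_long = predict_quality(long_model, kernel)
```

Because the same kernel slides along the chain, the number of weights does not depend on protein length, and one pass yields both per-residue and whole-model scores, which is the property a multi-task setup can exploit.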

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Xiao Chen ◽  
Jian Liu ◽  
Zhiye Guo ◽  
Tianqi Wu ◽  
Jie Hou ◽  
...  

Abstract: Inter-residue contact prediction and deep learning showed promise for improving the estimation of protein model accuracy (EMA) in the 13th Critical Assessment of Protein Structure Prediction (CASP13). To further leverage improved inter-residue distance predictions to enhance EMA, during the 2020 CASP14 experiment we integrated several new inter-residue distance features with existing model quality assessment features in several deep learning methods to predict the quality of protein structural models. Evaluated on selecting the best model from the models of CASP14 targets, our three multi-model EMA predictors (MULTICOM-CONSTRUCT, MULTICOM-AI, and MULTICOM-CLUSTER) achieve average losses of 0.073, 0.079, and 0.081, respectively, in terms of the global distance test score (GDT-TS). The three methods are ranked first, second, and third among all 68 CASP14 predictors. MULTICOM-DEEP, the single-model EMA predictor, is ranked within the top 10 among all single-model EMA methods according to GDT-TS score loss. The results demonstrate that inter-residue distance features are valuable inputs for deep learning to predict the quality of protein structural models. However, larger training datasets and better ways of leveraging inter-residue distance information are needed to fully explore their potential.
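The GDT-TS loss used above for ranking can be sketched as follows: for each target, it is the true GDT-TS of the best available model minus the true GDT-TS of the model that the predictor ranks first, averaged over targets. The function names and the toy scores (on a 0 to 1 scale) are hypothetical.

```python
def gdt_ts_loss(predicted, true_gdt_ts):
    """Loss of selecting the top-ranked model for one target.

    predicted:   dict mapping model name -> predicted quality score
    true_gdt_ts: dict mapping model name -> true GDT-TS of that model
    """
    selected = max(predicted, key=predicted.get)  # model the predictor ranks first
    best = max(true_gdt_ts.values())              # best model actually available
    return best - true_gdt_ts[selected]

def average_loss(targets):
    """Mean loss over a list of (predicted, true_gdt_ts) pairs, one per target."""
    losses = [gdt_ts_loss(p, t) for p, t in targets]
    return sum(losses) / len(losses)

# Two made-up targets with two models each.
targets = [
    ({"m1": 0.80, "m2": 0.60}, {"m1": 0.70, "m2": 0.75}),  # selects m1, not the best
    ({"m1": 0.30, "m2": 0.90}, {"m1": 0.40, "m2": 0.65}),  # selects m2, the best
]
mean_loss = average_loss(targets)
```

A loss of zero means the predictor always picked the best model, so lower values indicate better model selection.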


2021 ◽  
Author(s):  
Xiao Chen ◽  
Jianlin Cheng

Abstract

Background: Estimation of the accuracy (quality) of protein structural models is important for both the prediction and the use of protein structural models. Deep learning methods have been used to integrate protein structure features to predict the quality of protein models. Inter-residue distances are key information for predicting a protein's tertiary structure and therefore have good potential for predicting the quality of protein structural models. However, few methods have been developed to take full advantage of predicted inter-residue distance maps to estimate the accuracy of a single protein structural model.

Results: We developed an attentive 2D convolutional neural network (CNN) with channel-wise attention that takes as its only input a raw difference map between the inter-residue distance map calculated from a single protein model and the distance map predicted from the protein sequence, and predicts the quality of the model. The network comprises multiple convolutional layers, batch normalization layers, dense layers, and Squeeze-and-Excitation blocks with attention to automatically extract features relevant to protein model quality from the raw input without using any expert-curated features. We evaluated the capability of this method, DISTEMA, to select the best models for CASP13 targets in terms of the ranking loss of GDT-TS score. The ranking loss of DISTEMA is 0.079, lower than that of several state-of-the-art single-model quality assessment methods. The work demonstrates that raw inter-residue distance information alone, combined with deep learning, can predict the quality of protein structural models reasonably well.
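The raw difference map described above can be sketched in a few lines. The choice of one representative atom per residue, the signed (rather than absolute) difference and the toy coordinates are assumptions; the abstract does not specify them.

```python
import math

def distance_map(coords):
    """L x L matrix of Euclidean distances between per-residue coordinates."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def difference_map(model_coords, predicted_map):
    """Element-wise difference: distances in the model minus predicted distances."""
    model_map = distance_map(model_coords)
    return [
        [m - p for m, p in zip(model_row, predicted_row)]
        for model_row, predicted_row in zip(model_map, predicted_map)
    ]

# Toy two-residue example (coordinates and predicted distances are made up).
model_coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]  # residues are 5.0 apart in the model
predicted_map = [[0.0, 4.0], [4.0, 0.0]]           # sequence-based prediction says 4.0
diff = difference_map(model_coords, predicted_map)
```

The resulting L x L matrix is the kind of image-like input a 2D CNN can consume directly; entries near zero mark residue pairs where the model agrees with the predicted distances.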


2021 ◽  
Author(s):  
Kyle Hippe ◽  
Cade Lilley ◽  
William Berkenpas ◽  
Kiyomi Kishaba ◽  
Renzhi Cao

Abstract

Motivation: The estimation of model accuracy is a cornerstone problem in bioinformatics. When predictions are made for proteins whose native structure is unknown, it is difficult to tell how good a tertiary structure prediction is, especially in the protein binding regions, which are useful for drug discovery. Currently, most methods only evaluate the overall quality of a protein decoy, and few work at the residue level or on protein complexes. Here we introduce ZoomQA, a novel single-model method for assessing the accuracy of a tertiary protein structure or complex prediction at the residue level. ZoomQA differs from other methods by considering the change in chemical and physical features of a fragment structure (a portion of a protein within a radius r of the target amino acid) as the radius of contact increases. Fourteen physical and chemical properties of amino acids are used to build a comprehensive representation of every residue within a protein and to grade its placement within the protein as a whole. Moreover, ZoomQA can evaluate the quality of protein complexes, which is unique.

Results: We benchmarked ZoomQA on CASP14; it outperforms other state-of-the-art local QA methods and rivals state-of-the-art QA methods in global prediction metrics. Our experiments show the efficacy of these new features and show that our method can match the performance of other state-of-the-art methods without homology searches against a database or a PSSM matrix.

Availability: http://[email protected]
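The fragment idea above (residues within a radius r of a target residue, tracked as r grows) can be sketched as below. This is an illustration rather than the ZoomQA implementation: the single property per residue, the use of a mean as the summary, and the chosen radii are all assumptions.

```python
import math

def fragment(coords, target, radius):
    """Indices of residues within `radius` of the target residue (target included)."""
    center = coords[target]
    return [i for i, c in enumerate(coords) if math.dist(center, c) <= radius]

def zoom_profile(coords, values, target, radii):
    """Mean property value of the fragment at each radius, plus the change between radii."""
    means = []
    for r in radii:
        idx = fragment(coords, target, r)
        means.append(sum(values[i] for i in idx) / len(idx))
    changes = [b - a for a, b in zip(means, means[1:])]
    return means, changes

# Four toy residues on a line, 4 units apart, with one made-up property value each.
coords = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
values = [1.0, 3.0, 5.0, -1.0]
means, changes = zoom_profile(coords, values, target=0, radii=[5.0, 9.0, 13.0])
```

Repeating this for each of several properties and for every residue would give a per-residue feature vector that describes how the local environment changes as the view zooms out.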


2020 ◽  
Author(s):  
Jianheng Liu ◽  
Tao Huang ◽  
Yusen Zhang ◽  
Tianxuan Zhao ◽  
Xueni Zhao ◽  
...  

Abstract: mRNA m5C, which has recently been implicated in the regulation of mRNA mobility, metabolism, and translation, plays important regulatory roles in various biological events. Two types of m5C sites are found in mRNAs. Type I m5C sites, which contain a downstream G-rich triplet motif and are computationally predicted to be located at the 5’ end of putative hairpin structures, are methylated by NSUN2. Type II m5C sites contain a downstream UCCA motif and are computationally predicted to be located in the loops of putative hairpin structures. However, their biogenesis remains unknown. Here we identified NSUN6, a methyltransferase that is known to methylate C72 of tRNAThr and tRNACys, as an mRNA methyltransferase that targets Type II m5C sites. Combining RNA secondary structure prediction, miCLIP, and results from a high-throughput mutagenesis analysis, we determined the RNA sequence and structural features governing the specificity of NSUN6-mediated mRNA methylation. Integrating these features into an NSUN6-RNA structural model, we identified an NSUN6 variant that largely loses tRNA methylation but retains mRNA methylation ability. Finally, we revealed a weak negative correlation between m5C methylation and translation efficiency. Our findings reveal that mRNA m5C is tightly controlled by an elaborate two-enzyme system, and the protein-RNA structure analysis strategy established here may be applied to other RNA modification writers to distinguish the functions of different RNA substrates of a writer protein.


2018 ◽  
Vol 6 (1) ◽  
Author(s):  
Lena Erdawati

The purpose of this study is to analyze how strongly the quality of information and the understanding of accounting influence the quality of financial statements of small and medium enterprises (SMEs) in Tangerang Regency. The respondents were 54 owners/managers of SMEs. The sampling technique was census sampling, and data were collected using a questionnaire. The study used the verification method to determine the effect of the quality of information and the understanding of accounting on the quality of financial statements. The statistical analysis consisted of designing the structural model, designing the measurement model, constructing the path diagram, and testing the model fit. The fit of the structural model and the hypotheses were tested using SmartPLS 3.0 software. The results show that the quality of information and the understanding of accounting have a significant effect on the quality of financial statements.


2021 ◽  
Author(s):  
Xiao Chen ◽  
Jian Liu ◽  
Zhiye Guo ◽  
Tianqi Wu ◽  
Jie Hou ◽  
...  

Abstract: Inter-residue contact prediction and deep learning showed promise for improving the estimation of protein model accuracy (EMA) in the 13th Critical Assessment of Protein Structure Prediction (CASP13). During the 2020 CASP14 experiment, we developed and tested several EMA predictors that used deep learning with new features based on inter-residue distance/contact predictions as well as existing model quality features. The average global distance test (GDT-TS) score loss of ranking CASP14 structural models by three multi-model MULTICOM EMA predictors (MULTICOM-CONSTRUCT, MULTICOM-AI, and MULTICOM-CLUSTER) is 0.073, 0.079, and 0.081, respectively; these predictors are ranked first, second, and third out of 68 CASP14 EMA predictors. The single-model EMA predictor (MULTICOM-DEEP) is ranked 10th among all single-model EMA methods in terms of GDT-TS score loss. The results show that deep learning and contact/distance predictions are useful in ranking and selecting protein structural models.


2020 ◽  
Author(s):  
Jianheng Liu ◽  
Tao Huang ◽  
Yusen Zhang ◽  
Tianxuan Zhao ◽  
Xueni Zhao ◽  
...  

Abstract: mRNA m5C, which has recently been implicated in the regulation of mRNA mobility, metabolism, and translation, plays important regulatory roles in various biological events. Two types of m5C sites are found in mRNAs. Type I m5C sites, which contain a downstream G-rich triplet motif and are computationally predicted to be located at the 5’ end of putative hairpin structures, are methylated by NSUN2. Type II m5C sites contain a downstream UCCA motif and are computationally predicted to be located in the loops of putative hairpin structures. However, their biogenesis remains unknown. Here we identified NSUN6, a methyltransferase that is known to methylate C72 of tRNAThr and tRNACys, as an mRNA methyltransferase that targets Type II m5C sites. Combining RNA secondary structure prediction, miCLIP, and results from a high-throughput mutagenesis analysis, we determined the RNA sequence and structural features governing the specificity of NSUN6-mediated mRNA methylation. Integrating these features into an NSUN6-RNA structural model, we identified an NSUN6 variant that largely loses tRNA methylation but retains mRNA methylation ability. Finally, we revealed a negative correlation between m5C methylation and translation efficiency. Our findings reveal that mRNA m5C is tightly controlled by an elaborate two-enzyme system, and the protein-RNA structure analysis strategy established here may be applied to other RNA modification writers to distinguish the functions of different RNA substrates of a writer protein.


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e8408
Author(s):  
Akhila Melarkode Vattekatte ◽  
Nicolas Ken Shinada ◽  
Tarun J. Narwani ◽  
Floriane Noël ◽  
Olivier Bertrand ◽  
...  

Antigen binding by antibodies requires precise orientation of the complementarity-determining region (CDR) loops in the variable domain to establish the correct contact surface. Members of the family Camelidae have a modified form of immunoglobulin gamma (IgG) with only heavy chains, called Heavy Chain only Antibodies (HCAb). Antigen binding in HCAbs is mediated by only three CDR loops from the single variable domain (VHH) at the N-terminus of each heavy chain. This feature of the VHH, along with other important features such as easy expression, small size, thermostability and hydrophilicity, makes them promising candidates for therapeutics and diagnostics. Thus, to design better VHH domains, it is important to thoroughly understand their sequence and structure characteristics and the relationship between them. In this study, the sequence characteristics of VHH domains are analysed in depth, along with their structural features, using innovative approaches, namely a structural alphabet. A detailed summary of various studies proposing structural models of VHH domains shows diversity in the algorithms used. Finally, a case study elucidating the differences between structural models built from single and multiple templates is presented. In this case study, along with the above-mentioned aspects of VHH, various factors in the structure prediction of VHH, such as template framework selection, are also discussed.

