A deep dilated convolutional residual network for predicting interchain contacts of protein homodimers

2021 ◽  
Author(s):  
Raj Shekhor Roy ◽  
Farhan Quadir ◽  
Elham Soltanikazemi ◽  
Jianlin Cheng

Deep learning has revolutionized protein tertiary structure prediction recently. The cutting-edge deep learning methods such as AlphaFold can predict high-accuracy tertiary structures for most individual protein chains. However, the accuracy of predicting quaternary structures of protein complexes consisting of multiple chains is still relatively low due to lack of advanced deep learning methods in the field. Because interchain residue-residue contacts can be used as distance restraints to guide quaternary structure modeling, here we develop a deep dilated convolutional residual network method (DRCon) to predict interchain residue-residue contacts in homodimers from residue-residue co-evolutionary signals derived from multiple sequence alignments of monomers, intrachain residue-residue contacts of monomers extracted from true/predicted tertiary structures or predicted by deep learning, and other sequence and structural features. Tested on three homodimer test datasets (Homo_std dataset, DeepHomo dataset, and CASP14-CAPRI dataset), the precision of DRCon for top L/5 interchain contact predictions (L: length of monomer in a homodimer) is 43.46%, 47.15%, and 24.81% respectively, which is substantially better than two existing deep learning interchain contact prediction methods. Moreover, our experiments demonstrate that using predicted tertiary structure or intrachain contacts of monomers in the unbound state as input, DRCon still performs reasonably well, even though its accuracy is lower than when true tertiary structures in the bound state are used as input. Finally, our case study shows that good interchain contact predictions can be used to build high-accuracy quaternary structure models of homodimers.
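The top L/5 precision quoted above can be illustrated with a minimal sketch. It assumes predictions arrive as a dictionary mapping an interchain residue pair (i in chain A, j in chain B) to a score. The function name `top_k_precision` and the toy data are hypothetical and are not taken from DRCon.

```python
def top_k_precision(scores, true_contacts, k):
    """Fraction of the k highest-scoring residue pairs that are true contacts."""
    # Rank residue pairs by predicted score, highest first, and keep the top k.
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    if not ranked:
        return 0.0
    hits = sum(1 for pair in ranked if pair in true_contacts)
    return hits / len(ranked)


# Toy homodimer (hypothetical data): monomer length L = 10, so top L/5 = 2.
L = 10
scores = {(1, 8): 0.95, (2, 7): 0.90, (3, 3): 0.40, (4, 9): 0.10}
true_contacts = {(1, 8), (3, 3)}
precision = top_k_precision(scores, true_contacts, L // 5)
print(f"top L/5 precision: {precision:.2f}")
```

Only the two highest-scoring pairs are evaluated here, so a true contact that is ranked lower, such as (3, 3), does not count toward the score.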

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Farhan Quadir ◽  
Raj S. Roy ◽  
Randal Halfmann ◽  
Jianlin Cheng

Abstract Deep learning methods that achieved great success in predicting intrachain residue-residue contacts have been applied to predict interchain contacts between proteins. However, these methods require multiple sequence alignments (MSAs) of a pair of interacting proteins (dimers) as input, which are often difficult to obtain because there are not many known protein complexes available to generate MSAs of sufficient depth for a pair of proteins. In recognizing that multiple sequence alignments of a monomer that forms homomultimers contain the co-evolutionary signals of both intrachain and interchain residue pairs in contact, we applied DNCON2 (a deep learning-based protein intrachain residue-residue contact predictor) to predict both intrachain and interchain contacts for homomultimers using multiple sequence alignment (MSA) and other co-evolutionary features of a single monomer followed by discrimination of interchain and intrachain contacts according to the tertiary structure of the monomer. We name this tool DNCON2_Inter. Allowing true-positive predictions within two residue shifts, the best average precision was obtained for the Top-L/10 predictions of 22.9% for homodimers and 17.0% for higher-order homomultimers. In some instances, especially where interchain contact densities are high, DNCON2_Inter predicted interchain contacts with 100% precision. We also developed Con_Complex, a complex structure reconstruction tool that uses predicted contacts to produce the structure of the complex. Using Con_Complex, we show that the predicted contacts can be used to accurately construct the structure of some complexes. Our experiment demonstrates that monomeric multiple sequence alignments can be used with deep learning to predict interchain contacts of homomeric proteins.
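The discrimination step described above can be sketched as follows. This is an illustration under stated assumptions, not the DNCON2_Inter code: one representative coordinate per residue, an 8 Å distance cutoff, and the hypothetical helper names `intrachain_contacts` and `interchain_candidates`. Predicted pairs that are already in contact within the monomer are treated as intrachain, and the rest are kept as interchain candidates.

```python
import math


def intrachain_contacts(coords, threshold=8.0):
    """Residue pairs (i < j) closer than `threshold` angstroms in the monomer."""
    contacts = set()
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) < threshold:
                contacts.add((i, j))
    return contacts


def interchain_candidates(predicted, coords, threshold=8.0):
    """Keep predicted pairs that are NOT explained by the monomer structure."""
    intra = intrachain_contacts(coords, threshold)
    return [(i, j) for (i, j) in predicted
            if (min(i, j), max(i, j)) not in intra]


# Toy monomer (hypothetical): 10 residues on a straight line, 3.8 angstroms apart.
coords = [(3.8 * i, 0.0, 0.0) for i in range(10)]
predicted = [(0, 2), (0, 9), (3, 4), (1, 7)]
candidates = interchain_candidates(predicted, coords)
print(candidates)
```

In this toy chain, pairs up to two residues apart fall under the cutoff, so (0, 2) and (3, 4) are discarded as intrachain and the remaining two pairs survive.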


2020 ◽  
Author(s):  
Yumeng Yan ◽  
Sheng-You Huang

Abstract Protein-protein interactions play a fundamental role in all cellular processes. Therefore, determining the structure of protein-protein complexes is crucial to understanding their molecular mechanisms and developing drugs that target protein-protein interactions. Recently, deep learning has led to a breakthrough in intra-protein contact prediction, achieving unusually high accuracy in recent CASP structure prediction challenges. However, due to the limited number of known homologous protein-protein interactions and the challenge of generating joint multiple sequence alignments (MSAs) of two interacting proteins, advances in inter-protein contact prediction remain limited. Here, we propose a deep learning model, named DeepHomo, to predict inter-protein residue-residue contacts across homo-oligomeric protein interfaces by integrating evolutionary coupling, sequence conservation, distance map, docking pattern, and physicochemical information of monomers. DeepHomo was extensively tested on both experimentally determined structures and realistic CASP-CAPRI targets. DeepHomo achieved a high accuracy of >60% for the top predicted contact and outperformed state-of-the-art direct-coupling analysis (DCA) and machine learning (ML)-based approaches. Integrating predicted contacts into protein docking with blindly predicted monomer structures also significantly improved the docking accuracy. The present study demonstrates the success of DeepHomo in inter-protein contact prediction. It is anticipated that DeepHomo will have far-reaching implications for inter-protein contact and structure prediction for protein-protein interactions.


2020 ◽  
Author(s):  
Farhan Quadir ◽  
Raj Roy ◽  
Randal Halfmann ◽  
Jianlin Cheng

Abstract Deep learning methods that achieved great success in predicting intrachain residue-residue contacts have been applied to predict interchain contacts between proteins. However, these methods require multiple sequence alignments (MSAs) of a pair of interacting proteins (dimers) as input, which are often difficult to obtain because there are not many known protein complexes available to generate MSAs of sufficient depth for a pair of proteins. In recognizing that multiple sequence alignments of a monomer that forms homomultimers contain the co-evolutionary signals of both intrachain and interchain residue pairs in contact, we applied DNCON2 (a deep learning-based protein intrachain residue-residue contact predictor) to predict both intrachain and interchain contacts for homomultimers using multiple sequence alignment (MSA) and other co-evolutionary features of a single monomer followed by discrimination of interchain and intrachain contacts according to the tertiary structure of the monomer. Allowing true-positive predictions within two residue shifts, the best average precision was obtained for the Top-L/10 predictions of DNCON2: 22.9% for homodimers, and 17.0% for higher order homomultimers. In some instances, especially where interchain contact densities are high, the approach predicted interchain contacts with 100% precision. We show that the predicted contacts can be used to accurately construct the structure of some complexes. Our experiment demonstrates that monomeric multiple sequence alignments can be used with deep learning to predict interchain contacts of homomeric proteins.
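One way to read "allowing true-positive predictions within two residue shifts" is sketched below. The abstract does not spell out the exact rule, so this is an assumption: a prediction counts as a hit when a true contact lies within ±2 positions of it on each residue index. The function name `relaxed_precision` and the data are hypothetical.

```python
def relaxed_precision(predicted, true_contacts, shift=2):
    """Precision where a prediction counts if a true contact is within `shift`."""

    def is_hit(i, j):
        # Check every neighbouring pair within +/- shift on both indices.
        return any((i + di, j + dj) in true_contacts
                   for di in range(-shift, shift + 1)
                   for dj in range(-shift, shift + 1))

    if not predicted:
        return 0.0
    return sum(is_hit(i, j) for i, j in predicted) / len(predicted)


# Toy example (hypothetical data): a single true interchain contact.
true_contacts = {(10, 50)}
predicted = [(10, 50), (12, 48), (13, 50), (30, 30)]
relaxed = relaxed_precision(predicted, true_contacts, shift=2)
strict = relaxed_precision(predicted, true_contacts, shift=0)
print(f"relaxed: {relaxed:.2f}, strict: {strict:.2f}")
```

The pair (12, 48) is rescued by the tolerance, while (13, 50) is three positions away and still counts as a miss, which shows why relaxed precision is always at least as high as strict precision.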


2021 ◽  
Author(s):  
Farhan Quadir ◽  
Raj Roy ◽  
Randal Halfmann ◽  
Jianlin Cheng

Abstract Deep learning methods that achieved great success in predicting intrachain residue-residue contacts have been applied to predict interchain contacts between proteins. However, these methods require multiple sequence alignments (MSAs) of a pair of interacting proteins (dimers) as input, which are often difficult to obtain because there are not many known protein complexes available to generate MSAs of sufficient depth for a pair of proteins. In recognizing that multiple sequence alignments of a monomer that forms homomultimers contain the co-evolutionary signals of both intrachain and interchain residue pairs in contact, we applied DNCON2 (a deep learning-based protein intrachain residue-residue contact predictor) to predict both intrachain and interchain contacts for homomultimers using multiple sequence alignment (MSA) and other co-evolutionary features of a single monomer followed by discrimination of interchain and intrachain contacts according to the tertiary structure of the monomer. We name this tool DNCON2_Inter. Allowing true-positive predictions within two residue shifts, the best average precision was obtained for the Top-L/10 predictions of 22.9% for homodimers, and 17.0% for higher order homomultimers. In some instances, especially where interchain contact densities are high, DNCON2_Inter predicted interchain contacts with 100% precision. We also developed Con_Complex, a complex structure reconstruction tool that uses predicted contacts to produce the structure of the complex. Using Con_Complex, we show that the predicted contacts can be used to accurately construct the structure of some complexes. Our experiment demonstrates that monomeric multiple sequence alignments can be used with deep learning to predict interchain contacts of homomeric proteins.


2019 ◽  
Vol 14 (3) ◽  
pp. 178-189 ◽  
Author(s):  
Xiaoyang Jing ◽  
Qimin Dong ◽  
Ruqian Lu ◽  
Qiwen Dong

Background: Protein inter-residue contact prediction plays an important role in protein structure and function research. As a low-dimensional representation of protein tertiary structure, protein inter-residue contacts can greatly help de novo protein structure prediction methods reduce the conformational search space. Over the past two decades, various methods have been developed for protein inter-residue contact prediction. Objective: We provide a comprehensive and systematic review of protein inter-residue contact prediction methods. Results: Protein inter-residue contact prediction methods are roughly classified into five categories: correlated mutations methods, machine-learning methods, fusion methods, template-based methods and 3D model-based methods. In this paper, we first describe the common definition of protein inter-residue contacts and show their typical applications. Then, we present a comprehensive review of the three main categories of protein inter-residue contact prediction: correlated mutations methods, machine-learning methods and fusion methods. We also analyze the constraints of each category. Furthermore, we compare several representative methods on the CASP11 dataset and discuss their performance in detail. Conclusion: Correlated mutations methods achieve better performance for long-range contacts, while machine-learning methods perform well for short-range contacts. Fusion methods can take advantage of both machine-learning and correlated mutations methods. Employing more effective fusion strategies could help further improve the performance of fusion methods.
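The contact definition and the short-range/long-range distinction mentioned above can be sketched as follows. The thresholds are assumptions based on the convention commonly used in CASP evaluations (contact if the distance is below 8 Å; sequence separation 6-11 short, 12-23 medium, 24 or more long), not values quoted from this review, and the function names are hypothetical.

```python
def is_contact(distance, threshold=8.0):
    """A residue pair is in contact if its distance is below the threshold."""
    return distance < threshold


def contact_range(i, j):
    """Classify a residue pair by sequence separation |i - j|."""
    separation = abs(i - j)
    if separation >= 24:
        return "long"
    if separation >= 12:
        return "medium"
    if separation >= 6:
        return "short"
    return "local"  # usually excluded from evaluation


# Toy pairs (hypothetical): (residue i, residue j, distance in angstroms).
pairs = [(3, 10, 6.5), (1, 20, 7.9), (5, 40, 9.2), (5, 45, 5.1)]
by_range = {}
for i, j, distance in pairs:
    if is_contact(distance):
        by_range.setdefault(contact_range(i, j), []).append((i, j))
print(by_range)
```

Grouping contacts this way is what allows a comparison such as the one in the conclusion, where one family of methods is stronger on long-range pairs and another on short-range pairs.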


2021 ◽  
Vol 25 (3) ◽  
pp. 31-35
Author(s):  
Piotr Więcek ◽  
Dominik Sankowski

The article presents a new algorithm for increasing the resolution of thermal images. For this purpose, the residual network was integrated with the Kernel-Sharing Atrous Convolution (KSAC) image sub-sampling module. A significant reduction in the algorithm’s complexity and shortening the execution time while maintaining high accuracy were achieved. The neural network has been implemented in the PyTorch environment. The results of the proposed new method of increasing the resolution of thermal images with sizes 32 × 24, 160 × 120 and 640 × 480 for scales up to 6 are presented.
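The idea behind kernel-sharing atrous convolution can be sketched in one dimension: a single kernel is reused at several dilation rates and the branch outputs are summed. This is a simplified pure-Python illustration under that assumption, not the authors' PyTorch network, and the function names and rates are hypothetical.

```python
def atrous_conv1d(signal, kernel, rate):
    """Zero-padded 'same' 1D cross-correlation with dilation `rate`."""
    half = len(kernel) // 2
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            # Kernel taps are spaced `rate` samples apart around position t.
            idx = t + (k - half) * rate
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


def kernel_sharing_atrous(signal, kernel, rates=(1, 2)):
    """Apply the SAME kernel at every rate and sum the branch outputs."""
    branches = [atrous_conv1d(signal, kernel, rate) for rate in rates]
    return [sum(values) for values in zip(*branches)]


# Toy input (hypothetical): a unit impulse at position 4.
signal = [0, 0, 0, 0, 1, 0, 0, 0, 0]
kernel = [1, 2, 3]
result = kernel_sharing_atrous(signal, kernel, rates=(1, 2))
print(result)
```

Because the branches share weights, adding a dilation rate widens the receptive field without adding parameters, which is consistent with the reduction in complexity the article reports.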


2017 ◽  
Author(s):  
Piyush Agrawal ◽  
Sandeep Singh ◽  
Gandharva Nagpal ◽  
Deepti Sethi ◽  
Gajendra P.S. Raghava

Abstract One of the challenges in the field of structural proteomics is to predict residue-residue contacts in a protein. It is an integral part of CASP competitions due to its importance in the field of structural biology. This manuscript describes RRCPred 2.0, a method that participated in CASP12 and predicted residue-residue contacts in targets with high precision. In this approach, first, 150 predicted protein structures were obtained from the CASP12 Stage 2 tarball and ranked using clustering-based quality assessment software. Second, residue-residue contacts were assigned in the top 10 protein structures based on the distance between residues. Finally, residue-residue contacts were predicted in the target protein based on the consensus/average over the top 10 predicted structures. This simple approach performs better than most CASP12 methods in the categories of TBM and TBM/FM. It ranked 1st in the following categories: i) TBM domain on list size L/5, ii) TBM/FM domain on list size L/5 and iii) TBM/FM domain on Top 10. These observations indicate that the predicted tertiary structure of a protein can be used to predict residue-residue contacts in the protein with high accuracy.
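The consensus step described above can be sketched as follows. This is an illustration, not the RRCPred 2.0 code: it assumes one coordinate per residue, an 8 Å cutoff, and a hypothetical `min_fraction` parameter for how many models must agree. A real pipeline would also skip sequence-adjacent residues, which this toy example does not.

```python
import math


def contact_set(coords, threshold=8.0):
    """Residue pairs (i < j) whose distance is below `threshold` angstroms."""
    return {(i, j)
            for i in range(len(coords))
            for j in range(i + 1, len(coords))
            if math.dist(coords[i], coords[j]) < threshold}


def consensus_contacts(models, min_fraction=0.5):
    """Fraction of models containing each contact, kept if >= min_fraction."""
    counts = {}
    for coords in models:
        for pair in contact_set(coords):
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: n / len(models)
            for pair, n in counts.items()
            if n / len(models) >= min_fraction}


# Three toy models (hypothetical) of a 3-residue chain.
models = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (12.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (7.0, 0.0, 0.0), (30.0, 0.0, 0.0)],
]
consensus = consensus_contacts(models)
print(consensus)
```

The pair (1, 2) appears in only one of the three models and is dropped, while (0, 1) is present in all of them and is kept with a frequency of 1.0.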


2021 ◽  
Vol 944 (1) ◽  
pp. 012009
Author(s):  
I Ayuningtias ◽  
I Jaya ◽  
M Iqbal

Abstract Yellowfin tuna (Thunnus albacares), mackerel tuna (Euthynnus affinis), and skipjack tuna (Katsuwonus pelamis) have important economic values for the capture fisheries in Indonesia. Activities of identifying these fish and other types of tuna have been done manually, which can lead to errors and ultimately affect statistics, stock estimates, or traceability. The aim of this research is to use deep learning methods in identifying three species of tuna, specifically yellowfin tuna, mackerel tuna, and skipjack tuna. YOLO’s newest model, YOLOv5, was used to identify the fish. The number of epochs that produces the optimum accuracy value for use in the YOLOv5 model is 400. The values for training loss, accuracy, precision, recall and F1-Score when the model is learning with a total of 400 epochs are 0.000253, 95%, 98.1%, 93.9%, and 96%. Based on these results, the three species of tuna can be identified with high accuracy.
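As a quick consistency check, the reported F1-Score can be recomputed from the reported precision and recall using the standard harmonic-mean formula. The variable names are illustrative.

```python
# Reported values from the abstract, expressed as fractions.
precision = 0.981
recall = 0.939

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.3f}")
```

The result rounds to 0.96, which agrees with the 96% stated in the abstract.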


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Haicang Zhang ◽  
Qi Zhang ◽  
Fusong Ju ◽  
Jianwei Zhu ◽  
Yujuan Gao ◽  
...  

Abstract Background: Accurate prediction of inter-residue contacts of a protein is important to calculating its tertiary structure. Analysis of co-evolutionary events among residues has been proved effective in inferring inter-residue contacts. The Markov random field (MRF) technique, although being widely used for contact prediction, suffers from the following dilemma: the actual likelihood function of MRF is accurate but time-consuming to calculate; in contrast, approximations to the actual likelihood, say pseudo-likelihood, are efficient to calculate but inaccurate. Thus, how to achieve both accuracy and efficiency simultaneously remains a challenge. Results: In this study, we present such an approach (called clmDCA) for contact prediction. Unlike plmDCA using pseudo-likelihood, i.e., the product of conditional probability of individual residues, our approach uses composite-likelihood, i.e., the product of conditional probability of all residue pairs. Composite likelihood has been theoretically proved as a better approximation to the actual likelihood function than pseudo-likelihood. Meanwhile, composite likelihood is still efficient to maximize, thus ensuring the efficiency of clmDCA. We present comprehensive experiments on popular benchmark datasets, including PSICOV dataset and CASP-11 dataset, to show that: i) clmDCA alone outperforms the existing MRF-based approaches in prediction accuracy. ii) When equipped with deep learning technique for refinement, the prediction accuracy of clmDCA was further significantly improved, suggesting the suitability of clmDCA for subsequent refinement procedure. We further present a successful application of the predicted contacts to accurately build tertiary structures for proteins in the PSICOV dataset. Conclusions: Composite likelihood maximization algorithm can efficiently estimate the parameters of Markov Random Fields and can improve the prediction accuracy of protein inter-residue contacts.
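The two approximations contrasted above can be written out for a single aligned sequence $x = (x_1, \dots, x_L)$ with MRF parameters $\theta$. The notation is an assumption chosen for illustration and may differ from the paper's: $x_{\setminus i}$ denotes all residues except position $i$, and $x_{\setminus \{i,j\}}$ all residues except positions $i$ and $j$.

```latex
% Pseudo-likelihood (plmDCA): product over individual residues
\mathrm{PL}(\theta) = \prod_{i=1}^{L} P\!\left(x_i \mid x_{\setminus i};\ \theta\right)

% Composite likelihood (clmDCA): product over all residue pairs
\mathrm{CL}(\theta) = \prod_{1 \le i < j \le L} P\!\left(x_i, x_j \mid x_{\setminus \{i,j\}};\ \theta\right)
```

Each pairwise term needs normalization over only the joint states of two positions rather than over all sequences, which is why the composite likelihood remains tractable to maximize.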


2019 ◽  
Author(s):  
Claudio Bassot ◽  
Arne Elofsson

Abstract Repeat proteins are an abundant class in eukaryotic proteomes. They are involved in many eukaryote-specific functions, including signalling. For many of these families, the structure is not known. Recently, it has been shown that the structure of many protein families can be predicted by using contact predictions from direct coupling analysis (DCA) and deep learning. However, the unique sequence features of repeat proteins are a challenge for DCA-based contact prediction methods. Here, we show that the deep learning-based PconsC4 is more effective for predicting both intra- and interunit contacts among a comprehensive set of repeat proteins. In a benchmark dataset of 819 repeat proteins, about one third can be correctly modelled, and among 51 PFAM families lacking a protein structure, we produce models of five families with estimated high accuracy. Author Summary: Repeat proteins are widespread among organisms and particularly abundant in eukaryotic proteomes. Their primary sequences contain repeated amino acid segments that give rise to structures with repeated folds/domains. Although the repeated units are easy to recognize in the primary sequence, structural information is often missing. Here we used contact prediction to predict the structure of repeat proteins directly from their primary sequences. We benchmark our method on a dataset comprising all known repeat structures. We evaluate the contact predictions and the resulting models for different classes of proteins and different target lengths, and we benchmark the quality assessment of the models on repeat proteins. Finally, we applied the method to the repeat PFAM families that lack resolved structures, five of which were modelled with high accuracy.

