DeepSVM-fold: protein fold recognition by combining support vector machines and pairwise sequence similarity scores generated by deep learning networks

2019 ◽  
Vol 21 (5) ◽  
pp. 1733-1741 ◽  
Author(s):  
Bin Liu ◽  
Chen-Chen Li ◽  
Ke Yan

Abstract Protein fold recognition is critical for studying the structures and functions of proteins. The existing protein fold recognition approaches failed to efficiently calculate the pairwise sequence similarity scores of the proteins in the same fold sharing low sequence similarities. Furthermore, the existing feature vectorization strategies are not able to measure the global relationships among proteins from different protein folds. In this article, we proposed a new computational predictor called DeepSVM-fold for protein fold recognition by introducing a new feature vector based on the pairwise sequence similarity scores calculated from the fold-specific features extracted by deep learning networks. The feature vectors are then fed into a support vector machine to construct the predictor. Experimental results on the benchmark dataset (LE) show that DeepSVM-fold obviously outperforms all the other competing methods.

2015 ◽  
Vol 5 (1) ◽  
Author(s):  
Taeho Jo ◽  
Jie Hou ◽  
Jesse Eickholt ◽  
Jianlin Cheng

2019 ◽  
Author(s):  
Mohammad Saleh Refahi ◽  
A. Mir ◽  
Jalal A. Nasiri

AbstractProtein fold recognition plays a crucial role in discovering three-dimensional structure of proteins and protein functions. Several approaches have been employed for the prediction of protein folds. Some of these approaches are based on extracting features from protein sequences and using a strong classifier. Feature extraction techniques generally utilize syntactical-based information, evolutionary-based information and physiochemical-based information to extract features. In recent years, Finding an efficient technique for integrating discriminate features have been received advancing attention. In this study, we integrate Auto-Cross-Covariance (ACC) and Separated dimer (SD) evolutionary feature extraction methods. The results features are scored by Information gain (IG) to define and select several discriminated features. According to three benchmark datasets, DD, RDD and EDD, the results of the support vector machine (SVM) show more than 6% improvement in accuracy on these benchmark datasets.


2019 ◽  
Vol 21 (6) ◽  
pp. 2133-2141 ◽  
Author(s):  
Chen-Chen Li ◽  
Bin Liu

Abstract Protein fold recognition is one of the most critical tasks to explore the structures and functions of the proteins based on their primary sequence information. The existing protein fold recognition approaches rely on features reflecting the characteristics of protein folds. However, the feature extraction methods are still the bottleneck of the performance improvement of these methods. In this paper, we proposed two new feature extraction methods called MotifCNN and MotifDCNN to extract more discriminative fold-specific features based on structural motif kernels to construct the motif-based convolutional neural networks (CNNs). The pairwise sequence similarity scores calculated based on fold-specific features are then fed into support vector machines to construct the predictor for fold recognition, and a predictor called MotifCNN-fold has been proposed. Experimental results on the benchmark dataset showed that MotifCNN-fold obviously outperformed all the other competing methods. In particular, the fold-specific features extracted by MotifCNN and MotifDCNN are more discriminative than the fold-specific features extracted by other deep learning techniques, indicating that incorporating the structural motifs into the CNN is able to capture the characteristics of protein folds.


2020 ◽  
Vol 21 (S6) ◽  
Author(s):  
Wessam Elhefnawy ◽  
Min Li ◽  
Jianxin Wang ◽  
Yaohang Li

Abstract Background One of the most essential problems in structural bioinformatics is protein fold recognition. In this paper, we design a novel deep learning architecture, so-called DeepFrag-k, which identifies fold discriminative features at fragment level to improve the accuracy of protein fold recognition. DeepFrag-k is composed of two stages: the first stage employs a multi-modal Deep Belief Network (DBN) to predict the potential structural fragments given a sequence, represented as a fragment vector, and then the second stage uses a deep convolutional neural network (CNN) to classify the fragment vector into the corresponding fold. Results Our results show that DeepFrag-k yields 92.98% accuracy in predicting the top-100 most popular fragments, which can be used to generate discriminative fragment feature vectors to improve protein fold recognition. Conclusions There is a set of fragments that can serve as structural “keywords” distinguishing between major protein folds. The deep learning architecture in DeepFrag-k is able to accurately identify these fragments as structure features to improve protein fold recognition.


2010 ◽  
pp. no-no
Author(s):  
Wiesław Chmielnicki ◽  
Irena Roterman-Konieczna ◽  
Katarzyna Stąpor

Sign in / Sign up

Export Citation Format

Share Document