Protein Function Prediction with Deep Neural Learning

Author(s):  
Zihao Zhao ◽  
Hongwei Zhang ◽  
Minglei Hu ◽  
Ning Yang ◽  
Hui Wang ◽  
...  

Abstract

Background: Protein function is directly related to protein structure and plays a pivotal role in the entire life process. The protein interaction network controls almost all biological cell processes while fulfilling most biological functions. Protein function prediction can be regarded as a multi-label classification problem that aims to close the gap between the huge number of protein sequences and the known functions. It is both a key issue in related research fields and a long-standing challenge. Protein function prediction with Deep Neural Networks (DNN) has mostly studied small-scale protein data sets based on Gene Ontology (GO), and usually mines relationships between protein features and function tags. Useful prediction approaches for large-scale protein data still need further study.

Methods: This paper proposes a protein function prediction approach, IGP-DNN, that combines the Grasshopper Optimization Algorithm (GOA), Intuitionistic Fuzzy c-Means (IFCM), Kernel Principal Component Analysis (KPCA) and a DNN. Features of protein function modules are extracted by combining GOA and IFCM. KPCA is used to reduce the dimensionality of the protein property features. The two feature sets are integrated to enrich the feature information, and the integrated features are input into the DNN model. The hidden layers of the DNN then classify the protein function modules to predict function.

Results and conclusion: IGP-DNN combines the advantages of IFCM-GOA and DNN. Combining IFCM and GOA avoids falling into local optima when extracting function module features, reduces the over-sensitivity of IFCM to the cluster centers, and improves the precision of protein function module feature extraction. This paper proposes a protein function prediction approach based on DNN. In the model, protein features are composed of the protein function module features extracted with IFCM-GOA and the protein property features whose dimensionality is reduced with KPCA, which addresses noise sensitivity and other problems in protein function prediction.
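The KPCA reduction and feature integration steps described in the Methods can be illustrated with a small sketch. This is not the authors' implementation: the RBF kernel, the `gamma` value, the function names, and the random toy data are all assumptions made for illustration, and the module features here merely stand in for whatever IFCM-GOA would produce.

```python
import numpy as np

def rbf_kernel(X, gamma):
    # Pairwise squared Euclidean distances, then the Gaussian (RBF) kernel.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_pca(X, n_components, gamma=0.5):
    # Kernel PCA on the training points (kernel choice is an assumption).
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    one_n = np.full((n, n), 1.0 / n)
    # Center the kernel matrix in feature space.
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    eigvals, eigvecs = np.linalg.eigh(Kc)
    # eigh returns ascending eigenvalues; keep the largest ones.
    idx = np.argsort(eigvals)[::-1][:n_components]
    lam = np.maximum(eigvals[idx], 1e-12)
    # Projection of training point i on component k is sqrt(lam_k) * v_k[i].
    return eigvecs[:, idx] * np.sqrt(lam)

def integrate_features(module_features, attribute_features, n_components, gamma=0.5):
    # Reduce the attribute features, then concatenate with the module features.
    reduced = kernel_pca(attribute_features, n_components, gamma)
    return np.hstack([module_features, reduced])

# Hypothetical toy data: 6 proteins, 3 module features, 10 attribute features.
rng = np.random.default_rng(0)
module_features = rng.random((6, 3))
attribute_features = rng.random((6, 10))
integrated = integrate_features(module_features, attribute_features, n_components=2)
print(integrated.shape)  # (6, 5)
```

The integrated matrix has one row per protein and would serve as the input to a classifier such as the DNN the abstract describes.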

2022 ◽  
Vol 19 (3) ◽  
pp. 2471-2488
Author(s):  
Wenjun Xu ◽  
Zihao Zhao ◽  
Hongwei Zhang ◽  
Minglei Hu ◽  
...  

Protein function prediction is vital for the annotation of uncharacterized proteins. At present, Deep Neural Network (DNN) based protein function prediction is mainly carried out on datasets of small-scale proteins or Gene Ontology, and usually explores the relationship between a single protein feature and function tags. Practical methods for large-scale, multi-feature protein function prediction still need to be studied in depth. This paper proposes a DNN-based protein function prediction approach, IGP-DNN. The method uses a protein function module extraction algorithm based on the Grasshopper Optimization Algorithm (GOA) and Intuitionistic Fuzzy c-Means clustering (IFCM) to extract the features of protein modules, uses Kernel Principal Component Analysis (KPCA) to reduce the dimensionality of the protein attribute information, and integrates the module features with the attribute features. The integrated data are input into a DNN, which classifies proteins and predicts protein functions through multiple hidden layers. In the experiments, the F-measure of IGP-DNN on the DIP dataset reaches 0.4436, which the authors report as better performance.
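For readers unfamiliar with the F-measure quoted above, here is a minimal sketch of how such a score can be computed for multi-label predictions. The abstract does not say which averaging scheme the authors used, so the micro-averaged variant below is an assumption, and the function name and label names are hypothetical.

```python
def f_measure(true_labels, predicted_labels):
    # Micro-averaged F-measure over per-protein label sets (assumed scheme).
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        tp += len(truth & pred)   # functions correctly predicted
        fp += len(pred - truth)   # predicted but not annotated
        fn += len(truth - pred)   # annotated but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: two proteins with made-up function labels.
truth = [{"f1", "f2"}, {"f3"}]
pred = [{"f1"}, {"f3", "f4"}]
score = f_measure(truth, pred)
```

In this toy case there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and the F-measure is also 2/3.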


2005 ◽  
Vol 15 (04) ◽  
pp. 259-275 ◽  
Author(s):  
ALI AL-SHAHIB ◽  
RAINER BREITLING ◽  
DAVID GILBERT

In the study of in silico functional genomics, improving the performance of protein function prediction is the ultimate goal for identifying proteins associated with defined cellular functions. The classical prediction approach is to employ pairwise sequence alignments. However, this method often faces difficulties when no statistically significant homologous sequences are identified. An alternative is to predict protein function from sequence-derived features using machine learning. In this case, the choice of features that can be derived from the sequence is of vital importance to ensure adequate discrimination for predicting function. In this paper we have successfully selected biologically significant features for protein function prediction. This was performed using a new feature selection method (FrankSum) that avoids data distribution assumptions, uses a data-independent measurement (p-value) within the feature, identifies redundancy between features and uses an appropriate ranking criterion for feature selection. We have shown that classifiers generated from features selected by FrankSum outperform classifiers generated from full feature sets, randomly selected features and features selected by the Wrapper method. We have also shown that the features are concordant across all species and that the top-ranking features are biologically informative. We conclude that feature selection is vital for successful protein function prediction and that FrankSum is one of the feature selection methods that can be applied successfully to such a domain.


Molecules ◽  
2017 ◽  
Vol 22 (10) ◽  
pp. 1732 ◽  
Author(s):  
Renzhi Cao ◽  
Colton Freitas ◽  
Leong Chan ◽  
Miao Sun ◽  
Haiqing Jiang ◽  
...  

2008 ◽  
Vol 9 (1) ◽  
pp. 350 ◽  
Author(s):  
Xiaoyu Jiang ◽  
Naoki Nariai ◽  
Martin Steffen ◽  
Simon Kasif ◽  
Eric D Kolaczyk

Amino Acids ◽  
2008 ◽  
Vol 35 (3) ◽  
pp. 517-530 ◽  
Author(s):  
Xing-Ming Zhao ◽  
Luonan Chen ◽  
Kazuyuki Aihara
