A Classification-based Approach to Prediction of Dengue Virus and Human Protein-Protein Interactions using Amino Acid Composition and Conjoint Triad Features

Author(s):  
Lopamudra Dey ◽  
Anirban Mukhopadhyay
PLoS ONE ◽  
2009 ◽  
Vol 4 (11) ◽  
pp. e7813 ◽  
Author(s):  
Sushmita Roy ◽  
Diego Martinez ◽  
Harriett Platero ◽  
Terran Lane ◽  
Margaret Werner-Washburne

2020 ◽  
Author(s):  
Bin Yu ◽  
Cheng Chen ◽  
Zhaomin Yu ◽  
Anjun Ma ◽  
Bingqiang Liu ◽  
...  

Abstract: Prediction of protein-protein interactions (PPIs) helps to reveal the molecular roots of disease. However, wet-lab experiments for identifying PPIs are limited in coverage and costly. Machine-learning-based frameworks offer a promising alternative: they can identify PPIs automatically and provide new ideas for drug research and development. We present a novel deep-forest-based method for PPI prediction. First, pseudo amino acid composition (PAAC), autocorrelation descriptor (Auto), multivariate mutual information (MMI), composition-transition-distribution (CTD), amino acid composition PSSM (AAC-PSSM), and dipeptide composition PSSM (DPC-PSSM) are used to extract features describing PPIs. Second, elastic net is used to optimize the initial feature vectors and boost predictive performance. Finally, the GcForest-PPI model is built on deep forest. Benchmark experiments show accuracies of 95.44% and 89.26% on the Saccharomyces cerevisiae and Helicobacter pylori datasets, respectively. We also apply GcForest-PPI to independent test sets and to the CD9-core network, a crossover network, and a cancer-specific network. The evaluation shows that GcForest-PPI can improve prediction accuracy, complement experiments, and support drug discovery. The datasets and code of GcForest-PPI can be downloaded at https://github.com/QUST-AIBBDRC/GcForest-PPI/.
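The two PSSM-based descriptors named in the abstract can be illustrated with a minimal sketch. It assumes a position-specific scoring matrix has already been computed elsewhere and is supplied as a list of rows (one row per residue, one column per amino acid type). The formulas are the commonly used definitions (column means for AAC-PSSM, averaged products of consecutive rows for DPC-PSSM) and may differ in detail from the GcForest-PPI code; the function names `aac_pssm` and `dpc_pssm` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]  # L rows (residues) x 20 columns (amino acid types)


def aac_pssm(pssm: Matrix) -> List[float]:
    """AAC-PSSM sketch: mean of each PSSM column (20 values for 20 columns)."""
    if not pssm:
        raise ValueError("PSSM must contain at least one row")
    length = len(pssm)
    n_cols = len(pssm[0])
    return [sum(row[j] for row in pssm) / length for j in range(n_cols)]


def dpc_pssm(pssm: Matrix) -> List[float]:
    """DPC-PSSM sketch: for every column pair (i, j), average the product of
    column i at position k and column j at position k + 1 (400 values for
    20 columns)."""
    length = len(pssm)
    if length < 2:
        raise ValueError("PSSM must contain at least two rows")
    n_cols = len(pssm[0])
    features = []
    for i in range(n_cols):
        for j in range(n_cols):
            total = sum(pssm[k][i] * pssm[k + 1][j] for k in range(length - 1))
            features.append(total / (length - 1))
    return features


# Toy 3 x 2 matrix for illustration only; a real PSSM has 20 columns.
toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
pair_features = aac_pssm(toy) + dpc_pssm(toy)
```

For an interacting pair, the per-protein vectors would typically be concatenated before feature selection; that step is not shown here.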


2020 ◽  
Author(s):  
Lopamudra Dey ◽  
Sanjay Chakraborty ◽  
Anirban Mukhopadhyay

COVID-19 (Coronavirus Disease-19), a disease caused by the SARS-CoV-2 virus, was declared a pandemic by the World Health Organization on March 11, 2020. Over 4.3 million people in more than 200 countries have already been affected by this deadly virus, resulting in almost 0.3 million deaths. Protein-protein interactions (PPIs) play a key role in the cellular process of SARS-CoV-2 infection in the human body. A recent study reported some SARS-CoV-2 proteins that interact with a number of human proteins, but many potential interactions remain to be identified. Because human cells contain a large number of proteins, it is not possible to check all possible combinations of interactions experimentally. This has led to the development of various computational methods that predict PPIs between virus and human proteins, with the predictions then validated by biological experiments. This paper presents a prediction model that combines different sequence-based features of human proteins: the amino acid composition, the pseudo amino acid composition, and the conjoint triad. We have built an ensemble voting classifier using $SVM^{Radial}$, $SVM^{Polynomial}$, and Random Forest, which gives greater accuracy, precision, specificity, recall, and F1 score than all other models used in the work. We have predicted 1326 potential human target proteins using this weighted ensemble classifier. Furthermore, the Gene Ontology (GO) and KEGG pathway enrichments of these predicted human proteins are investigated. This study may encourage the identification of potential targets for more effective anti-COVID drug discovery.
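Two of the sequence-based features named in the abstract above, amino acid composition and the conjoint triad, can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the seven residue classes and the (f - min) / max scaling follow the grouping commonly used for conjoint triad features, non-standard residues are silently skipped, and the function names are hypothetical.

```python
from itertools import product
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed seven-class grouping commonly used for conjoint triad features.
TRIAD_CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
CLASS_OF = {aa: str(idx) for idx, group in enumerate(TRIAD_CLASSES) for aa in group}
TRIADS = ["".join(t) for t in product("0123456", repeat=3)]  # 7^3 = 343 triads


def amino_acid_composition(seq: str) -> List[float]:
    """Fraction of each of the 20 standard amino acids in the sequence."""
    residues = [aa for aa in seq.upper() if aa in CLASS_OF]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return [residues.count(aa) / len(residues) for aa in AMINO_ACIDS]


def conjoint_triad(seq: str) -> List[float]:
    """343-dimensional vector of scaled counts of consecutive class triads."""
    classes = [CLASS_OF[aa] for aa in seq.upper() if aa in CLASS_OF]
    counts = {t: 0 for t in TRIADS}
    for k in range(len(classes) - 2):
        counts["".join(classes[k:k + 3])] += 1
    values = [counts[t] for t in TRIADS]
    lowest, highest = min(values), max(values)
    if highest == 0:  # sequence shorter than three residues
        return [0.0] * len(TRIADS)
    return [(v - lowest) / highest for v in values]


# Hypothetical toy sequence; real inputs would be full human protein sequences.
features = amino_acid_composition("AGVAGV") + conjoint_triad("AGVAGV")
```

The concatenated vector (20 + 343 values here) is the kind of input that a voting ensemble of SVM and Random Forest classifiers would be trained on; the classifier itself is not shown.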

