Computational Prediction of lncRNA-Protein Interactions using Machine learning

Author(s):  
Muhammad Mushtaq ◽  
Hammad Naveed ◽  
Zoya Khalid
2021 ◽  
Vol 16 ◽  
Author(s):  
Fee Faysal Ahmed ◽  
Mst Shamima Khatun ◽  
Md. Parvez Mosharaf ◽  
Md. Nurul Haque Mollah

Background: Protein-protein interactions (PPI) play a vital role in a wide range of biological processes, from cell-cell interactions to developmental control, in all organisms. However, experimental identification of PPI is often laborious, time-consuming and costly compared to computational prediction. There are several computational prediction models in the literature based on complete training samples, but none of them deals with partial training samples. Objective: The objective of this work was to develop an effective PPI prediction model for Arabidopsis thaliana using partial training samples in a machine learning framework. Methods: We proposed an effective computational PPI prediction model by combining a random forest (RF) classifier and autocorrelation (AC) sequence encoding features with a 1:2 ratio of positive-PPI and unknown-PPI samples. Results: We observed that the proposed prediction model produces the highest average performance scores of sensitivity (94.62%), AUC (0.92) and pAUC (0.189) with the training datasets and sensitivity (88.14%), AUC (0.89) and pAUC (0.176) with the test datasets of 5-fold cross-validation compared to other candidate predictors based on LDA, LOGI, ADA, NB, KNN and SVM classifiers. It also achieved the highest performance scores of TPR (91.82%) and pAUC (0.174) at FPR = 20% with AUC (0.948) compared to other candidate predictors. Conclusion: Overall performance of the developed model revealed that our proposed predictor might be useful to elucidate the biological function of unseen PPIs from a large number of candidate proteins in Arabidopsis thaliana.
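
The autocorrelation (AC) encoding named in the Methods can be illustrated with a minimal sketch. It assumes one common auto-covariance definition and a single physicochemical property (Kyte-Doolittle hydropathy). The abstract does not state which properties, lags, or normalization the authors used, so the names `autocorrelation_features`, `pair_features`, and `max_lag` are hypothetical and the code is not the authors' implementation.

```python
from statistics import mean, pstdev

# Kyte-Doolittle hydropathy values, used here only as an example property
# (assumption: the abstract does not say which properties were encoded).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def autocorrelation_features(sequence, max_lag=5, scale=HYDROPATHY):
    """Return [AC(1), ..., AC(max_lag)] for one protein sequence.

    Raises KeyError for residues that are not in the property scale.
    """
    # Standardize the property scale across the 20 amino acids.
    mu = mean(scale.values())
    sigma = pstdev(scale.values())
    values = [(scale[aa] - mu) / sigma for aa in sequence]
    n = len(values)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    seq_mean = mean(values)
    features = []
    for lag in range(1, max_lag + 1):
        # Average product of centered property values 'lag' positions apart.
        total = sum(
            (values[i] - seq_mean) * (values[i + lag] - seq_mean)
            for i in range(n - lag)
        )
        features.append(total / (n - lag))
    return features


def pair_features(seq_a, seq_b, max_lag=5):
    """Concatenate the AC vectors of two proteins into one pair descriptor."""
    return (autocorrelation_features(seq_a, max_lag)
            + autocorrelation_features(seq_b, max_lag))
```

Each protein pair then becomes a fixed-length vector regardless of sequence length, which is what allows a standard classifier such as a random forest to be trained on it.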


2017 ◽  
Author(s):  
Khalid Raza

Abstract: The long-awaited challenge of the post-genomic era and systems biology research is the computational prediction of protein-protein interactions (PPIs), which ultimately leads to protein function prediction. The important research questions are how protein complexes with known sequence and structure can be used to identify and classify protein binding sites, and how to infer knowledge from these classifications, such as predicting PPIs of proteins with unknown sequence and structure. Several machine learning techniques have been applied for the prediction of PPIs, but the accuracy of their prediction wholly depends on the number of features being used for training. In this paper, we have performed a survey of protein features used for the prediction of PPIs. The open research challenges and opportunities in the area have also been discussed.


2016 ◽  
Vol 12 (3) ◽  
pp. 778-785 ◽  
Author(s):  
A. Srivastava ◽  
G. Mazzocco ◽  
A. Kel ◽  
L. S. Wyrwicz ◽  
D. Plewczynski

Protein–protein interactions (PPIs) play a vital role in most biological processes.


2020 ◽  
Vol 27 (5) ◽  
pp. 385-391
Author(s):  
Lin Zhong ◽  
Zhong Ming ◽  
Guobo Xie ◽  
Chunlong Fan ◽  
Xue Piao

In recent years, more and more evidence indicates that long non-coding RNA (lncRNA) plays a significant role in the development of complex biological processes, especially in RNA processing, chromatin modification, and cell differentiation, as well as many other processes. Surprisingly, lncRNA has an inseparable relationship with human diseases such as cancer. Therefore, only by knowing more about the function of lncRNA can we better solve the problems of human diseases. However, lncRNAs need to bind to proteins to perform their biomedical functions, so we can reveal lncRNA function by studying the relationship between lncRNA and protein. Because of the limitations of traditional experiments, researchers often use computational prediction models to predict lncRNA-protein interactions. In this review, we summarize several computational models for lncRNA-protein interaction prediction based on semi-supervised learning from the past two years, and briefly introduce their advantages and shortcomings. Finally, the future research directions of lncRNA-protein interaction prediction are pointed out.


2021 ◽  
Vol 22 (5) ◽  
pp. 2704
Author(s):  
Andi Nur Nilamyani ◽  
Firda Nurul Auliah ◽  
Mohammad Ali Moni ◽  
Watshara Shoombuatong ◽  
Md Mehedi Hasan ◽  
...  

Nitrotyrosine, which is generated by numerous reactive nitrogen species, is a type of protein post-translational modification. Identification of site-specific nitration modification on tyrosine is a prerequisite to understanding the molecular function of nitrated proteins. Thanks to the progress of machine learning, computational prediction can play a vital role before biological experimentation. Herein, we developed a computational predictor, PredNTS, by integrating multiple sequence features including K-mer, composition of k-spaced amino acid pairs (CKSAAP), AAindex, and binary encoding schemes. The important features were selected by the recursive feature elimination approach using a random forest classifier. Finally, we linearly combined the successive random forest (RF) probability scores generated by the individual RF models, each of which employs a single encoding. The resultant PredNTS predictor achieved an area under the curve (AUC) of 0.910 using five-fold cross-validation. It outperformed the existing predictors on a comprehensive and independent dataset. Furthermore, we investigated several machine learning algorithms to demonstrate the superiority of the employed RF algorithm. PredNTS is a useful computational resource for the prediction of nitrotyrosine sites. The web application with the curated datasets of PredNTS is publicly available.
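
As an illustration of two steps mentioned above, the following sketch computes a CKSAAP feature vector and linearly combines per-encoding probability scores. It is a generic rendering of the CKSAAP idea, not PredNTS code. The function names, the `k_max` default, and the weights in `combine_scores` are assumptions, since the abstract does not give the gap range or the combination weights.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order so vectors are comparable.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap(sequence, k_max=2):
    """Composition of k-spaced amino acid pairs for gaps 0..k_max.

    Returns 400 * (k_max + 1) frequencies; pairs containing a
    non-standard residue are ignored.
    """
    features = []
    for gap in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        windows = len(sequence) - gap - 1  # number of residue pairs at this gap
        for i in range(max(windows, 0)):
            pair = sequence[i] + sequence[i + gap + 1]
            if pair in counts:
                counts[pair] += 1
        denom = windows if windows > 0 else 1
        features.extend(counts[p] / denom for p in PAIRS)
    return features


def combine_scores(scores, weights):
    """Linearly combine probability scores from single-encoding models.

    The weights are hypothetical; the abstract does not report them.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores))
```

For example, `cksaap("ACAC", k_max=1)` yields 800 values: at gap 0 the pair "AC" occurs in two of three windows, and at gap 1 the pairs "AA" and "CC" each occur in one of two windows.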


2020 ◽  
Vol 48 (10) ◽  
pp. 030006052095880
Author(s):  
Jianping Wu ◽  
Sulai Liu ◽  
Xiaoming Chen ◽  
Hongfei Xu ◽  
Yaoping Tang

Objective: Colorectal cancer (CRC) is the most common cancer worldwide. Patient outcomes following recurrence of CRC are very poor. Therefore, identifying the risk of CRC recurrence at an early stage would improve patient care. Accumulating evidence shows that autophagy plays an active role in tumorigenesis, recurrence, and metastasis. Methods: We used machine learning algorithms and two regression models, univariable Cox proportional hazards and least absolute shrinkage and selection operator (LASSO), to identify 26 autophagy-related genes (ARGs) related to CRC recurrence. Results: By functional annotation, these ARGs were shown to be enriched in necroptosis and apoptosis pathways. Protein–protein interactions identified SQSTM1, CASP8, HSP80AB1, FADD, and MAPK9 as core genes in CRC autophagy. Of 26 ARGs, BAX and PARP1 were regarded as having the most significant predictive ability of CRC recurrence, with prediction accuracy of 71.1%. Conclusion: These results shed light on prediction of CRC recurrence by ARGs. Stratification of patients into recurrence risk groups by testing ARGs would be a valuable tool for early detection of CRC recurrence.


2018 ◽  
Vol 20 (6) ◽  
pp. 2066-2087 ◽  
Author(s):  
Chen Wang ◽  
Lukasz Kurgan

Abstract: Drug–protein interactions (DPIs) underlie the desired therapeutic actions and the adverse side effects of a significant majority of drugs. Computational prediction of DPIs facilitates research in drug discovery, characterization and repurposing. Similarity-based methods that do not require knowledge of protein structures are particularly suitable for druggable genome-wide predictions of DPIs. We review 35 high-impact similarity-based predictors that were published in the past decade. We group them based on the three types of similarities, and their combinations, that they use. We discuss and compare key aspects of these methods including source databases, internal databases and their predictive models. Using our novel benchmark database, we perform comparative empirical analysis of predictive performance of seven types of representative predictors that utilize each type of similarity individually and all possible combinations of similarities. We assess predictive quality at the database-wide DPI level and we are the first to also include evaluation over individual drugs. Our comprehensive analysis shows that predictors that use more similarity types outperform methods that employ fewer similarities, and that the model combining all three types of similarities secures an area under the receiver operating characteristic curve of 0.93. We offer a comprehensive analysis of sensitivity of predictive performance to intrinsic and extrinsic characteristics of the considered predictors. We find that predictive performance is sensitive to low levels of similarities between sequences of the drug targets and several extrinsic properties of the input drug structures, drug profiles and drug targets. The benchmark database and a webserver for the seven predictors are freely available at http://biomine.cs.vcu.edu/servers/CONNECTOR/.
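
To make the idea of similarity-based prediction concrete, here is a minimal nearest-neighbor sketch. It is an assumption-laden toy and not one of the reviewed predictors: it uses only drug-drug similarity, measured as Tanimoto similarity over hypothetical feature sets, and scores a drug-target pair by the most similar drug already known to interact with that target. The names `tanimoto` and `predict_score` and the data layout are illustrative.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_score(query_drug, target, drug_features, known_interactions):
    """Score a drug-target pair by the nearest known binder of the target.

    drug_features: dict mapping drug id -> set of features (hypothetical,
        e.g. substructure keys).
    known_interactions: iterable of (drug id, target id) pairs.
    """
    best = 0.0
    for drug, bound_target in known_interactions:
        # Exclude the query drug itself so a known pair does not score itself.
        if bound_target == target and drug != query_drug:
            sim = tanimoto(drug_features[query_drug], drug_features[drug])
            best = max(best, sim)
    return best
```

A higher score means the query drug resembles some drug that already binds the target. Combining several similarity types, as the review describes, would require merging multiple such scores, which this sketch does not attempt.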


2018 ◽  
Vol 19 (S14) ◽  
Author(s):  
Diogo Manuel Carvalho Leite ◽  
Xavier Brochet ◽  
Grégory Resch ◽  
Yok-Ai Que ◽  
Aitana Neves ◽  
...  

2021 ◽  
Author(s):  
Nupur S. Munjal ◽  
Dikscha Sapra ◽  
Abhishek Goyal ◽  
K.T. Shreya Parthasarathi ◽  
Akhilesh Pandey ◽  
...  

Abstract: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for the worldwide COVID-19 pandemic which began in 2019. It has a high transmission rate and pathogenicity, leading to health emergencies and economic crisis. Recent studies on the molecular pathogenesis of SARS-CoV-2 infection have shown the indispensable role of ion channels in viral infection inside the host. Moreover, machine learning-based algorithms are providing higher accuracy for host-SARS-CoV-2 protein-protein interactions (PPIs). In this study, predictions of PPIs of SARS-CoV-2 proteins with human ion channels (HICs) were performed using the PPI-MetaGO algorithm. The PPIs were predicted with 82.71% accuracy, 84.09% precision, 84.09% sensitivity, 0.89 AUC-ROC, 65.17% MCC score and 84.09% F1 score. Thereafter, PPI networks of SARS-CoV-2 proteins with HICs were generated. Furthermore, biological pathway analysis of HICs interacting with SARS-CoV-2 proteins showed the involvement of six pathways, namely inflammatory mediator regulation of TRP channels, insulin secretion, renin secretion, gap junction, taste transduction and apelin signaling pathway. The inositol 1,4,5-trisphosphate receptor 1 (ITPR1) and transient receptor potential cation channel subfamily A member 1 (TRPA1) were identified as potential target proteins. Various FDA-approved drugs interacting with ITPR1 and TRPA1 are also available. It is anticipated that targeting ITPR1 and TRPA1 may provide better therapeutic management of infection caused by SARS-CoV-2. The study also reinforces the drug repurposing approach for the development of host-directed antiviral drugs.
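
The performance measures listed above all derive from a binary confusion matrix, and the sketch below shows the standard formulas. The counts in the usage note are hypothetical: they were chosen only because they are arithmetically close to the reported percentages, and the study's actual confusion matrix is not given in the abstract.

```python
from math import sqrt


def _ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = _ratio(tp, tp + fp)
    sensitivity = _ratio(tp, tp + fn)  # also called recall or TPR
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    # Matthews correlation coefficient, in the range [-1, 1].
    mcc = _ratio(
        tp * tn - fp * fn,
        sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
        "mcc": mcc,
    }
```

With the assumed counts `classification_metrics(tp=37, fp=7, fn=7, tn=30)`, precision, sensitivity and F1 all equal 37/44 (about 84.09%), accuracy is 67/81 (about 82.72%) and MCC is about 0.6517. Whenever precision and sensitivity are equal, F1 takes the same value, which is consistent with the three identical 84.09% figures reported.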

