Prediction of Protein Interactions in Rice and Blast Fungus Using Machine Learning

Objective: Colorectal cancer (CRC) is one of the most common cancers worldwide. Patient outcomes following recurrence of CRC are very poor. Therefore, identifying the risk of CRC recurrence at an early stage would improve patient care. Accumulating evidence shows that autophagy plays an active role in tumorigenesis, recurrence, and metastasis. Methods: We used machine learning algorithms and two regression models, univariable Cox proportional hazards regression and least absolute shrinkage and selection operator (LASSO) regression, to identify 26 autophagy-related genes (ARGs) related to CRC recurrence. Results: Functional annotation showed that these ARGs were enriched in necroptosis and apoptosis pathways. Protein–protein interaction analysis identified SQSTM1, CASP8, HSP90AB1, FADD, and MAPK9 as core genes in CRC autophagy. Of the 26 ARGs, BAX and PARP1 were the strongest predictors of CRC recurrence, with a prediction accuracy of 71.1%. Conclusion: These results shed light on the prediction of CRC recurrence by ARGs. Stratifying patients into recurrence risk groups by testing ARGs would be a valuable tool for early detection of CRC recurrence.

Computational Prediction of lncRNA-Protein Interactions using Machine learning

10.1109/embc46164.2021.9630282 ◽

2021 ◽

Author(s):

Muhammad Mushtaq ◽

Hammad Naveed ◽

Zoya Khalid

Keyword(s):

Machine Learning ◽

Protein Interactions ◽

Computational Prediction

Deciphering the interactions of SARS-CoV-2 proteins with human ion channels using machine learning-based method

10.21203/rs.3.rs-622770/v1 ◽

2021 ◽

Author(s):

Nupur S. Munjal ◽

Dikscha Sapra ◽

Abhishek Goyal ◽

K.T. Shreya Parthasarathi ◽

Akhilesh Pandey ◽

...

Keyword(s):

Machine Learning ◽

Ion Channels ◽

Protein Interactions ◽

Transient Receptor Potential ◽

TRP Channels ◽

Transmission Rate ◽

Drug Repurposing ◽

Receptor Potential ◽

PPI Networks ◽

Approved Drugs

Abstract: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for the worldwide COVID-19 pandemic, which began in 2019. Its high transmission rate and pathogenicity have led to health emergencies and economic crisis. Recent studies of the molecular pathogenesis of SARS-CoV-2 infection have revealed an indispensable role for ion channels in viral infection inside the host. Moreover, machine learning-based algorithms are providing higher accuracy in predicting host-SARS-CoV-2 protein-protein interactions (PPIs). In this study, PPIs between SARS-CoV-2 proteins and human ion channels (HICs) were predicted using the PPI-MetaGO algorithm. The PPIs were predicted with 82.71% accuracy, 84.09% precision, 84.09% sensitivity, an AUC-ROC of 0.89, an MCC of 65.17%, and an F1 score of 84.09%. Thereafter, PPI networks of SARS-CoV-2 proteins with HICs were generated. Furthermore, biological pathway analysis of the HICs interacting with SARS-CoV-2 proteins showed the involvement of six pathways: inflammatory mediator regulation of TRP channels, insulin secretion, renin secretion, gap junction, taste transduction, and the apelin signaling pathway. The inositol 1,4,5-trisphosphate receptor 1 (ITPR1) and transient receptor potential cation channel subfamily A member 1 (TRPA1) were identified as potential target proteins. Various FDA-approved drugs that interact with ITPR1 and TRPA1 are also available. It is anticipated that targeting ITPR1 and TRPA1 may provide better therapeutic management of infection caused by SARS-CoV-2. The study also reinforces the drug repurposing approach for the development of host-directed antiviral drugs.
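
The accuracy, precision, sensitivity, F1 and MCC values quoted above all derive from one confusion matrix, and precision, sensitivity and F1 coincide whenever false positives equal false negatives. The Python sketch below shows the standard formulas. It is an illustration only, not the study's evaluation code: the function name `classification_metrics` is hypothetical, and the counts (37 TP, 7 FP, 30 TN, 7 FN) are assumed values chosen because they give figures close to the reported ones; the abstract does not state the actual counts.

```python
import math


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    accuracy = (tp + tn) / total if total else 0.0
    # F1 is the harmonic mean of precision and sensitivity.
    pr_sum = precision + sensitivity
    f1 = 2 * precision * sensitivity / pr_sum if pr_sum else 0.0
    # Matthews correlation coefficient, in the range [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts (an assumption, not taken from the study).
metrics = classification_metrics(tp=37, fp=7, tn=30, fn=7)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because FP equals FN in these assumed counts, precision, sensitivity and F1 come out identical, matching the pattern in the reported figures.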

More challenges for machine-learning protein interactions

Bioinformatics ◽

10.1093/bioinformatics/btu857 ◽

2015 ◽

Vol 31 (10) ◽

pp. 1521-1525 ◽

Cited By ~ 27

Author(s):

Tobias Hamp ◽

Burkhard Rost

Keyword(s):

Machine Learning ◽

Protein Interactions

Prediction of Compound-Protein Interactions with Machine Learning Methods

Chemoinformatics and Advanced Machine Learning Perspectives ◽

10.4018/978-1-61520-911-8.ch016 ◽

2011 ◽

pp. 304-317

Author(s):

Yoshihiro Yamanishi ◽

Hisashi Kashima

Keyword(s):

Machine Learning ◽

Protein Interactions ◽

Chemical Structure ◽

Genomic Sequence ◽

Sequence Data ◽

Binary Classification ◽

Biological Data ◽

Supervised Machine Learning ◽

Learning Methods ◽

Machine Learning Methods

In silico prediction of compound-protein interactions from heterogeneous biological data is critical in the process of drug development. In this chapter the authors review several supervised machine learning methods to predict unknown compound-protein interactions from chemical structure and genomic sequence information simultaneously. The authors review several kernel-based algorithms from two different viewpoints: binary classification and dimension reduction. In the results, they demonstrate the usefulness of the methods on the prediction of drug-target interactions and ligand-protein interactions from chemical structure data and genomic sequence data.
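
One common way to use chemical structure and sequence information simultaneously in a kernel method is a pairwise (product) kernel over compound-protein pairs. The Python sketch below illustrates that idea under stated assumptions: the Tanimoto kernel on fingerprint bit sets, the k-mer spectrum kernel on sequences, and all function names and toy inputs are illustrative choices, not necessarily the algorithms reviewed in the chapter.

```python
from collections import Counter


def tanimoto_kernel(bits_a: set, bits_b: set) -> float:
    """Tanimoto similarity between two compound fingerprints given as sets of on-bits."""
    union = bits_a | bits_b
    if not union:
        return 0.0
    return len(bits_a & bits_b) / len(union)


def spectrum_kernel(seq_a: str, seq_b: str, k: int = 2) -> float:
    """k-mer spectrum kernel: dot product of the k-mer count vectors of two sequences."""
    counts_a = Counter(seq_a[i:i + k] for i in range(len(seq_a) - k + 1))
    counts_b = Counter(seq_b[i:i + k] for i in range(len(seq_b) - k + 1))
    return float(sum(counts_a[kmer] * counts_b[kmer] for kmer in counts_a))


def pair_kernel(pair_a: tuple, pair_b: tuple, k: int = 2) -> float:
    """Product kernel on (fingerprint, sequence) pairs.

    Two compound-protein pairs are similar only if both the compounds
    and the proteins are similar.
    """
    (bits_a, seq_a), (bits_b, seq_b) = pair_a, pair_b
    return tanimoto_kernel(bits_a, bits_b) * spectrum_kernel(seq_a, seq_b, k)


# Toy inputs (hypothetical fingerprints and sequences).
pair_1 = ({1, 2, 3}, "MKVLA")
pair_2 = ({2, 3, 4}, "MKVIA")
print(pair_kernel(pair_1, pair_2))
```

A kernel matrix built this way over labeled pairs could then be passed to any kernel classifier that accepts precomputed kernels, which is how the binary-classification viewpoint would typically be realized.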

Targeting Virus-host Protein Interactions: Feature Extraction and Machine Learning Approaches

Current Drug Metabolism ◽

10.2174/1389200219666180829121038 ◽

2019 ◽

Vol 20 (3) ◽

pp. 177-184 ◽

Cited By ~ 16

Author(s):

Nantao Zheng ◽

Kairou Wang ◽

Weihua Zhan ◽

Lei Deng

Keyword(s):

Machine Learning ◽

Computational Methods ◽

Protein Interactions ◽

Prediction Models ◽

Learning Algorithms ◽

Biological Data ◽

Machine Learning Algorithms ◽

Host Protein ◽

Protein Protein Interactions ◽

Protein Motifs

Background: Targeting critical virus-host protein-protein interactions (PPIs) has enormous application prospects for therapeutics. Using experimental methods to evaluate all possible virus-host PPIs is labor-intensive and time-consuming. Recent growth in the computational identification of virus-host PPIs provides new opportunities for gaining biological insights, including applications in disease control. We provide an overview of recent computational approaches for studying virus-host PPIs. Methods: In this review, a variety of computational methods for virus-host PPI prediction are surveyed. These methods are categorized by the features they use and by the machine learning algorithms they apply, including both classical and novel methods. Results: We describe the pivotal and representative features extracted from relevant sources of biological data, mainly sequence signatures, known domain interactions, protein motifs, and protein structure information. We focus on state-of-the-art machine learning algorithms used to build binary prediction models for classifying virus-host protein pairs, and discuss their abilities, weaknesses, and future directions. Conclusion: The findings of this review confirm the importance of computational methods for finding potential protein-protein interactions between virus and host. Although there has been significant progress in the prediction of virus-host PPIs in recent years, much room for improvement remains.

Evaluation of Machine Learning Algorithms on Protein-Protein Interactions

Advances in Intelligent Systems and Computing - Man-Machine Interactions 3 ◽

10.1007/978-3-319-02309-0_22 ◽

2014 ◽

pp. 211-218 ◽

Cited By ~ 1

Author(s):

Indrajit Saha ◽

Tomas Klingström ◽

Simon Forsberg ◽

Johan Wikander ◽

Julian Zubek ◽

...

Keyword(s):

Machine Learning ◽

Protein Interactions ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Protein Protein Interactions

Recent Advances in Machine Learning Based Prediction of RNA-protein Interactions

Protein and Peptide Letters ◽

10.2174/0929866526666190619103853 ◽

2019 ◽

Vol 26 (8) ◽

pp. 601-619 ◽

Cited By ~ 1

Author(s):

Amit Sagar ◽

Bin Xue

Keyword(s):

Machine Learning ◽

Protein Interaction ◽

Protein Interactions ◽

Experimental Methods ◽

Machine Learning Techniques ◽

Computational Techniques ◽

Biological Processes ◽

Complete Spectrum ◽

Future Developments ◽

Learning Techniques

The interactions between RNAs and proteins play critical roles in many biological processes. Characterizing these interactions is therefore critical for mechanistic, biomedical, and clinical studies. Many experimental methods can be used to determine various aspects of RNA-protein interactions. However, because RNA-protein interactions are tissue-specific and condition-specific, and because these interactions are weak and frequently compete with each other, experimental techniques alone cannot uncover the complete spectrum of RNA-protein interactions. To mitigate these issues, continuous efforts have been devoted to developing high-quality computational techniques to study the interactions between RNAs and proteins. Much important progress has been achieved through the application of novel techniques and strategies, such as machine learning. In particular, with the development and application of CLIP techniques, more and more experimental data on RNA-protein interactions under specific biological conditions are becoming available. Together, these CLIP data provide a rich source for developing advanced machine learning predictors. This review summarizes recent progress on computational predictors of RNA-protein interactions in three aspects: datasets, prediction strategies, and input features. Possible future developments are discussed at the end of the review.

Issues in performance evaluation for host–pathogen protein interaction prediction

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720016500116 ◽

2016 ◽

Vol 14 (03) ◽

pp. 1650011 ◽

Cited By ~ 9

Author(s):

Wajid Arshad Abbasi ◽

Fayyaz Ul Amir Afsar Minhas

Keyword(s):

Machine Learning ◽

Protein Interactions ◽

Cross Validation ◽

Protein Protein Interactions ◽

Evaluation Scheme ◽

Host Pathogen ◽

Pathogen Protein ◽

Protein Interaction Prediction ◽

Underlying Mechanisms ◽

Fold Cross Validation

The study of interactions between host and pathogen proteins is important for understanding the underlying mechanisms of infectious diseases and for developing novel therapeutic solutions. Wet-lab techniques for detecting protein–protein interactions (PPIs) can benefit from computational predictions. Machine learning is one of the computational approaches that can assist biologists by predicting promising PPIs. A number of machine learning-based methods for predicting host–pathogen interactions (HPIs) have been proposed in the literature. The techniques used for assessing the accuracy of such predictors are of critical importance in this domain. In this paper, we question the effectiveness of K-fold cross-validation for estimating the generalization ability of HPI predictors on proteins with no known interactions. K-fold cross-validation does not model this scenario, and we demonstrate a sizable difference between the performance it reports and the performance reported by an alternative evaluation scheme called leave one pathogen protein out (LOPO) cross-validation. LOPO is more effective in modeling the real-world use of HPI predictors, specifically for cases in which no information about the interacting partners of a pathogen protein is available during training. We also point out that currently used metrics, such as the areas under the precision-recall or receiver operating characteristic curves, are not intuitive to biologists, and we propose simpler and more directly interpretable metrics for this purpose.
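
The difference between K-fold and LOPO cross-validation comes down to how the host–pathogen pairs are split: LOPO holds out every pair involving one pathogen protein at a time, so that protein is never seen during training. The Python sketch below shows that splitting logic only. It is a minimal illustration, not the paper's code; the function name `lopo_splits` and the toy protein identifiers are hypothetical.

```python
from collections import defaultdict
from typing import Iterator, List, Sequence, Tuple


def lopo_splits(pathogen_ids: Sequence[str]) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices), one split per pathogen protein.

    pathogen_ids[i] is the pathogen protein of the i-th host-pathogen pair.
    Each split holds out every pair that involves one pathogen protein.
    """
    by_pathogen = defaultdict(list)
    for index, pathogen in enumerate(pathogen_ids):
        by_pathogen[pathogen].append(index)
    for held_out, test_indices in by_pathogen.items():
        train_indices = [
            index for index, pathogen in enumerate(pathogen_ids)
            if pathogen != held_out
        ]
        yield train_indices, test_indices


# Toy (pathogen protein, host protein) pairs; identifiers are made up.
pairs = [("P1", "H1"), ("P1", "H2"), ("P2", "H1"), ("P3", "H3"), ("P3", "H4")]
pathogen_ids = [pathogen for pathogen, _ in pairs]

for train_indices, test_indices in lopo_splits(pathogen_ids):
    held_out = pathogen_ids[test_indices[0]]
    print(f"held out {held_out}: train={train_indices} test={test_indices}")
```

A plain K-fold split would shuffle pairs without regard to the pathogen protein, so pairs of the same pathogen protein could land in both training and test folds, which is the scenario mismatch the abstract describes.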

Systematic auditing is essential to debiasing machine learning in biology

10.1101/2020.05.08.085183 ◽

2020 ◽

Cited By ~ 1

Author(s):

Fatma-Elzahraa Eid ◽

Haitham Elmarakeby ◽

Yujia Alina Chan ◽

Nadine Fornelos Martins ◽

Mahmoud ElHefnawi ◽

...

Keyword(s):

Machine Learning ◽

Protein Interactions ◽

Drug Target ◽

Peptide Binding ◽

Life Sciences ◽

Prediction Performance ◽

Biological Data ◽

Training Data ◽

Protein Protein Interactions ◽

Interest Prediction

Abstract: Representational biases that are common in biological data can inflate prediction performance and confound our understanding of how and what machine learning (ML) models learn from large complicated datasets. However, auditing for these biases is not a common practice in ML in the life sciences. Here, we devise a systematic auditing framework and harness it to audit three different ML applications of significant therapeutic interest: prediction frameworks of protein-protein interactions, drug-target bioactivity, and MHC-peptide binding. Through this, we identify unrecognized biases that hinder the ML process and result in low model generalizability. Ultimately, we show that, when there is insufficient signal in the training data, ML models are likely to learn primarily from representational biases.
